Tag Archives: CDN

Navigating Cloud Storage: What is Latency and Why Does It Matter?

Post Syndicated from Amrit Singh original https://www.backblaze.com/blog/navigating-cloud-storage-what-is-latency-and-why-does-it-matter/

A decorative image showing a computer and a server, arrows moving between them, and a stopwatch indicating time.

In today’s bandwidth-intensive world, latency is an important factor that can impact performance and the end-user experience for modern cloud-based applications. For many CTOs, architects, and decision-makers at growing small and medium-sized businesses (SMBs), understanding and reducing latency is not just a technical need but also a strategic play.

Latency, or the time it takes for data to travel from one point to another, affects everything from how snappy or responsive your application may feel to content delivery speeds to media streaming. As infrastructure increasingly relies on cloud object storage to manage terabytes or even petabytes of data, optimizing latency can be the difference between success and failure. 

Let’s get into the nuances of latency and its impact on cloud storage performance.

Upload vs. Download Latency: What’s the Difference?

In the world of cloud storage, you’ll typically encounter two forms of latency: upload latency and download latency. Each can impact the responsiveness and efficiency of your cloud-based application.

Upload Latency

Upload latency refers to the delay when data is sent from a client or user’s device to the cloud. Live streaming applications, backup solutions, or any application that relies heavily on real-time data uploading will experience hiccups if upload latency is high, leading to buffering delays or momentary stream interruptions.

Download Latency

Download latency, on the other hand, is the delay when retrieving data from the cloud to the client or end user’s device. Download latency is particularly relevant for content delivery applications, such as on-demand video streaming platforms, e-commerce, or other web-based applications. Reducing download latency creates a snappy web experience and ensures content is swiftly delivered to the end user.

Ideally, you’ll want to optimize for latency in both directions, but, depending on your use case and the type of application you are building, it’s important to understand the nuances of upload and download latency and their impact on your end users.

Decoding Cloud Latency: Key Factors and Their Impact

When it comes to cloud storage, latency is influenced by a number of factors, each having an impact on the overall performance of your application. Let’s explore a few of these key factors.

Network Congestion

Like traffic on a freeway, packets of data can experience congestion on the internet. This slows data transmission, especially during peak hours, resulting in a laggy experience. Internet connection quality and network capacity also contribute to congestion.

Geographical Distance

Often overlooked, the physical distance from the client or end user’s device to the cloud origin store can have an impact on latency. The greater the distance from the client to the server, the farther the data has to travel and the longer transmission takes, leading to higher latency.

Infrastructure Components

The quality of infrastructure, including routers, switches, and cables, affects network performance and latency. Modern hardware, such as fiber-optic cabling, can reduce latency; outdated equipment that can’t keep up with current demands adds to it. You often don’t have full control over all of these infrastructure elements, but awareness of potential bottlenecks can guide upgrades wherever possible.

Technical Processes

  • TCP/IP Handshake: Connecting a client and a server involves a handshake process, which introduces a delay, especially for a new connection.
  • DNS Resolution: The time it takes to resolve a domain name to its IP address adds to latency; faster DNS resolution shaves a small amount off the total.
  • Data Routing: Data does not necessarily travel in a straight line from its source to its destination. The effectiveness of routing algorithms and the number of hops the data must make both influence latency.
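To make these delays concrete, here’s a minimal Python sketch that times each step separately: DNS resolution, the TCP handshake, the TLS handshake, and the wait for the first byte of the response. The target host is just an example, and real numbers will vary run to run and network to network.

```python
import socket
import ssl
import time

def time_request_phases(host, path="/"):
    """Time DNS resolution, TCP handshake, TLS handshake, and the wait
    for the first response byte of a simple HTTPS GET."""
    t0 = time.perf_counter()
    ip = socket.getaddrinfo(host, 443)[0][4][0]             # DNS resolution
    t_dns = time.perf_counter()

    sock = socket.create_connection((ip, 443), timeout=10)  # TCP handshake
    t_tcp = time.perf_counter()

    tls = ssl.create_default_context().wrap_socket(
        sock, server_hostname=host)                         # TLS handshake
    t_tls = time.perf_counter()

    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    tls.sendall(request.encode())
    tls.recv(1)                                             # first response byte
    t_first = time.perf_counter()
    tls.close()

    for label, start, end in [("DNS", t0, t_dns), ("TCP", t_dns, t_tcp),
                              ("TLS", t_tcp, t_tls), ("Wait", t_tls, t_first)]:
        print(f"{label}: {(end - start) * 1000:.1f}ms")
    print(f"Total to first byte: {(t_first - t0) * 1000:.1f}ms")

time_request_phases("www.backblaze.com")
```

Run it from different networks, or against hosts in different regions, and the distance and congestion effects described above show up directly in the numbers.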

For businesses that rely on frequently accessing data stored in cloud storage, reducing latency and improving application performance may involve selecting providers with strategically positioned data centers, fine-tuning network configurations, and understanding how internet infrastructure affects the latency of their applications.

Minimizing Latency With Content Delivery Networks (CDNs)

Further reducing latency in your application may be achieved by layering a content delivery network (CDN) in front of your origin storage. CDNs help reduce the time it takes for content to reach the end user by caching data in distributed servers that store content across multiple geographic locations. When your end-user requests or downloads content, the CDN delivers it from the nearest server, minimizing the distance the data has to travel, which significantly reduces latency.

Backblaze B2 Cloud Storage integrates with multiple CDN solutions, including Fastly, bunny.net, and Cloudflare, providing a performance advantage. And, Backblaze offers the additional benefit of free egress between where the data is stored and the CDN’s edge servers. This not only reduces latency, but also optimizes bandwidth usage, making it cost-effective for businesses building bandwidth-intensive applications such as on-demand media streaming.

To get slightly into the technical weeds, CDNs essentially cache content at the edge of the network, meaning that once content is stored on a CDN server, subsequent requests do not need to go back to the origin server to request data. 

This reduces the load on the origin server and reduces the time needed to deliver the content to the user. For companies using cloud storage, integrating CDNs into their infrastructure is an effective configuration to improve the global availability of content, making it an important aspect of cloud storage and application performance optimization.
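One common way to influence that caching from the origin side is to upload objects with a Cache-Control header, which tells the CDN how long an edge server may serve its cached copy before returning to the origin. Here’s a hedged sketch using boto3 against a generic S3-compatible origin; the endpoint, bucket, and file names are placeholders, and your CDN’s own caching rules may override or supplement these headers.

```python
import boto3

# Upload an object with a Cache-Control header so a CDN sitting in front
# of the origin can cache it at the edge before revalidating.
s3 = boto3.client("s3", endpoint_url="https://s3.example.com")  # placeholder

s3.upload_file(
    "video.mp4",
    "my-origin-bucket",
    "media/video.mp4",
    ExtraArgs={
        "CacheControl": "public, max-age=86400",  # cacheable for 24 hours
        "ContentType": "video/mp4",
    },
)
```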

Case Study: Musify Improves Latency and Reduces Cloud Bill by 70%

To illustrate the impact of reduced latency on performance, consider the example of music streaming platform Musify. By moving from Amazon S3 to Backblaze B2 and leveraging the partnership with Cloudflare, Musify significantly improved its service offering. Musify egresses about 1PB of data per month, which, under traditional cloud storage pricing models, can lead to significant costs. Because Backblaze and Cloudflare are both members of the Bandwidth Alliance, Musify now has no data transfer costs, contributing to an estimated 70% reduction in cloud spend. And, thanks to the high cache hit ratio, 90% of the transfer takes place in the CDN layer, which helps maintain high performance, regardless of the location of the file or the user.

Latency Wrap Up

As we wrap up our look at the role latency plays in cloud-based applications, it’s clear that understanding and strategically reducing latency is a necessary approach for CTOs, architects, and decision-makers building many of the modern applications we all use today. Several factors impact upload and download latency, and it’s important to understand the nuances to effectively improve performance.

Additionally, Backblaze B2’s integrations with CDNs like Fastly, bunny.net, and Cloudflare offer a cost-effective way to improve performance and reduce latency. The strategic decisions Musify made demonstrate how reducing latency with a CDN can significantly improve content delivery while saving on egress costs, and reducing overall business OpEx.

For additional information and guidance on reducing latency, improving time to first byte (TTFB), and boosting overall performance, the insights shared in “Cloud Performance and When It Matters” offer a deeper, technical look.

If you’re keen to explore further into how an object storage platform may support your needs and help scale your bandwidth-intensive applications, read more about Backblaze B2 Cloud Storage.

The post Navigating Cloud Storage: What is Latency and Why Does It Matter? appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Cloud 101: Data Egress Fees Explained

Post Syndicated from Molly Clancy original https://www.backblaze.com/blog/cloud-101-data-egress-fees-explained/

A decorative article showing a server, a cloud, and arrows pointing up and down with a dollar sign.

You can imagine data egress fees like tolls on a highway—your data is cruising along trying to get to its destination, but it has to pay a fee for the privilege of continuing its journey. If you have a lot of data to move, or a lot of toll booths (different cloud services) to move it through, those fees can add up quickly. 

Data egress fees are charges you incur for moving data out of a cloud service. They can be a big part of your cloud bill depending on how you use the cloud. And, they’re frequently a reason behind surprise AWS bills. So, let’s take a closer look at egress, egress fees, and ways you can reduce or eliminate them, so that your data can travel the cloud superhighways at will. 

What Is Data Egress?

In computing generally, data egress refers to the transfer or movement of data out of a given location, typically from within a network or system to an external destination. When it comes to cloud computing, egress generally means whenever data leaves the boundaries of a cloud provider’s network. 

In the simplest terms, data egress is the outbound flow of data.

A photo of a stair case with a sign that says "out" and an arrow pointing up.
The fees, like these stairs, climb higher. Source.

Egress vs. Ingress?

While egress pertains to data exiting a system, ingress refers to data entering a system. When you download something, you’re egressing data from a service. When you upload something, you’re ingressing data to that service. 

Unsurprisingly, most cloud storage providers do not charge you to ingress data—they want you to store your data on their platform, so why would they? 

Egress vs. Download

You might hear egress referred to as download, and that’s not wrong, but there are some nuances. Egress applies not only to downloads, but also when you migrate data between cloud services, for example. So, egress includes downloads, but it’s not limited to them. 

In the context of cloud service providers, the distinction between egress and download may not always be explicitly stated, and the terminology used can vary between providers. It’s essential to refer to the specific terms and pricing details provided by the service or platform you are using to understand how they classify and charge for data transfers.

How Do Egress Fees Work?

Data egress fees are charges incurred when data is transferred out of a cloud provider’s environment. These fees are often associated with cloud computing services, where users pay not only for the resources they consume within the cloud (such as storage and compute) but also for the data that is transferred from the cloud to external destinations.

There are a number of scenarios where a cloud provider typically charges egress: 

  • When you’re migrating data from one cloud to another.
  • When you’re downloading data from a cloud to a local repository.
  • When you move data between regions or zones with certain cloud providers. 
  • When an application, end user, or content delivery network (CDN) requests data from your cloud storage bucket. 

The fees can vary depending on the amount of data transferred and the destination of the data. For example, transferring data between regions within the same cloud provider’s network might incur lower fees than transferring data to the internet or to a different cloud provider.

Data egress fees are an important consideration for organizations using cloud services, and they can impact the overall cost of hosting and managing data in the cloud. It’s important to be aware of the pricing details related to data egress in the cloud provider’s pricing documentation, as these fees can contribute significantly to the total cost of using cloud services.

Why Do Cloud Providers Charge Egress Fees?

Both ingressing and egressing data costs cloud providers money. They have to build the physical infrastructure to allow users to do that, including switches, routers, fiber cables, etc. They also have to have enough of that infrastructure on hand to meet customer demand, not to mention staff to deploy and maintain it. 

However, it’s telling that most cloud providers don’t charge ingress fees, only egress fees. It would be hard to entice people to use your service if you charged them extra for uploading their data. But, once cloud providers have your data, they want you to keep it there. Charging you to remove it is one way cloud providers like AWS, Google Cloud, and Microsoft Azure do that. 

What Are AWS’s Egress Fees?

AWS S3 gives customers 100GB of data transfer out to the internet free each month, with some caveats—that 100GB excludes data stored in China and GovCloud. After that, the published rates for U.S. regions for data transferred over the public internet are as follows as of the date of publication:

  • The first 10TB per month is $0.09 per GB.
  • The next 40TB per month is $0.085 per GB.
  • The next 100TB per month is $0.07 per GB.
  • Anything greater than 150TB per month is $0.05 per GB. 
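To see how those tiers compound, here’s a quick cost sketch in Python using the list rates above. It’s a simplified model (real AWS bills depend on region, service mix, and how the free allowance is applied), but it’s close enough to show the scale:

```python
# Simplified AWS S3 internet egress estimate from the published U.S.
# rates above: first 100GB free each month, then tiered per-GB pricing.
TIERS = [
    (100, 0.0),             # monthly free allowance
    (10 * 1024, 0.09),      # first 10TB
    (40 * 1024, 0.085),     # next 40TB
    (100 * 1024, 0.07),     # next 100TB
    (float("inf"), 0.05),   # beyond 150TB
]

def egress_cost(gb):
    cost, remaining = 0.0, gb
    for tier_gb, rate in TIERS:
        billed = min(remaining, tier_gb)
        cost += billed * rate
        remaining -= billed
        if remaining <= 0:
            break
    return cost

# A 1PB/month workload (like Musify's, from the first post) lands deep
# in the $0.05/GB tier: roughly $56,000 per month at these list rates.
print(f"${egress_cost(1024 * 1024):,.2f}")
```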

But AWS also charges customers egress between certain services and regions, and it can get complicated quickly as the following diagram shows…

Source.

How Can I Reduce Egress Fees?

If you’re using cloud services, minimizing your egress fees is probably a high priority. Companies like the Duckbill Group (the creators of the diagram above) exist to help businesses manage their AWS bills. In fact, there’s a whole industry of consultants that focuses solely on reducing your AWS bills. 

Aside from hiring a consultant to help you spend less, there are a few simple ways to lower your egress fees:

  1. Use a content delivery network (CDN): If you’re hosting an application, using a CDN can lower your egress fees since a CDN will cache data on edge servers. That way, when a user sends a request for your data, it can pull it from the CDN server rather than your cloud storage provider where you would be charged egress. 
  2. Optimize data transfer protocols: Choose efficient data transfer protocols that minimize the amount of data transmitted. For example, consider using compression or delta encoding techniques to reduce the size of transferred files. Compressing data before transfer can reduce the volume of data sent over the network, leading to lower egress costs. However, the effectiveness of compression depends on the nature of the data; a minimal example follows this list.
  3. Utilize integrated cloud providers: Some cloud providers offer free data transfer with a range of other cloud partners. (Hint: that’s what we do here at Backblaze!)
  4. Be aware of tiering: It may sound enticing to opt for a cold(er) storage tier to save on storage, but some of those tiers come with much higher egress fees. 
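Here’s the minimal illustration of the compression point in item two. The file name is a placeholder, and the payoff depends entirely on your data: text and logs often shrink dramatically, while already-compressed media barely budges.

```python
import gzip
import shutil

# Compress before transfer so fewer bytes cross the network (and count
# against egress when the file is later downloaded).
with open("logs.csv", "rb") as src, gzip.open("logs.csv.gz", "wb") as dst:
    shutil.copyfileobj(src, dst)
```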

How Does Backblaze Reduce Egress Fees?

There’s one more way you can drastically reduce egress, and we’ll just come right out and say it: Backblaze gives you free egress up to 3x the average monthly storage and unlimited free egress through a number of CDN and compute partners, including Fastly, Cloudflare, bunny.net, and Vultr.

Why do we offer free egress? Supporting an open cloud environment is central to our mission, so we expanded free egress to all customers so they can move data when and where they prefer. Cloud providers like AWS charge high egress fees that make it expensive for customers to use multi-cloud infrastructures, thereby locking customers into their services. These walled gardens hamper innovation and long-term growth.

Free Egress = A Better, Multi-Cloud World

The bottom line: the high egress fees charged by hyperscalers like AWS, Google, and Microsoft are a direct impediment to a multi-cloud future driven by customer choice and industry need. And, a multi-cloud future is something we believe in. So go forth and build the multi-cloud future of your dreams, and leave worries about high egress fees in the past. 

The post Cloud 101: Data Egress Fees Explained appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Cloud Storage Performance: The Metrics that Matter

Post Syndicated from Kari Rivas original https://www.backblaze.com/blog/cloud-storage-performance-the-metrics-that-matter/

A decorative image showing a cloud in the foreground and various mocked up graphs in the background.

Availability, time to first byte, throughput, durability—there are plenty of ways to measure “performance” when it comes to cloud storage. But, which measure is best, and how should performance factor in when you’re choosing a cloud storage provider? Other than security and cost, performance is arguably the most important decision criterion, but it’s also the hardest dimension to clarify. It can be highly variable and depends on your own infrastructure, your workload, and all the network connections between your infrastructure and the cloud provider.

Today, I’m walking through how to think strategically about cloud storage performance, including which metrics matter and which may not be as important for you.

First, What’s Your Use Case?

The first thing to keep in mind is how you’re going to be using cloud storage. After all, performance requirements will vary from one use case to another. For instance, you may need greater performance in terms of latency if you’re using cloud storage to serve up software as a service (SaaS) content; however, if you’re using cloud storage to back up and archive data, throughput is probably more important for your purposes.

For something like application storage, even when you’re using hot, fast, public cloud storage, you should have other tools in your toolbox, like the ability to cache content on edge servers closer to end users with a content delivery network (CDN).

Ultimately, you need to decide which cloud storage metrics are the most important to your organization. Performance is important, certainly, but security or cost may be weighted more heavily in your decision matrix.

A decorative image showing several icons representing different types of files on a grid over a cloud.

What Is Performant Cloud Storage?

Performance can be described using a number of different criteria, including:

  • Latency
  • Throughput
  • Availability
  • Durability

I’ll define each of these and talk a bit about what each means when you’re evaluating a given cloud storage provider and how they may affect upload and download speeds.

Latency

  • Latency is defined as the time between a client request and a server response. It quantifies the time it takes data to transfer across a network.  
  • Latency is primarily influenced by physical distance—the farther away the client is from the server, the longer it takes to complete the request. 
  • If you’re serving content to many geographically dispersed clients, you can use a CDN to reduce the latency they experience. 

Latency can be influenced by network congestion, security protocols on a network, or network infrastructure, but the primary cause is generally distance, as we noted above. 

Downstream latency is typically measured using time to first byte (TTFB). In the context of surfing the web, TTFB is the time between a page request and when the browser receives the first byte of information from the server. In other words, TTFB is measured by how long it takes between the start of the request and the start of the response, including DNS lookup and establishing the connection using a TCP handshake and TLS handshake if you’ve made the request over HTTPS.

Let’s say you’re uploading data from California to a cloud storage data center in Sacramento. In that case, you’ll experience lower latency than if your business data is stored in, say, Ohio and has to make the cross-country trip. However, making the “right” decision about where to store your data isn’t quite as simple as that, and the complexity goes back to your use case. If you’re using cloud storage for off-site backup, you may want your data to be stored farther away from your organization to protect against natural disasters. In this case, performance is likely secondary to location—you only need fast enough performance to meet your backup schedule. 

Using a CDN to Improve Latency

If you’re using cloud storage to store active data, you can speed up performance by using a CDN. A CDN helps speed content delivery by caching content at the edge, meaning faster load times and reduced latency. 

Edge networks create “satellite servers” that are separate from your central data server, and CDNs leverage these to chart the fastest data delivery path to end users. 

Throughput

  • Throughput is a measure of the amount of data passing through a system at a given time.
  • If you have spare bandwidth, you can use multi-threading to improve throughput. 
  • Cloud storage providers’ architecture influences throughput, as do their policies around slowdowns (i.e. throttling).

Throughput is often confused with bandwidth. The two concepts are closely related, but different. 

To explain them, it’s helpful to use a metaphor: Imagine a swimming pool. The amount of water in it is your file size. When you want to drain the pool, you need a pipe. Bandwidth is the size of the pipe, and throughput is the rate at which water moves through the pipe successfully. So, bandwidth affects your ultimate throughput. Throughput is also influenced by processing power, packet loss, and network topology, but bandwidth is the main factor. 
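To put rough numbers on the metaphor, here’s a single-stream throughput check. The URL is a placeholder and results will vary run to run; what you’re measuring is the rate water actually moves through the pipe, not the size of the pipe:

```python
import time
import requests

URL = "https://example.com/test-file.bin"  # placeholder test object

start = time.perf_counter()
data = requests.get(URL, timeout=120).content
elapsed = time.perf_counter() - start

# Throughput = bytes successfully delivered per unit time. Bandwidth sets
# the ceiling; latency, congestion, and packet loss keep you below it.
print(f"{len(data) / 1e6:.0f}MB in {elapsed:.1f}s = "
      f"{len(data) / elapsed / 1e6:.1f} MB/s")
```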

Using Multi-Threading to Improve Throughput

Assuming you have some bandwidth to spare, one of the best ways to improve throughput is to enable multi-threading. Threads are units of execution within processes. When you transmit files using a program across a network, they are being communicated by threads. Using more than one thread (multi-threading) to transmit files is, not surprisingly, better and faster than using just one (although a greater number of threads will require more processing power and memory). To return to our water pipe analogy, multi-threading is like having multiple water pumps (threads) running to that same pipe. Maybe with one pump, you can only fill 10% of your pipe. But you can keep adding pumps until you reach pipe capacity.

When you’re using cloud storage with an integration like backup software or a network attached storage (NAS) device, the multi-threading setting is typically found in the integration’s settings. Many backup tools, like Veeam, are already set to multi-thread by default. Veeam automatically makes adjustments based on details like the number of individual backup jobs, or you can configure the number of threads manually. Other integrations, like Synology’s Cloud Sync, also give you granular control over threading so you can dial in your performance.  

A diagram showing single vs. multi-threaded processes.
Still confused about threads? Learn more in our deep dive, including what’s going on in this diagram.

That said, the gains from increasing the number of threads are limited by the available bandwidth, processing power, and memory. Finding the right setting can involve some trial and error, but the improvements can be substantial (as we discovered when we compared download speeds on different Python versions using single vs. multi-threading).
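If you’re writing your own tooling rather than using an integration, a multi-threaded transfer might look something like the sketch below, which downloads a file in parallel byte ranges. The URL, part size, and thread count are illustrative; most object storage services (including S3-compatible APIs) honor the HTTP Range header.

```python
import concurrent.futures
import requests

URL = "https://example.com/big-file.bin"  # placeholder object URL
PART_SIZE = 32 * 1024 * 1024              # 32MB per request
THREADS = 8                               # tune for bandwidth and memory

# Find the object's size, then split it into byte ranges.
total = int(requests.head(URL, allow_redirects=True).headers["Content-Length"])
ranges = [(start, min(start + PART_SIZE, total) - 1)
          for start in range(0, total, PART_SIZE)]

def fetch(byte_range):
    start, end = byte_range
    resp = requests.get(URL, headers={"Range": f"bytes={start}-{end}"})
    resp.raise_for_status()
    return start, resp.content

# Each thread pulls its own range (more "pumps" feeding the same pipe),
# then the parts are reassembled in order.
with concurrent.futures.ThreadPoolExecutor(max_workers=THREADS) as pool:
    parts = dict(pool.map(fetch, ranges))

with open("big-file.bin", "wb") as out:
    for start, _ in ranges:
        out.write(parts[start])
```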

What About Throttling?

One question you’ll absolutely want to ask when you’re choosing a cloud storage provider is whether they throttle traffic. That means they deliberately slow down your connection for various reasons. Shameless plug here: Backblaze does not throttle, so customers are able to take advantage of all their bandwidth while uploading to B2 Cloud Storage. Amazon and many other public cloud services do throttle.

Upload Speed and Download Speed

Your ultimate upload and download speeds will be affected by throughput and latency. Again, it’s important to consider your use case when determining which performance measure is most important for you. Latency is important to application storage use cases where things like how fast a website loads can make or break a potential SaaS customer. With latency being primarily influenced by distance, it can be further optimized with the help of a CDN. Throughput is often the measurement that’s more important to backup and archive customers because it is indicative of the upload and download speeds an end user will experience, and it can be influenced by cloud storage provider practices, like throttling.   

Availability

  • Availability is the percentage of time a cloud service or a resource is functioning correctly.
  • Make sure the availability listed in the cloud provider’s service level agreement (SLA) matches your needs. 
  • Keep in mind the difference between hot and cold storage—cold storage services like Amazon Glacier offer slower retrieval and response times.

Also called uptime, this metric measures the percentage of time that a cloud service or resource is available and functioning correctly. It’s usually expressed as a percentage, with 99.9% (three nines) or 99.99% (four nines) availability being common targets for critical services. Availability is often backed by SLAs that define the uptime customers can expect and what happens if availability falls below that metric. 

You’ll also want to consider availability if you’re considering whether you want to store in cold storage versus hot storage. Cold storage is lower performing by design. It prioritizes durability and cost-effectiveness over availability. Services like Amazon Glacier and Google Coldline take this approach, offering slower retrieval and response times than their hot storage counterparts. While cost savings is typically a big factor when it comes to considering cold storage, keep in mind that if you do need to retrieve your data, it will take much longer (potentially days instead of seconds), and speeding that up at all is still going to cost you. You may end up paying more to get your data back faster, and you should also be aware of the exorbitant egress fees and minimum storage duration requirements for cold storage—unexpected costs that can easily add up. 

                   Cold                           Hot
Access Speed       Slow                           Fast
Access Frequency   Seldom or Never                Frequent
Data Volume        Low                            High
Storage Media      Slower drives, LTO, offline    Faster drives, durable drives, SSDs
Cost               Lower                          Higher

Durability

  • Durability is the ability of a storage system to consistently preserve data.
  • Durability is measured in “nines,” or the probability that your data is retrievable after one year of storage. 
  • We designed the Backblaze B2 Storage Cloud for 11 nines of durability using erasure coding.

Data durability refers to the ability of a data storage system to reliably and consistently preserve data over time, even in the face of hardware failures, errors, or unforeseen issues. It is a measure of data’s long-term resilience and permanence. Highly durable data storage systems ensure that data remains intact and accessible, meeting reliability and availability expectations, making it a fundamental consideration for critical applications and data management.

We usually measure durability (or, more precisely, annual durability) in “nines,” referring to the number of nines in the probability (expressed as a percentage) that your data is retrievable after one year of storage. We know from our work on Drive Stats that an annual failure rate of 1% is typical for a hard drive. So, if you were to store your data on a single drive, its durability, the probability that it would not fail, would be 99%, or two nines.

The very simplest way of improving durability is to simply replicate data across multiple drives. If a file is lost, you still have the remaining copies. It’s also simple to calculate the durability with this approach. If you write each file to two drives, you lose data only if both drives fail. We calculate the probability of both drives failing by multiplying the probabilities of either drive failing, 0.01 x 0.01 = 0.0001, giving a durability of 99.99%, or four nines. While simple, this approach is costly—it incurs a 100% overhead in the amount of storage required to deliver four nines of durability.
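That replication arithmetic is easy to check in a few lines: with a 1% annual failure rate, each additional copy multiplies the probability of loss by another factor of 0.01, adding two nines at the cost of another full copy of the data.

```python
# Durability of n-way replication at a 1% annual drive failure rate:
# data is lost only if every copy fails within the year.
afr = 0.01
for copies in (1, 2, 3):
    print(f"{copies} cop{'y' if copies == 1 else 'ies'}: "
          f"{1 - afr ** copies:.6%} durable")
```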

Erasure coding is a more sophisticated technique, improving durability with much less overhead than simple replication. An erasure code takes a “message,” such as a data file, and makes a longer message in a way that the original can be reconstructed from the longer message even if parts of the longer message have been lost. 

A decorative image showing the matrices that get multiplied to allow Reed-Solomon code to re-create files.
A representation of Reed-Solomon erasure coding, with some very cool Storage Pods in the background.

The durability calculation for this approach is much more complex than for replication, as it involves the time required to replace and rebuild failed drives as well as the probability that a drive will fail, but we calculated that we could take advantage of erasure coding in designing the Backblaze B2 Storage Cloud for 11 nines of durability with just 25% overhead in the amount of storage required. 

How does this work? Briefly, when we store a file, we split it into 16 equal-sized pieces, or shards. We then calculate four more shards, called parity shards, in such a way that the original file can be reconstructed from any 16 of the 20 shards. We then store the resulting 20 shards on 20 different drives, each in a separate Storage Pod (storage server).

Note: As hard disk drive capacity increases, so does the time required to recover after a drive failure, so we periodically adjust the ratio between data shards and parity shards to maintain our eleven nines durability target. Consequently, our very newest vaults use a 15+5 scheme.

If a drive does fail, it can be replaced with a new drive, and its data rebuilt from the remaining good drives. We open sourced our implementation of Reed-Solomon erasure coding, so you can dive into the source code for more details.
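For a feel of the erasure coding math, here’s a toy model that treats each shard as living on an independent drive and asks how likely it is that more shards fail in a year than parity can cover. It deliberately ignores drive replacement and rebuild, which is exactly why its answer falls far short of 11 nines; the real calculation, accounting for rebuild windows of days rather than a year, is what closes that gap.

```python
from math import comb

def toy_durability(data_shards, parity_shards, afr=0.01):
    """P(file survives a year) if a file is lost only when more than
    `parity_shards` of its shards fail, with no rebuilds. A deliberate
    simplification of the real (rebuild-aware) durability model."""
    n = data_shards + parity_shards
    p_loss = sum(comb(n, k) * afr**k * (1 - afr)**(n - k)
                 for k in range(parity_shards + 1, n + 1))
    return 1 - p_loss

print(f"16+4 scheme: {toy_durability(16, 4):.7%}")
print(f"15+5 scheme: {toy_durability(15, 5):.7%}")
```

Even this pessimistic no-rebuild model lands around five nines for 16+4, and it shows how the 15+5 scheme, with more parity relative to data, buys roughly two more nines of headroom as drives get bigger and rebuilds take longer.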

Additional Factors Impacting Cloud Storage Performance

In addition to bandwidth and latency, there are a few additional factors that impact cloud storage performance, including:

  • The size of your files.
  • The number of files you upload or download.
  • Block (part) size.
  • The amount of available memory on your machine. 

Small files—that is, those less than 5GB—can be uploaded in a single API call. Larger files, from 5MB to 10TB, can be uploaded as “parts”, in multiple API calls. You’ll notice that there is quite an overlap here! For uploading files between 5MB and 5GB, is it better to upload them in a single API call, or split them into parts? What is the optimum part size? For backup applications, which typically split all data into equal-sized blocks, storing each block as a file, what is the optimum block size? As with many questions, the answer is that it depends.

Remember latency? Each API call incurs a more-or-less fixed overhead due to latency. For a 1GB file, assuming a single thread of execution, uploading all 1GB in a single API call will be faster than ten API calls each uploading a 100MB part, since those additional nine API calls each incur some latency overhead. So, bigger is better, right?

Not necessarily. Multi-threading, as mentioned above, affords us the opportunity to upload multiple parts simultaneously, which improves performance—but there are trade-offs. Typically, each part must be stored in memory as it is uploaded, so more threads means more memory consumption. If the number of threads multiplied by the part size exceeds available memory, then either the application will fail with an out of memory error, or data will be swapped to disk, reducing performance.
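A quick back-of-the-envelope check captures that constraint. The numbers here are purely illustrative:

```python
# Peak upload buffer footprint is roughly threads * part size; keep it
# comfortably under available memory or the app will swap or fail.
threads = 8
part_size_mb = 100
available_memory_mb = 2048

footprint_mb = threads * part_size_mb
verdict = "OK" if footprint_mb < available_memory_mb else "reduce threads or part size"
print(f"{footprint_mb}MB of buffers vs {available_memory_mb}MB available: {verdict}")
```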

Downloading data offers even more flexibility, since applications can specify any portion of the file to download in each API call. Whether uploading or downloading, there is a maximum number of threads that will drive throughput to consume all of the available bandwidth. Exceeding this maximum will consume more memory, but provide no performance benefit. If you go back to our pipe analogy, you’ll have reached the maximum capacity of the pipe, so adding more pumps won’t make things move faster. 

So, what to do to get the best performance possible for your use case? Simple: customize your settings. 

Most backup and file transfer tools allow you to configure the number of threads and the amount of data to be transferred per API call, whether that’s block size or part size. If you are writing your own application, you should allow for these parameters to be configured. When it comes to deployment, some experimentation may be required to achieve maximum throughput given available memory.

How to Evaluate Cloud Performance

To sum up, the cloud is increasingly becoming a cornerstone of every company’s tech stack. Gartner predicts that by 2026, 75% of organizations will adopt a digital transformation model predicated on cloud as the fundamental underlying platform. So, cloud storage performance will likely be a consideration for your company in the next few years if it isn’t already.

It’s important to consider that cloud storage performance can be highly subjective and heavily influenced by things like use case considerations (i.e. backup and archive versus application storage, media workflow, or another), end user bandwidth and throughput, file size, block size, etc. Any evaluation of cloud performance should take these factors into account rather than simply relying on metrics in isolation. And, a holistic cloud strategy will likely have multiple operational schemas to optimize resources for different use cases.

Wait, Aren’t You, Backblaze, a Cloud Storage Company?

Why, yes. Thank you for noticing. We ARE a cloud storage company, and we OFTEN get questions about all of the topics above. In fact, that’s why we put this guide together—our customers and prospects are the best sources of content ideas we can think of. Circling back to the beginning, it bears repeating that performance is one factor to consider in addition to security and cost. (And, hey, we would be remiss not to mention that we’re also one-fifth the cost of AWS S3.) Ultimately, whether you choose Backblaze B2 Cloud Storage or not though, we hope the information is useful to you. Let us know if there’s anything we missed.

The post Cloud Storage Performance: The Metrics that Matter appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

The Power of Specialized Cloud Providers: A Game Changer for SaaS Companies

Post Syndicated from Amrit Singh original https://www.backblaze.com/blog/the-power-of-specialized-cloud-providers-a-game-changer-for-saas-companies/

A decorative image showing a cloud with the Backblaze logo, with logos for Vultr, Fastly, Equinix Metal, Terraform, and rclone hanging off of it.

“Nobody ever got fired for buying AWS.” It’s true: AWS’s one-size-fits-all solution worked great for most businesses, and those businesses made the shift away from the traditional model of on-prem and self-hosted servers—what we think of as Cloud 1.0—to an era where AWS was the cloud, the one and only, which is what we call Cloud 2.0. However, as the cloud landscape evolves, it’s time to question the old ways. Maybe nobody ever got fired for buying AWS, but these days, you can certainly get a lot of value (and kudos) for exploring other options. 

Developers and IT teams might hesitate when it comes to moving away from AWS, but AWS comes with risks, too. If you don’t have the resources to manage and maintain your infrastructure, costs can get out of control, for one. As we enter Cloud 3.0 where the landscape is defined by the open, multi-cloud internet, there is an emerging trend that is worth considering: the rise of specialized cloud providers.

Today, I’m sharing how software as a service (SaaS) startups and modern businesses can take advantage of these highly focused, tailored services, each specializing and excelling in specific areas like cloud storage, content delivery, cloud compute, and more. Building on a specialized stack offers more control, return on investment, and flexibility, while delivering the same performance you expect from hyperscaler infrastructure.

From a cost of goods sold perspective, AWS pricing wasn’t a great fit. From an engineering perspective, we didn’t want a net-new platform. So the fact that we got both with Backblaze—a drop-in API replacement with a much better cost structure—it was just a no-brainer.

—Rory Petty, Co-Founder & CTO, Tribute

The Rise of Specialized Cloud Providers

Specialized providers—including content delivery networks (CDNs) like Fastly, bunny.net, and Cloudflare, as well as cloud compute providers like Vultr—offer services that focus on a particular area of the infrastructure stack. Rather than trying to be everything to everyone, like the hyperscalers of Cloud 2.0, they do one thing and do it really well. Customers get best-of-breed services that allow them to build a tech stack tailored to their needs. 

Use Cases for Specialized Cloud Providers

There are a number of businesses that might benefit from switching from hyperscalers to specialized cloud providers.

In order for businesses to take advantage of the benefits (since most applications rely on more than just one service), these services must work together seamlessly. 

Let’s Take a Closer Look at How Specialized Stacks Can Work For You

If you’re wondering how exactly specialized clouds can “play well with each other,” we ran a whole series of application storage webinars that talk through specific examples and use cases. I’ll share what’s in it for you below.

1. Low Latency Multi-Region Content Delivery with Fastly and Backblaze

Did you know a 100-millisecond delay in website load time can hurt conversion rates by 7%? In this session, Pat Patterson from Backblaze and Jim Bartos from Fastly discuss the importance of speed and latency in user experience. They highlight how Backblaze’s B2 Cloud Storage and Fastly’s content delivery network work together to deliver content quickly and efficiently across multiple regions. Businesses can ensure that their content is delivered with low latency, reducing delays and optimizing user experience regardless of the user’s location.

2. Scaling Media Delivery Workflows with bunny.net and Backblaze

Delivering content to your end users at scale can be challenging and costly. Users expect exceptional web and mobile experiences with snappy load times and zero buffering. Anything less than an instantaneous response may cause them to bounce. 

In this webinar, Pat Patterson demonstrates how to efficiently scale your content delivery workflows from content ingestion, transcoding, and storage to last-mile acceleration via the bunny.net CDN. Pat demonstrates how to build a video hosting platform called “Cat Tube” and shows how to upload a video and play it using the HTML5 video element with controls. Watch below and download the demo code to try it yourself.

3. Balancing Cloud Cost and Performance with Fastly and Backblaze

With a global economic slowdown, IT and development teams are looking for ways to slash cloud budgets without compromising performance. E-commerce, SaaS platforms, and streaming applications all rely on high-performance infrastructure, but balancing bandwidth and storage costs can be challenging. In this 45-minute session, we explored how to recession-proof your growing business with key cloud optimization strategies, including ways to leverage Fastly’s CDN to balance bandwidth costs while avoiding performance tradeoffs.

4. Reducing Cloud OpEx Without Sacrificing Performance and Speed

Greg Hamer from Backblaze and DJ Johnson from Vultr explore the benefits of building on best-of-breed, specialized cloud stacks tailored to your business model, rather than being locked into traditional hyperscaler infrastructure. They cover real-world use cases, including:

  • How Can Stock Photo broke free from AWS and reduced their cloud bill by 55% while achieving 4x faster generation.
  • How Monument Labs launched a new cloud-based photo management service to 25,000+ users.
  • How Black.ai processes thousands of files simultaneously, with a significant reduction in infrastructure costs.

5. Leveling Up a Global Gaming Platform while Slashing Cloud Spend by 85%

James Ross of Nodecraft, an online gaming platform that aims to make gaming online easy, shares how he moved his global game server platform from Amazon S3 to Backblaze B2 for greater flexibility and 85% savings on storage and egress. He discusses the challenges of managing large files over the public internet, which can result in expensive bandwidth costs. By storing game titles on Backblaze B2 and delivering them through Cloudflare’s CDN, they achieve reduced latency since games are cached at the edge, and pay zero egress fees thanks to the Bandwidth Alliance. Nodecraft also benefited from Universal Data Migration, which allows customers to move large amounts of data from any cloud services or on-premises storage to Backblaze’s B2 Cloud Storage, managed by Backblaze and free of charge.

Migrating From a Hyperscaler

Though it may seem daunting to transition from a hyperscaler to a specialized cloud provider, it doesn’t have to be. Many specialized providers offer tools and services to make the transition as smooth as possible. 

  • S3-compatible APIs, SDKs, CLI: Interface with storage as you would with Amazon S3—switching can be as easy as dropping in a new storage target (see the sketch after this list).
  • Universal Data Migration: Free and fully managed migrations to make switching as seamless as possible.
  • Free egress: Move data freely with the Bandwidth Alliance and other partnerships between specialized cloud storage providers.
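As a sketch of the drop-in idea in the first bullet: the same boto3 code works against any S3-compatible service, and switching storage targets is mostly a matter of changing the endpoint URL and credentials. The endpoint, bucket, and key names below are placeholders; use the endpoint and application keys for your own account.

```python
import boto3

# Point a standard S3 client at an S3-compatible endpoint instead of AWS.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.us-west-004.backblazeb2.com",  # example region
    aws_access_key_id="YOUR_KEY_ID",
    aws_secret_access_key="YOUR_APPLICATION_KEY",
)

# The rest of the code is unchanged from what you'd run against Amazon S3.
s3.upload_file("app.tar.gz", "my-bucket", "releases/app.tar.gz")
for obj in s3.list_objects_v2(Bucket="my-bucket", Prefix="releases/")["Contents"]:
    print(obj["Key"], obj["Size"])
```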

As the decision maker at your growing SaaS company, it’s worth considering whether a specialized cloud stack could be a better fit for your business. By doing so, you could potentially unlock cost savings, improve performance, and gain flexibility to adapt your services to your unique needs. The one-size-fits-all approach is no longer the only option out there.

Want to Test It Out Yourself?

Take a proactive approach to cloud cost management: Get 10GB free to test and validate your proof of concept (POC) with Backblaze B2. All it takes is an email to get started.


The post The Power of Specialized Cloud Providers: A Game Changer for SaaS Companies appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

The Free Credit Trap: Building SaaS Infrastructure for Long-Term Sustainability

Post Syndicated from Amrit Singh original https://www.backblaze.com/blog/the-free-credit-trap-building-saas-infrastructure-for-long-term-sustainability/

In today’s economic climate, cost cutting is on everyone’s mind, and businesses are doing everything they can to save money. But, equally important, they can’t afford to compromise the integrity of their infrastructure or the quality of the customer experience. As a startup, taking advantage of free cloud credits from providers like AWS, especially at a time like this, seems enticing.

Using those credits can make sense, but it takes more planning than you might think to use them in a way that allows you to continue managing cloud costs once the credits run out. 

In this blog post, I’ll walk through common use cases for credit programs, the risks of using credits, and alternatives that help you balance growth and cloud costs.

The True Cost of “Free”

This post is part of a series exploring free cloud credits and the hidden complexities and limitations that come with these offers.

The Shift to Cloud 3.0

As we see it, there have been three stages of “The Cloud” in its history:

Phase 1: What is the Cloud?

Starting around when Backblaze was founded in 2007, the public cloud was in its infancy. Most people weren’t clear on what cloud computing was or if it was going to take root. Businesses were asking themselves, “What is the cloud and how will it work with my business?”

Phase 2: Cloud = Amazon Web Services

Fast forward to 10 years later, and AWS and “The Cloud” started to become synonymous. Amazon had nearly 50% of market share of public cloud services, more than Microsoft, Google, and IBM combined. “The Cloud” was well-established, and for most folks, the cloud was AWS.

Phase 3: Multi-Cloud

Today, we’re in Phase 3 of the cloud. “The Cloud” of today is defined by the open, multi-cloud internet. Traditional cloud vendors are expensive, complicated, and seek to lock customers into their walled gardens. Customers have come to realize that (see below) and to value the benefits they can get from moving away from a model that demands exclusivity in cloud infrastructure.

A tweet from Philo Hermans (@Philo01): “I migrated most infrastructure away from AWS. Now that I think about it, those AWS credits are a well-designed trap to create a vendor lock in, and once your credits expire and you notice the actual cost, chances are you are in shock and stuck at the same time (laughing emoji).”
Source.

In Cloud Phase 3.0, companies are looking to rein in spending, and are increasingly seeking specialized cloud providers offering affordable, best-of-breed services without sacrificing speed and performance. How do you balance that with the draw of free credits? I’ll get into that next, and the two are far from mutually exclusive.

Getting Hooked on Credits: Common Use Cases

So, you have $100k in free cloud credits from AWS. What do you do with them? Well, in our experience, there are a wide range of use cases for credits, including:

  • App development and testing: Teams may leverage credits to run an app development proof of concept (PoC) utilizing Amazon EC2, RDS, and S3 for compute, database, and storage needs, for example, but without understanding how these will scale in the longer term, there may be risks involved. Spinning up EC2 instances can quickly lead to burning through your credits and getting hit with an unexpected bill.
  • Machine learning (ML): Machine learning models require huge amounts of computing power and storage. Free cloud credits might be a good way to start, but you can expect them to quickly run out if you’re using them for this use case. 
  • Data analytics: While free cloud credits may cover storage and computing resources, data transfer costs might still apply. Analyzing large volumes of data or frequently transferring data in and out of the cloud can lead to unexpected expenses.
  • Website hosting: Hosting your website with free cloud credits can eliminate the up front infrastructure spend and provide an entry point into the cloud, but remember that when the credits expire, traffic spikes you should be celebrating can crater your bottom line.
  • Backup and disaster recovery: Free cloud credits may have restrictions on data retention, limiting the duration for which backups can be stored. This can pose challenges for organizations requiring long-term data retention for compliance or disaster recovery purposes.

All of this is to say: Proper configuration, long-term management and upkeep, and cost optimization all play a role on how you scale on monolith platforms. It is important to note that the risks and benefits mentioned above are general considerations, and specific terms and conditions may vary depending on the cloud service provider and the details of their free credit offerings. It’s crucial to thoroughly review the terms and plan accordingly to maximize the benefits and mitigate the risks associated with free cloud credits for each specific use case. (And, given the complicated pricing structures we mentioned before, that might take some effort.)

Monument Uses Free Credits Wisely

Monument, a photo management service with a strong focus on security and privacy, utilized free startup credits from AWS. But, they knew free credits wouldn’t last forever. Monument’s co-founder, Ercan Erciyes, realized they’d ultimately lose money if they built the infrastructure for Monument Cloud on AWS.

He also didn’t want to accumulate tech debt and become locked into AWS. Rather than using the credits to build a minimum viable product as fast as humanly possible, he used the credits to develop the AI model, but not to build their infrastructure. Read more about how they put AWS credits to use while building infrastructure that could scale as they grew.

➔ Read More

The Risks of AWS Credits: Lessons from Founders

If you’re handed $100,000 in credits, it’s crucial to be aware of the risks and implications that come along with it. While it may seem like an exciting opportunity to explore the capabilities of the cloud without immediate financial constraints, there are several factors to consider:

  1. The temptation to overspend: With a credit balance at your disposal just waiting to be spent, there is a possibility of underestimating the actual costs of your cloud usage. This can lead to a scenario where you inadvertently exhaust the credits sooner than anticipated, leaving you with unexpected expenses that may strain your budget.
  2. The shock of high bills once credits expire: Without proper planning and monitoring of your cloud usage, the transition from “free” to paying for services can result in high bills that catch you off guard. It is essential to closely track your cloud usage throughout the credit period and have a clear understanding of the costs associated with the services you’re utilizing. Or better yet, use those credits for a discrete project to test your PoC or develop your minimum viable product, and plan to build your long-term infrastructure elsewhere.
  3. The risk of vendor lock-in: As you build and deploy your infrastructure within a specific cloud provider’s ecosystem, the process of migrating to an alternative provider can seem complex and can definitely be costly (shameless plug: at Backblaze, we’ll cover your migration over 50TB). Vendor lock-in can limit your flexibility, making it challenging to adapt to changing business needs or take advantage of cost-saving opportunities in the future.

The problems are nothing new for founders, as the online conversation bears out.

First, there’s the old surprise bill:

A tweet from Ajul Sahul (@anjuls): “Similar story, AWS provided us free credits so we though we will use it for some data processing tasks. The credit expired after one year and team forgot about the abandoned resources to give a surprise bill. Cloud governance is super importance right from the start.”
Source.

Even with some optimization, AWS cloud spend can still be pretty “obscene” as this user vividly shows:

A tweet from DHH (@dhh): “We spent $3,201,564.24 on cloud in 2022 at @37signals, mostly AWS. $907,837.83 on S3. $473,196.30 on RDS. $519,959.60 on OpenSearch. $123,852.30 on Elasticache. This is with long commits (S3 for 4 years!!), reserved instances, etc. Just obscene. Will publish full accounting soon.”
Source.

There’s the founder raising rounds just to pay AWS bills:

A tweet from Guille Ojeda (@itsguilleojeda): “Tech first startups raise their first rounds to pay AWS bills. By the way, there's free credits, in case you didn't know. Up to $100k. And you'll still need funding.”
Source.

Some use the surprise bill as motivation to get paying customers.

Lastly, there’s the comic relief:

A tweet from Mrinal Wahal (@MrinalWahal): “Yeah high credit card bills are scary but have you forgotten turning off your AWS instances?”
Source.

Strategies for Balancing Growth and Cloud Costs

Where does that leave you today? Here are some best practices startups and early founders can implement to balance growth and cloud costs:

  1. Establishing a cloud cost management plan early on.
  2. Monitoring and optimizing cloud usage to avoid wasted resources.
  3. Leveraging multiple cloud providers.
  4. Moving to a new cloud provider altogether.
  5. Setting aside some of your credits for the migration.

1. Establishing a Cloud Cost Management Plan

Put some time into creating a well-thought-out cloud cost management strategy from the beginning. This includes closely monitoring your usage, optimizing resource allocation, and planning for the expiration of credits to ensure a smooth transition. By understanding the risks involved and proactively managing your cloud usage, you can maximize the benefits of the credits while minimizing potential financial setbacks and vendor lock-in concerns.

2. Monitoring and Optimizing Cloud Usage

Monitoring and optimizing cloud usage plays a vital role in avoiding wasted resources and controlling costs. By regularly analyzing usage patterns, organizations can identify opportunities to right-size resources, adopt automation to reduce idle time, and leverage cost-effective pricing options. Effective monitoring and optimization ensure that businesses are only paying for the resources they truly need, maximizing cost efficiency while maintaining the necessary levels of performance and scalability.

3. Leveraging Multiple Cloud Providers

By adopting a multi-cloud strategy, businesses can diversify their cloud infrastructure and services across different providers. This allows them to benefit from each provider’s unique offerings, such as specialized services, geographical coverage, or pricing models. Additionally, it provides a layer of protection against potential service disruptions or price increases from a single provider. Adopting a multi-cloud approach requires careful planning and management to ensure compatibility, data integration, and consistent security measures across multiple platforms. However, it offers the flexibility to choose the best-fit cloud services from different providers, reducing dependency on a single vendor and enabling businesses to optimize costs while harnessing the capabilities of various cloud platforms.

4. Moving to a New Cloud Provider Altogether

If you’re already deeply invested in a major cloud platform, shifting away can seem cumbersome, but there may be long-term benefits that outweigh the short term “pains” (this leads into the shift to Cloud 3.0). The process could involve re-architecting applications, migrating data, and retraining personnel on the new platform. However, factors such as pricing models, performance, scalability, or access to specialized services may win out in the end. It’s worth noting that many specialized providers have taken measures to “ease the pain” and make the transition away from AWS more seamless without overhauling code. For example, at Backblaze, we developed an S3 compatible API so switching providers is as simple as dropping in a new storage target.

5. Setting Aside Credits for the Migration

By setting aside credits for future migration, businesses can ensure they have the necessary resources to transition to a different provider without incurring significant up front expenses like egress fees to transfer large data sets. This strategic allocation of credits allows organizations to explore alternative cloud platforms, evaluate their pricing models, and assess the cost-effectiveness of migrating their infrastructure and services without worrying about being able to afford the migration.

Welcome to Cloud 3.0: Alternatives to AWS

In 2022, David Heinemeier Hansson, the creator of Basecamp and Hey, announced that he was moving Hey’s infrastructure from AWS to on-premises. Hansson cited the high cost of AWS as one of the reasons for the move. His estimate? “We stand to save $7m over five years from our cloud exit,” he said.  

Going back to on-premises solutions is certainly one answer to the problem of AWS bills. In fact, when we started designing Backblaze’s Personal Backup solution, we were faced with the same problem. Hosting data storage for our computer backup product on AWS was a non-starter—it was going to be too expensive, and our business wouldn’t be able to deliver a reasonable consumer price point and be solvent. So, we didn’t just invest in on-premises resources: We built our own Storage Pods, the first evolution of the Backblaze Storage Cloud. 

But, moving back to on-premises solutions isn’t the only answer—it’s just the only answer if it’s 2007 and your two options are AWS and on-premises solutions. The cloud environment as it exists today has better choices. We’ve now grown that collection of Storage Pods into the Backblaze B2 Storage Cloud, which delivers performant, interoperable storage at one-fifth the cost of AWS. And, we offer free egress to our content delivery network (CDN) and compute partners. Backblaze may provide an even more cost-effective solution for mid-sized SaaS startups looking to save on cloud costs while maintaining speed and performance.

As we transition to Cloud 3.0 in 2023 and beyond, companies are expected to undergo a shift, reevaluating their cloud spending to ensure long-term sustainability and directing saved funds into other critical areas of their businesses. The age of limited choices is over. The age of customizable cloud integration is here. 

So, shout out to David Heinemeier Hansson: We’d love to chat about your storage bills some time.

Want to Test It Yourself?

Take a proactive approach to cloud cost management: If you’ve got more than 50TB of data storage or want to check out our capacity-based pricing model, B2 Reserve, contact our Sales Team to run a free proof of concept (PoC) with Backblaze B2.

And, for the streamlined, self-serve option, all you need is an email to get started today.

FAQs About Cloud Spend

If you’re thinking about moving to Backblaze B2 after taking AWS credits, but you’re not sure if it’s right for you, we’ve put together some frequently asked questions that folks have shared with us before their migrations:

My cloud credits are running out. What should I do?

Backblaze’s Universal Data Migration service can help you off-load some of your data to Backblaze B2 for free. Speak with a migration expert today.

AWS has all of the services I need, and Backblaze only offers storage. What about the other services I need?

Shifting away from AWS doesn’t mean ditching the workflows you have already set up. You can migrate some of your data storage while keeping some on AWS or continuing to use other AWS services. Moreover, AWS may be overkill for small to midsize SaaS businesses with limited resources.

How should I approach a migration?

Identify the specific services and functionalities that your applications and systems require, such as CDN for content delivery or compute resources for processing tasks. Check out our partner ecosystem to identify other independent cloud providers that offer the services you need at a lower cost than AWS.

What CDN partners does Backblaze have?

Our CDN partners include Fastly, bunny.net, and Cloudflare, and we extend free egress to joint customers. With ease of use, predictable pricing, and zero egress fees, our joint solutions are perfect for businesses looking to reduce their IT costs, improve their operational efficiency, and increase their competitive advantage in the market.

What compute partners does Backblaze have?

Our compute partners include Vultr and Equinix Metal. You can connect Backblaze B2 Cloud Storage with Vultr’s global compute network to access, store, and scale application data on-demand, at a fraction of the cost of the hyperscalers.

The post The Free Credit Trap: Building SaaS Infrastructure for Long-Term Sustainability appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

CDN Bandwidth Fees: What You Need to Know

Post Syndicated from Molly Clancy original https://www.backblaze.com/blog/cdn-bandwidth-fees-what-you-need-to-know/

A decorative image showing a cloud with three dollar signs and the word "Egress", three CDN nodes, and a series of 0s and 1s representing data.

You know that sinking feeling you get in your stomach when you receive a hefty bill you weren’t expecting? That is what some content delivery network (CDN) customers experience when they get slammed with bandwidth fees without warning. To avoid that sinking feeling, it’s important to understand how bandwidth fees work. It’s critical to know precisely what you are paying for and how you use the cloud service before you get hit with an eye-popping bill you can’t pay.

A content delivery network is an excellent way to speed up your website and improve performance and SEO, but not all vendors are created equal. Some charge more for data transfer than others. As the leading specialized cloud storage provider, we have developed partnerships with many top CDN providers, giving us the advantage of fully understanding how their services work and what they charge.

So, let’s talk about bandwidth fees and how they work to help you decide which CDN provider is right for you.

What Are CDN Bandwidth Fees?

Most CDN cloud services work like this: You can configure the CDN to pull data from one or more origins (such as a Backblaze B2 Cloud Storage Bucket) for free or for a flat fee, and then you’re charged fees for usage, namely when data is transferred when a user requests it. These are known as bandwidth, download, or data transfer fees. (We’ll use these terms somewhat interchangeably.) Typically, storage providers also charge egress fees when data is called up by a CDN.

The fees aren’t a problem in and of themselves, but if you don’t have a good understanding of them, successes you should be celebrating can be counterbalanced by overhead. For example, let’s say you’re a small game-sharing platform, and one of your games goes viral. Bandwidth and egress fees can add up quickly in a case like this. CDN providers charge in arrears, meaning they wait to see how much of the data was accessed each month, and then they apply their fees.

Thus, monitoring and managing data transfer fees can be incredibly challenging. Although some services offer a calculation tool, you could still receive a shock bill at the end of the month. It’s important to know exactly how these fees work so you can plan your workflows better and strategically position your content where it will be the most efficient.
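
To put rough numbers on the example above: if that viral game is a 2GB download and 5,000 players grab it in a month, that’s 10TB of transfer. At an illustrative $0.05/GB CDN bandwidth rate plus $0.01/GB origin egress (check your providers’ actual price sheets), the surprise line item works out to roughly 10,000GB × $0.06 = $600 for that one title, before you’ve counted the rest of your traffic.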

How Do CDN Bandwidth Fees Work?

Data transfer occurs when data leaves the network. An example might be when your application server serves an HTML page to the browser or your cloud object store serves an image, in each case via the CDN. Another example is when your data is moved to a different regional server within the CDN to be more efficiently accessed by users close to it.

A decorative photo of a sign that says "$5 fee per usage for non-members."

There are dozens of instances where your data may be accessed or moved, and every bit adds up. Typically, CDN vendors charge a fee per GB or TB up to a specific limit. Once you hit these thresholds, you may advance up another pricing tier. A busy month could cost you a mint, and traffic spikes for different reasons in different industries—like a Black Friday rush for an e-commerce site or around events like the Super Bowl for a sports betting site, for example.

To give you some perspective, Apple spent more than $50 million on data transfer fees in a single year, Netflix $15 million, and Adobe and Salesforce more than $7 million, according to The Information. You can see how quickly these fees can add up and break the bank.

Price Comparison of Bandwidth Fees Across CDN Services

To get a better sense of how each CDN service charges for bandwidth, let’s explore the top providers and what they offer and charge.

As part of the Bandwidth Alliance, some of these vendors have agreed to discount customer data transfer fees when transferring one or both ways between member companies. What’s more, Backblaze offers customers free egress or discounts above and beyond what the Bandwidth Alliance provides.

Note: Prices are as published by vendors as of 3/16/2023.

Fastly

Fastly offers edge caches to deliver content instantly around the globe. The company also offers SSL services for $20 per domain, per month. They have various additional add-ons for things like web application firewalls (WAFs), managed rules, DDoS protection, and their Gold support.

Fastly bases its pricing structure on usage. They have three tiered plans:

  1. Essential: up to 3TB of global delivery per month.
  2. Professional: up to 10TB of global delivery per month.
  3. Enterprise: unlimited global delivery.

They bill customers a minimum of $50/month for bandwidth and request usage.

bunny.net

bunny.net labels itself as the world’s lightning-fast CDN service. They price their CDN services based on region. For North America and Europe, prices begin at $0.01/GB per month. For companies transferring more than 100TB per month, you must call for pricing. If you have high bandwidth needs, bunny.net also offers a cost-optimized network with fewer PoPs (points of presence) at $0.005/GB per month.

Cloudflare

Cloudflare offers a limited free plan for hobbyists and individuals. They also have tiered pricing plans for businesses called Pro, Business, and Enterprise. Instead of charging bandwidth fees, Cloudflare opts for the monthly subscription model, which includes everything.

The Pro plan costs $20/month (with a 100MB upload size limit). The Business plan is $200/month (200MB upload size limit). You must call to get pricing for the Enterprise plan (500MB upload size limit).

Cloudflare also offers dozens of add-ons for load balancing, smart routing, security, serverless functions, etc. Each one costs extra per month.

AWS CloudFront

AWS CloudFront is Amazon’s CDN and is tightly integrated with its AWS services. The company offers tiered pricing based on bandwidth usage. The specifics are as follows for North America:

  • $0.085/GB up to the first 10TB per month.
  • $0.080/GB for the next 40TB per month.
  • $0.060/GB for the next 100TB per month.
  • $0.040/GB for the next 350TB per month.
  • $0.030/GB for the next 524TB per month.

Their pricing extends up to 5PB per month, and there are different pricing breakdowns for different regions.
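
To see how those tiers compound, here’s a small sketch that estimates a monthly bill from the North America rates above (treating 1TB as 1,000GB for simplicity; this is an illustration, not a quote):

```typescript
// Published CloudFront North America rates as of 3/16/2023 (see above).
const TIERS: Array<[sizeGb: number, pricePerGb: number]> = [
  [10_000, 0.085],  // first 10TB
  [40_000, 0.080],  // next 40TB
  [100_000, 0.060], // next 100TB
  [350_000, 0.040], // next 350TB
  [524_000, 0.030], // next 524TB
];

// Walk down the tiers, billing each slice of usage at its tier's rate.
function estimateMonthlyCost(totalGb: number): number {
  let remaining = totalGb;
  let cost = 0;
  for (const [sizeGb, pricePerGb] of TIERS) {
    const billed = Math.min(remaining, sizeGb);
    cost += billed * pricePerGb;
    remaining -= billed;
    if (remaining <= 0) break;
  }
  return cost;
}

// 60TB in a month: 10TB at $0.085 + 40TB at $0.080 + 10TB at $0.060 = $4,650.
console.log(estimateMonthlyCost(60_000));
```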

Amazon offers special discounts for high-data users and those customers who use AWS as their application storage container. You can also purchase add-on products that work with the CDN for media streaming and security.

A decorative image showing a portion of the earth viewed from space with lights clustered around city centers.
Sure it’s pretty. Until you know all those lights represent possible fees.

Google Cloud CDN

Google Cloud CDN offers fast and reliable content delivery services. However, Google charges for cache egress (bandwidth), cache fill, and cache lookup requests. Their pricing structure is as follows:

  • Cache Egress: $0.02–$0.20 per GB.
  • Cache Fill: $0.01–$0.04 per GB.
  • Cache Lookup Requests: $0.0075 per 10,000 requests.

Cache egress fees are priced per region, and in the U.S., they start at $0.08 per GB for the first 10TB. Between 10TB and 150TB costs $0.055 per GB, and beyond 500TB, you have to call for pricing. Cache fill starts at $0.01 per GB.

Microsoft Azure

The Azure content delivery network is Microsoft’s offering that promises speed, reliability, and a high level of security.

Azure offers a limited free account for individuals to play around with. For business customers, they offer the following price structure:

Depending on the zone, the price will vary for data transfer. For Zone One, which includes North America, Europe, Middle East, and Africa, pricing is as follows:

  • First 10TB: $0.158/GB per month.
  • Next 40TB: $0.14/GB per month.
  • Next 100TB: $0.121/GB per month.
  • Next 350TB: $0.102/GB per month.
  • Next 500TB: $0.093/GB per month.
  • Next 4,000TB: $0.084/GB per month.

Azure charges $0.60 per 1,000,000,000 requests per month and $1 per rule per month. You can also purchase WAF services and other products for an additional monthly fee.

How to Save on Bandwidth Fees

A CDN can significantly enhance the performance of your website or web application and is well worth the investment. However, finding ways to save is helpful. Many of the CDN providers listed above are members of the Bandwidth Alliance and have agreed to offer discounted rates for bandwidth and egress fees. Another way to save money each month is to find affordable origin storage that works seamlessly with your chosen CDN provider. Here at Backblaze, we think the world needs lower egress fees, and we offer free egress between Backblaze B2 and many CDN partners like Fastly, bunny.net, and Cloudflare.

The post CDN Bandwidth Fees: What You Need to Know appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze Joins the CDN Alliance

Post Syndicated from Elton Carneiro original https://www.backblaze.com/blog/backblaze-joins-the-cdn-alliance/

A decorative image that features the Backblaze logo and the CDN Alliance logo.

As the leading specialized storage cloud platform, Backblaze is a big proponent of the open, collaborative nature of independent cloud service providers. From our participation in the Bandwidth Alliance to our large ecosystem of partners, we’re focused on what we call “Phase Three” of the cloud. What’s happening in Phase Three? The age of walled gardens, hidden fees, and out-of-control egress fees driven by the hyperscalers is in the past. Today’s specialized cloud solutions are oriented toward what’s best for users—an open, multi-cloud internet.

Which is why I’m particularly happy to announce today that we’ve joined the CDN Alliance, a nonprofit organization and community of industry leaders focused on ensuring that the content delivery network (CDN) industry is evolving in a way that best serves businesses distributing content of every type around the world—from streaming media to stock image resources to e-commerce and more.

The majority of the content we consume today on our many devices and platforms is being delivered through a CDN. Being part of the CDN Alliance allows Backblaze to collaborate and drive innovation with our peers to ensure that everyone’s content experience only gets better.

Through participation in and sponsorships of joint events, panels, and gatherings, we look forward to working with the CDN Alliance on the key challenges facing the industry, including availability, scalability, reliability, privacy, security, sustainability, interoperability, education, certification, regulations, and numerous others. Check out the CDN Alliance and its CDN Community for more info.

For more on CDN integrations with Backblaze B2 Cloud Storage, you can read about our top partners here.

The post Backblaze Joins the CDN Alliance appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

AWS CloudFront vs. bunny.net: How Do the CDNs Compare?

Post Syndicated from Molly Clancy original https://www.backblaze.com/blog/aws-cloudfront-vs-bunny-net-how-do-the-cdns-compare/

Remember the story about the hare and the tortoise? Well, this is not that story, but we are comparing bunny.net with another global content delivery network (CDN) provider, AWS CloudFront, to see how the two stack up. When you think of rabbits, you automatically think of speed, but a CDN is not just about speed; sometimes, other factors “win the race.”

As a leading specialized cloud storage provider, we provide application storage that folks use with many of the top CDNs. Working with these vendors allows us deep insight into the features of each platform so we can share the information with you. Read on to get our take on these two leading CDNs.

What Is a CDN?

A CDN is a network of servers dispersed around the globe that host content closer to end users to speed up website performance. Let’s say you keep your website content on a server in New York City. If you use a CDN, when a user in Las Vegas calls up your website, the request can pull your content from a server in, say, Phoenix instead of going all the way to New York. This is known as caching. A CDN’s job is to reduce latency and improve the responsiveness of online content.

Join the Webinar

Tune in to our webinar on Tuesday, February 28, 2023 at 10:00 a.m. PT/1:00 p.m. ET to learn how you can leverage bunny.net’s CDN and Backblaze B2 to accelerate content delivery and scale media workflows with zero-cost egress.

Sign Up for the Webinar

CDN Use Cases

Before we compare these two CDNs, it’s important to understand how they might fit into your overall tech stack. Some common use cases for a CDN include:

  • Website Reliability: If your website server goes down and you have a CDN in place, the CDN can continue to serve up static content to your customers. Not only can a CDN speed up your website performance tremendously, but it can also keep your online presence up and running, keeping your customers happy.
  • App Optimization: Internet apps use a lot of dynamic content. A CDN can optimize that content and keep your apps running smoothly without any glitches, regardless of where in the world your users access them.
  • Streaming Video and Media: Streaming media is essential to keep customers engaged these days. Companies that offer high-resolution video services need to know that their customers won’t be bothered by buffering or slow speeds. A CDN can quickly solve this problem by hosting 8K videos and delivering reliable streams across the globe.
  • Scalability: Various times of the year are busier than others—think Black Friday. If you want the ultimate scalability, a CDN can help buffer the traffic coming into your website and ease the burden on the origin server.
  • Gaming: Video game fans know nothing is worse than having your favorite online duel lock up during gameplay. Video game providers use CDNs to host high-resolution content, so all their games run flawlessly to keep players engaged. They also use CDN platforms to roll out new updates and security patches without any limits.
  • Images/E-Commerce: Online retailers typically host thousands of images for their products so you can see every color, angle, and option available. A CDN is an excellent way to instantly deliver crystal clear, high-quality images without any speed issues or quality degradation.
  • Improved Security: CDN services often come with beefed-up security protocols, including distributed denial-of-service (DDoS) prevention across the platform and detection of suspicious behavior on the network.

Speed Tests: How Fast Can You Go?

Speed tests are a valuable tool that businesses can use to gauge site performance, page load times, and customer experience. You can use dozens of free online speed tests to evaluate time to first byte (TTFB) and the number of requests (how many times the browser has to make the request before the page loads). Some speed tests show other more advanced metrics.

A CDN is one aspect that can affect speed and performance, but there are other factors at play as well. A speed test can help you identify bottlenecks and other issues.

Comparing bunny.net vs. AWS CloudFront

Although bunny.net and AWS CloudFront provide CDN services, their features and technology work differently. You will want all of the details when deciding which CDN is right for your application.

bunny.net is a powerfully simple CDN that delivers content at lightning speeds across the globe. The service is scalable, affordable, and secure. They offer edge storage, optimization services, and DNS resources for small to large companies.

AWS CloudFront is a global CDN designed to work primarily with other AWS services. The service offers robust cloud-based resources for enterprise businesses.

Let’s compare all the features to get a good sense of how each CDN option stacks up. To best understand how the two CDNs compare, we’ll look at different aspects of each one so you can decide which option works best for you, including:

  • Network
  • Cache
  • Compression
  • DDoS Protection
  • Integrations
  • TLS Protocols
  • CORS Support
  • Signed Exchange Support
  • Pricing

Network

Distribution points are the number of servers within a CDN network. These points are distributed throughout the globe to reach users anywhere. When users request content through a website or app, the CDN connects them to the closest distribution point server to deliver the video, image, script, etc., as quickly as possible.

bunny.net

bunny.net has 114 global distribution points (also called points of presence or PoPs) in 113 cities and 77 countries. For high-bandwidth users, they also offer a separate, cost-optimized network of 10 PoPs. They don’t charge any request fees and offer multiple payment options.

AWS CloudFront

Currently, AWS CloudFront advertises that they have roughly 450 distribution points in 90 cities in 48 countries.

Our Take

While AWS CloudFront has many points in some major cities, bunny.net has a wider global distribution—AWS CloudFront covers 90 cities, and bunny.net covers 114. And bunny.net ranks first on CDNPerf, a third-party CDN performance analytics and comparison tool.

Cache

Caching files allows a CDN to serve up copies of your digital content from distribution points closer to end users, thus improving performance and reliability.

bunny.net

With their Origin Shield feature, when CDN nodes have a cache miss (meaning the content an end user wants isn’t at the node closest to them), the network directs the request to another node instead of the origin. They offer Perma-Cache, where you can permanently store your files at the edge for a 100% cache hit rate. They also recently introduced request coalescing, where requests by different users for the same file are combined into one request. Request coalescing works well for streaming content or large objects.

AWS CloudFront

AWS CloudFront uses caching to reduce the load of requests to your origin store. When a user visits your website, AWS CloudFront directs them to the closest edge cache so they can view content without any wait. You can configure AWS CloudFront’s cache settings using the backend interface.

Our Take

Caching is one of bunny.net’s strongest points of differentiation, primarily around static content. They also offer dynamic caching with one-click configuration by query string, cookie, and state cache, as well as cache chunking for video delivery. With Perma-Cache and request coalescing, their dynamic caching capabilities are improving, too.

Compression

Compressing files makes them smaller, which saves space and makes them load faster. Many CDNs allow compression to maximize your server space and decrease page load times. The two services are on par with each other when it comes to compression.

bunny.net

The bunny.net system automatically optimizes/compresses images and minifies CSS and JavaScript files to improve performance. Images are compressed by roughly 80%, improving load times by up to 90%. bunny.net supports both .gzip and .br (Brotli) compression formats. The bunny.net optimizer can compress images and optimize files on the fly.

AWS CloudFront

AWS CloudFront allows you to compress certain file types automatically and use them as compressed objects. The service supports both .gzip and .br compression formats.
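
If you want to verify what a CDN is actually serving, a quick spot check from a script works. Here’s a hedged sketch (Node 18+ outside a browser; the URL is a placeholder, and some runtimes decode the body transparently and may not expose the header):

```typescript
// Ask for Brotli first, then gzip, and inspect what came back.
const res = await fetch("https://cdn.example.com/styles.css", {
  headers: { "Accept-Encoding": "br, gzip" },
});

// "br" means Brotli, "gzip" means gzip, null means no compression reported.
console.log(res.headers.get("content-encoding"));
```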

DDoS Protection

Distributed denial of service (DDoS) attacks can overwhelm a website or app with too much traffic, causing it to crash and interrupting legitimate traffic. CDNs can help prevent DDoS attacks.

bunny.net

bunny.net stops DDoS attacks via a layered DDoS protection system that stops both network and HTTP layer attacks. Additionally, a number of checks and balances—like download speed limits, connection counts for IP addresses, burst requests, and geoblocking—can be configured. You can hide IP addresses and use edge rules to block requests.

AWS CloudFront

AWS CloudFront uses security technology called AWS Shield designed to prevent DDoS and other types of attacks.

Our Take

As an independent, specialized CDN service, bunny.net has put most of their focus on being a standout when it comes to core CDN tasks like caching static content. That’s not to say that their security services are lacking, but just that their security capabilities are sufficient to meet most users’ needs. AWS Shield is a specialized DDoS protection software, so it is more robust. However, that robustness comes at an added cost.

Integrations

Integrations allow you to customize a product or service using add-ons or APIs to extend the original functionality. One popular tool we’ll highlight here is Terraform, a tool that allows you to provision infrastructure as code (IaC).

Terraform

HashiCorp’s Terraform is a third-party program that allows you to manage your CDN, store source code in repositories like GitHub, track each version, and even roll back to an older version if needed. You can use Terraform to configure bunny.net CDN pull zones only. You can use Terraform with AWS CloudFront by editing configuration files and installing Terraform on your local machine.

TLS Protocols

Transport Layer Security (TLS), formerly known as secure sockets layer (SSL), is an encryption protocol used to protect website data. Whenever you see the lock icon in your internet browser, you are using a website that is protected by TLS (HTTPS). Both services conform adequately to TLS standards.

bunny.net offers customers free TLS with its CDN service. They make setting it up a breeze (two clicks) in the backend of your account. You also have the option of installing your own TLS certificate, and they provide helpful step-by-step instructions on how to install it.

Because AWS CloudFront assigns a unique URL for your CDN content, you can use the default TLS certificate installed on the server, an Amazon-issued certificate, or your own TLS certificate. If you use your own, you should consult the explicit instructions for key length and install it correctly.

CORS Support

Cross-origin resource sharing (CORS) is a mechanism that allows your internet browser to deliver content from different sources seamlessly on a single webpage or app. Default security settings normally reject items that come from a different origin, so the browser may block that content. CORS is a security exception that allows you to host various types of content on other servers and deliver it to your users without any errors.

bunny.net and AWS CloudFront both offer customers CORS support through configurable CORS headers. Using CORS, you can host images, scripts, style sheets, and other content in different locations without any issues.
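
Under the hood, “configurable CORS headers” boils down to the edge checking the request’s Origin against an allowlist and attaching the Access-Control-* response headers on a match. Here’s a minimal sketch of that logic (the allowlist is illustrative, and this isn’t either vendor’s configuration format):

```typescript
// Origins allowed to read our assets cross-origin (illustrative values).
const ALLOWED_ORIGINS = new Set(["https://www.acme.com", "https://acme.com"]);

function corsHeaders(requestOrigin: string | null): Record<string, string> {
  if (!requestOrigin || !ALLOWED_ORIGINS.has(requestOrigin)) {
    return {}; // no CORS headers: the browser blocks cross-origin reads
  }
  return {
    "Access-Control-Allow-Origin": requestOrigin, // echo the approved origin
    "Access-Control-Allow-Methods": "GET, HEAD",
    "Access-Control-Max-Age": "3600", // let browsers cache the preflight
  };
}
```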

Signed Exchange Support

Signed exchange (SXG) is a service that allows search engines to find and serve cached pages to users in place of the original content. SXG speeds up performance and improves SEO in the process. The service uses cryptography to authenticate the origin of digital assets.

Both bunny.net and AWS CloudFront support SXG. bunny.net supports signed exchange through its token authentication system. The service allows you to enable, configure, and generate tokens and assign them an expiration date so they stop working when you want.

AWS CloudFront supports SXG through its security settings. When configuring your settings, you can choose which cipher to use to verify the origin of the content.

Pricing

bunny.net

bunny.net offers simple, affordable, region-based pricing starting at $0.01/GB in the U.S. For high-bandwidth projects, their volume pricing starts at $0.005/GB for the first 500TB.

AWS CloudFront

AWS CloudFront offers a free plan, including 1TB of data transfer out, 10,000,000 HTTP or HTTPS requests, and 2,000,000 function invocations each month.

AWS CloudFront’s paid service is tiered based on bandwidth usage. AWS CloudFront’s pricing starts at $0.085 per GB up to 10TB in North America. All told, there are seven pricing tiers from 10TB to >5PB. If you stay within the AWS ecosystem, data transfer from Amazon S3, their object storage service, is free; however, you’ll be charged to transfer data outside of AWS. Each tier is priced by location/country.

Our Take

bunny.net is probably one of the most cost-effective CDNs on the market. For example, their traffic pricing for 5TB in Europe or North America is $50, compared to $425 with CloudFront. There are no request fees; you only pay for the bandwidth you actually use. All of their features are included without extra charges. And finally, egress is free between bunny.net and Backblaze B2, if you choose to pair the two services.

Our Final Take

bunny.net’s key advantages are its simplicity, pricing, and customer support. Many of the above features are configured in one click, giving you advanced capabilities without the headache of trying to figure out complicated provisioning. Their pricing is straightforward and affordable. And, not for nothing, they also offer one-to-one, round-the-clock customer support. If it’s important to you to be able to speak with an expert when you need to, bunny.net is the better choice.

AWS CloudFront offers more robust features, like advanced security services, but those services come with a price tag and you’re on your own when it comes to setting them up properly. AWS also prefers customers to stay within the AWS ecosystem, so using any third-party services outside of AWS can be costly.

If you’re looking for an agnostic, specialized, affordable CDN, bunny.net would be a great fit. If you need more advanced features and have the time, know-how, and money to make them work for you, AWS CloudFront offers those.

CDNs and Cloud Storage

A CDN can boost the speed of your website pages and apps. However, you still need reliable, affordable application storage for the cache to pull from. Pairing robust application storage with a speedy CDN is the perfect solution for improved performance, security, and scalability.

The post AWS CloudFront vs. bunny.net: How Do the CDNs Compare? appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Fastly vs. AWS CloudFront: How Do the CDNs Stack Up?

Post Syndicated from Molly Clancy original https://www.backblaze.com/blog/fastly-vs-aws-cloudfront-how-do-the-cdns-stack-up/

As a leading specialized cloud platform for application storage, we work with a variety of content delivery network (CDN) providers. From this perch, we get to see the specifics on how each operates. Today, we’re sharing those learnings with you by comparing Fastly and AWS CloudFront to help you understand your options when it comes to choosing a CDN. 

A Guide to CDNs

This article is the first in a series on all things CDN. We’ll cover how to decide which CDN is best for you, how to decipher pricing, and how to use a video CDN with cloud storage.

If there’s anything you’d like to hear more about when it comes to CDNs, let us know in the comments.

What Is a CDN?

If you run a website or a digital app, you need to ensure that you are delivering your content to your audience as quickly and efficiently as possible to beat out the competition. One way to do this is by using a CDN. A CDN caches all your digital assets like videos, images, scripts, style sheets, apps, etc. Then, whenever a user accesses your content, the CDN connects them with the closest server so that your items load quickly and without any issues. Many CDNs have servers around the globe to offer low-latency data access and drastically improve the responsiveness of your app through caching.

Before you choose a CDN, you need to consider your options. There are dozens of CDNs to choose from, and they all have benefits and drawbacks. Let’s compare Fastly with AWS CloudFront to see which works best for you.

CDN Use Cases

Before we compare these two CDNs, it’s important to understand how they might fit into your overall tech stack. Here are some everyday use cases for a CDN:

  • Websites: If you have a video- or image-heavy website, you will want to use a CDN to deliver all your content without any delays for your visitors.
  • Web Applications: A CDN can help optimize your dynamic content and allow your web apps to run flawlessly, regardless of where your users access them.
  • Streaming Video: Customers expect more from companies these days and will not put up with buffering or intermittent video streaming issues. If you host a video streaming service like Hulu, Netflix, Kanopy, or Amazon, a CDN can solve these problems. You can host high-resolution (8K) video on your CDN and then stream it to your users, offering them a smooth, gapless streaming experience.
  • Gaming: If you are a “Call of Duty” or “Halo” fan, you know that most video games use high-resolution images and video to provide the most immersive gaming experience possible. Video game providers use CDNs to ensure responsive gameplay without any blips. You can also use a CDN to streamline rolling out critical patches or updates to all your customers without any limits.
  • E-Commerce Applications: Online retailers typically use dozens of images to showcase their products. If you want to use high-quality images, your website could suffer slow page loads unless you use a CDN to deliver all your photos instantly without any wait.

Need for Speed (Test)

Website developers and owners use speed tests to gauge page load speeds and other aspects affecting the user experience. A CDN is one way to improve your website metrics. You can use various online speed tests that show details like load time, time to first byte (TTFB), and the number of requests (how many times the browser must make the request before the page loads).

A CDN can help improve performance quite a bit, but speed tests are dependent on many factors outside of a CDN. To find out exactly how well your site performs, you can use any of the dozens of reputable speed test tools online to evaluate your site, and then make improvements from there.

Comparing Fastly vs. AWS CloudFront

Fastly, founded in 2011, has rapidly grown to be a competitive global edge cloud platform and CDN offering international customers a wide variety of products and services. The company’s flagship product is its CDN which offers nearly instant content delivery for companies like The New York Times, Reddit, and Pinterest.

AWS CloudFront is Amazon Web Service’s (AWS) CDN offering. It’s tightly integrated with other AWS products.

To best understand how the two CDNs compare, we’ll look at different aspects of each one so you can decide which option works best for you, including:

  • Network
  • Caching
  • DDoS Protection
  • Log streaming
  • Integrations
  • TLS Protocols
  • Pricing

Network

CDN networks are made up of distribution points, which are network connections (servers) that allow a CDN to deliver content instantly to users anywhere.

Fastly

Fastly’s network is built fundamentally differently than a legacy CDN. Rather than a wide-ranging network populated with many points of presence (PoPs), Fastly built a stronger network based on fewer, more powerful, and strategically placed PoPs. Fastly promises 233Tbps of connected global capacity with its system of PoPs (as of 9/30/2022).

AWS CloudFront

AWS CloudFront doesn’t share specific capacity figures in terms of terabits per second (Tbps). They keep that claim somewhat vague, advertising “hundreds of terabits of deployed capacity.” But they do advertise that they have roughly 450 distribution points in 90 cities in 48 countries.

Our Take

At first glance, it might seem like more PoPs means a faster, more robust network. Fastly uses a useful metaphor to explain why that’s not true. They compare legacy PoPs to convenience stores—they’re everywhere, but they’re small, meaning that the content your users are requesting may not be there when they need it. Fastly’s PoPs are more like supermarkets—you have a better chance of getting everything you need (your cached content) in one place. It only takes a few milliseconds to get to one of Fastly’s PoPs nowadays (as opposed to when legacy providers like AWS CloudFront built their networks), and it’s much more likely that the content you need is already housed in that PoP, instead of needing to be called up from origin storage.

Caching

Caching reduces the number of direct requests to your origin server. A CDN acts as a middleman responding to requests for content on your behalf and directing users to edge caches nearest to the user. When a user calls up your website, the CDN serves up a cached version located on the server closest to them. This feature drastically improves the speed and performance of your website.

Fastly

Fastly’s caching is built around calculating Time to Live (TTL): the maximum time Fastly will use cached content to answer requests before returning to your origin server. You can set various cache settings, like purging objects, conditional caching, and assigning different TTLs for cached content, through Fastly’s API.
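
For a sense of how those TTLs are typically set, here’s a hedged sketch of an origin response in a Workers-style runtime. Fastly honors the Surrogate-Control header for its caches while browsers follow Cache-Control; the values are illustrative, not recommendations:

```typescript
// Hint different lifetimes to the CDN cache and to end users' browsers.
function imageResponse(body: BodyInit): Response {
  return new Response(body, {
    headers: {
      "Cache-Control": "public, max-age=300", // browsers: revalidate after 5 minutes
      "Surrogate-Control": "max-age=86400",   // CDN cache: keep for up to a day
    },
  });
}
```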

Fastly shows its average cache hit ratio live on its website, which is over 91% at the time of publication. This is the ratio of how many content requests the CDN is able to fill from the cache versus the total number of requests.

Fastly also allows you to automatically compress some file types with gzip and then cache them. You can modify these settings from inside Fastly’s web interface. The service also supports Brotli data compression, which became generally available on February 7, 2023.

AWS CloudFront

AWS CloudFront routes requests for your content to servers holding a cached version, lessening the burden on your origin container. When users visit your site, the CDN directs them to the closest edge cache for instantaneous page loads. You can change your cache settings in AWS CloudFront’s backend. AWS CloudFront supports compressed files and allows you to store and access gzip and Brotli compressed objects.

Our Take

Fastly does not charge a fee no matter how many times content is purged from the cache, while AWS CloudFront does. And, Fastly can invalidate content in 150 milliseconds, while AWS CloudFront can be 60–120 times slower. Both of these aspects make Fastly better with dynamic content that changes quickly for customers, such as news outlets, social media sites, and e-commerce sites.

DDoS Protection

Distributed denial of service (DDoS) attacks are a serious concern for website and web app owners. A typical attack can interrupt website traffic or crash it completely, making it impossible for your customers to reach you.

Fastly

Fastly relies on its 233Tbps+ (as of 9/30/2022) of globally-distributed network capacity to absorb any DDoS attacks, so they don’t affect customers’ origin content. They also use sophisticated filtering technology to remove malicious requests at the edge before they get close to your origin.

AWS CloudFront

AWS CloudFront is backed by comprehensive security technology designed to prevent DDoS and other types of attacks. Amazon calls its DDoS protection service AWS Shield.

Our Take

Fastly’s next gen web application firewall (WAF) filters traffic accurately out of the box. More than 90% of their customers use the WAF in active, full blocking mode, whereas across the industry, only 57% of customers use their WAF in full blocking mode. Other WAFs require more fine-tuning and advanced rule setting to be as effective as Fastly’s. Fastly’s WAF can also be deployed anywhere—at the edge, on-premises, or both—whereas most AWS instances are cloud hosted.

Log Streaming

Log streaming enables you to collect logs from your CDN and forward them to specific destinations. They help customers stay on top of up-to-date information about what’s happening within the CDN, including detecting security anomalies.

Fastly

Fastly allows for near real-time visibility into delivery performance with real-time logs. Logs can be sent to 29 endpoints, including popular third-party services like Datadog, Sumo Logic, Splunk, and others where they can be monitored.

AWS CloudFront

AWS CloudFront real-time logs are integrated with Amazon Kinesis Data Streams to enable delivery using Amazon Kinesis Data Firehose. Kinesis Data Firehose can then deliver logs to Amazon S3, Amazon Redshift, Amazon Elasticsearch Service, as well as service providers like Datadog, New Relic, and Splunk. AWS charges for real-time logs in addition to charging for Kinesis Data Streams.

Our Take

More visibility into your data is always better, and Fastly’s free real-time log streaming is the clear winner here with more choice of endpoints, allowing customers to use the specialized third-party services they prefer. AWS encourages staying within the AWS ecosystem and penalizes customers for not using AWS services, namely their S3 object storage.

Integrations

Integrations allow you to extend a product or service’s functionality through add-ons. With your CDN, you might want to enhance it with a different interface or add on new features the original doesn’t include. One popular tool we’ll highlight here is Terraform, a tool that allows you to provision infrastructure as code (IaC).

Terraform

Both Fastly and AWS CloudFront support Terraform. Fastly has detailed instructions on its website about how to set this up and configure it to work seamlessly with the service.

Amazon’s AWS CloudFront allows you to integrate with Terraform by installing the program on your local machine and configuring it within AWS CloudFront’s configuration files.

The Drawbacks of a Closed Ecosystem

It’s important to note that AWS CloudFront, as an AWS product, works best with other AWS products, and doesn’t exactly play nice with competitor products. As an independent cloud services provider, Fastly is vendor agnostic and works with many other cloud providers, including AWS’s other products and Backblaze.

TLS (Transport Layer Security) Protocols

TLS, or transport layer security (formerly known as secure sockets layer, or SSL), is an encryption protocol used to protect website data. Whenever you see the lock icon in your internet browser, you are using a website that is protected by TLS (HTTPS).

Fastly assigns a shared domain name to your CDN content. You can use the associated TLS certificate for free or bring your own TLS certificate and install it. Fastly offers detailed instructions and help guides so you can securely configure your content.

Amazon’s AWS CloudFront also assigns a unique URL for your CDN content. You can use an Amazon-issued certificate, the default TLS certificate installed on the server, or your own TLS certificate. If you use your own, you must follow the explicit instructions for key length and install it correctly on the server.

Pricing

Fastly

Fastly offers a free trial which includes $50 of traffic with pay-as-you-go bandwidth pricing after that. Bandwidth pricing is based on geographic location and starts at, for example, $0.12 per GB for the first 10TB for North America. The next 10TB is $0.08 per GB, and they charge $0.0075 per 10,000 requests. Fastly also offers tiered capacity-based pricing for edge cloud services, starting with its Essential product for small businesses, which includes 3TB of global delivery per month. Their Professional tier includes 10TB of global delivery per month, and their Enterprise tier is unlimited. They also offer add-on products for security and distributed applications.

AWS CloudFront

AWS CloudFront offers a free plan including 1TB of data transfer out, 10,000,000 HTTP or HTTPS requests, and 2,000,000 function invocations each month. However, customers needing more than the basic plan will have to consider the tiered pricing based on bandwidth usage. AWS CloudFront’s pricing starts at $0.085 per GB up to 10TB in North America. All told, there are seven pricing tiers from 10TB to >5PB.

Our Take

When it comes to content delivery, AWS CloudFront can’t compete on cost. Not only that, but Fastly’s pay-as-you-go pricing model with only two tiers is simpler than AWS CloudFront’s pricing with seven tiers. As with many AWS products, complexity demands configuration and management time. Customers tend to spend less time getting Fastly to work the way they want it to. With AWS CloudFront, customers also run the risk of getting locked into the AWS ecosystem.

Our Final Take

Between the two CDNs, Fastly is the better choice for customers that rely on managing and serving dynamic content without paying high fees to create personalized experiences for their end users. Fastly wins over AWS CloudFront on a few key points:

  • More price competitive for content delivery
  • Simpler pricing tiers
  • Vendor agnostic
  • Better caching
  • Easier image optimization
  • Real-time log streaming
  • More expensive, but better performing out-of-the-box WAF

Using a CDN with Cloud Storage

A CDN can greatly speed up your website load times, but there will still be times when a request will call the origin store. Having reliable and affordable origin storage is key when the cache doesn’t have the content stored. When you pair a CDN with origin storage in the cloud, you get the benefit of both scalability and speed.

The post Fastly vs. AWS CloudFront: How Do the CDNs Stack Up? appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

How to Serve Data From a Private Bucket with a Cloudflare Worker

Post Syndicated from Pat Patterson original https://www.backblaze.com/blog/how-to-serve-data-from-a-private-bucket-with-a-cloudflare-worker/

Customers storing data in Backblaze B2 Cloud Storage enjoy zero-cost downloads via our Content Delivery Network (CDN) partners: Cloudflare, Fastly, and bunny.net. Configuring a CDN to proxy access to a Backblaze B2 Bucket is straightforward and improves the user experience, since the CDN caches data close to end-users. Ensuring that end-users can only access content via the CDN, and not directly from the bucket, requires a little more effort. A new technical article, Cloudflare Workers for Backblaze B2, provides the steps to serve content from Backblaze B2 via your own Cloudflare Worker.

In this blog post, I’ll explain why you might want to prevent direct downloads from your Backblaze B2 Bucket, and how you can use a Cloudflare Worker to do so.

Why Prevent Direct Downloads?

As mentioned above, Backblaze’s partnerships with CDN providers allow our customers to deliver content to end users with zero costs for data egress from Backblaze to the CDN. To illustrate why you might want to serve data to your end users exclusively through the CDN, let’s imagine you’re creating a website, storing your website’s images in a Backblaze B2 Bucket with public-read access, acme-images.

For the initial version, you build web pages with direct links to the images of the form https://acme-images.s3.us-west-001.backblazeb2.com/logos/acme.png. As users browse your site, their browsers will download images directly from Backblaze B2. Everything works just fine for users near the Backblaze data center hosting your bucket, but the further a user is from that data center, the longer it will take each image to appear on screen. No matter how fast the network connection, there’s no getting around the speed of light!

Aside from the degraded user experience, there are costs associated with end users downloading data directly from Backblaze. The first GB of data downloaded each day is free, then we charge $0.01 for each subsequent GB. Depending on your provider’s pricing plan, adding a CDN to your architecture can both reduce download costs and improve the user experience, as the CDN will transfer data through its own network and cache content close to end users. Another detail to note when comparing costs is that Backblaze and Cloudflare’s Bandwidth Alliance means that data flows from Backblaze to Cloudflare free of download charges, unlike data flowing from, for example, Amazon S3 to Cloudflare.
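
To put a number on it: if end users pull 101GB a day directly from the bucket, the first gigabyte is free and the remaining 100GB costs 100 × $0.01 = $1.00 per day, or roughly $30 a month for fairly modest traffic, on top of the slower experience for far-flung users.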

Typically, you need to set up a custom domain, say images.acme.com, that resolves to an IP address at the CDN. You then configure one or more origin servers or backends at the CDN with your Backblaze B2 Buckets’ S3 endpoints. In this example, we’ll use a single bucket, with endpoint acme-images.s3.us-west-001.backblazeb2.com, but you might use Cloud Replication to replicate content between buckets in multiple regions for greater resilience.

Now, after you update the image links in your web pages to the form https://images.acme.com/logos/acme.png, your users will enjoy an improved experience, and your operating costs will be reduced.

As you might have guessed, however, there is one chink in the armor. Clients can still download images directly from the Backblaze B2 Bucket, incurring charges on your Backblaze account. For example, users might have bookmarked or shared links to images in the bucket, or browsers or web crawlers might have cached those links.

The solution is to make the bucket private and create an edge function: a small piece of code running on the CDN infrastructure at the images.acme.com endpoint, with the ability to securely access the bucket.

Both Cloudflare and Fastly offer edge computing platforms; in this blog post, I’ll focus on Cloudflare Workers and cover Fastly Compute@Edge at a later date.

Proxying Backblaze B2 Downloads With a Cloudflare Worker

The blog post Use a Cloudflare Worker to Send Notifications on Backblaze B2 Events provides a brief introduction to Cloudflare Workers; here I’ll focus on how the Worker accesses the Backblaze B2 Bucket.

API clients, such as Workers, downloading data from a private Backblaze B2 Bucket via the Backblaze S3 Compatible API must digitally sign each request with a Backblaze Application Key ID (access key ID in AWS parlance) and Application Key (secret access key). On receiving a signed request, the Backblaze B2 service verifies the identity of the sender (authentication) and that the request was not changed in transit (integrity) before returning the requested data.

So when the Worker receives an unsigned HTTP request from an end user’s browser, it must sign it, forward it to Backblaze B2, and return the response to the browser. Here are the steps in more detail:

  1. A user views a web page in their browser.
  2. The user’s browser requests an image from the Cloudflare Worker.
  3. The Worker makes a copy of the incoming request, changing the target host in the copy to the bucket endpoint, and signs the copy with its application key and key ID.
  4. The Worker sends the signed request to Backblaze B2.
  5. Backblaze B2 validates the signature, and processes the request.
  6. Backblaze B2 returns the image to the Worker.
  7. The Worker forwards the image to the user’s browser.

These steps are illustrated in the diagram below.

The signing process imposes minimal overhead, since GET requests have no payload. The Worker need not even read the incoming response payload into memory, instead returning the response from Backblaze B2 to the Cloudflare Workers framework to be streamed directly to the user’s browser.
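
To make the flow concrete, here’s a hedged sketch of such a Worker built on the open-source aws4fetch library; the environment bindings and region are illustrative, and the technical article mentioned below contains the full walkthrough:

```typescript
import { AwsClient } from "aws4fetch";

// Illustrative Worker bindings; configure these as secrets/variables.
interface Env {
  B2_APPLICATION_KEY_ID: string;
  B2_APPLICATION_KEY: string;
  B2_BUCKET_ENDPOINT: string; // e.g., your bucket's S3 endpoint hostname
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // The Application Key ID/Key play the role of the S3 access key pair.
    const client = new AwsClient({
      accessKeyId: env.B2_APPLICATION_KEY_ID,
      secretAccessKey: env.B2_APPLICATION_KEY,
      service: "s3",
      region: "us-west-001", // illustrative; match your bucket's region
    });

    // Step 3: copy the incoming URL, retargeting it at the bucket endpoint.
    const url = new URL(request.url);
    url.hostname = env.B2_BUCKET_ENDPOINT;

    // Steps 3–4: produce a SigV4-signed request and send it to Backblaze B2.
    const signed = await client.sign(url.toString(), { method: "GET" });

    // Steps 6–7: stream Backblaze B2's response straight back to the browser.
    return fetch(signed);
  },
};
```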

Now you understand the use case, head over to our newly published technical article, Cloudflare Workers for Backblaze B2, and follow the steps to serve content from Backblaze B2 via your own Cloudflare Worker.

Put the Proxy to Work!

The Cloudflare Worker for Backblaze B2 can be used as-is to ensure that clients download files from one or more Backblaze B2 Buckets via Cloudflare, rather than directly from Backblaze B2. At the same time, it can be readily adapted for different requirements. For example, the Worker could verify that clients pass a shared secret in an HTTP header, or route requests to buckets in different data centers depending on the location of the edge server. The possibilities are endless.

How will you put the Cloudflare Worker for Backblaze B2 to work? Sign up for a Backblaze B2 account and get started!

The post How to Serve Data From a Private Bucket with a Cloudflare Worker appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Reduce origin load, save on cloud egress fees, and maximize cache hits with Cache Reserve

Post Syndicated from Alex Krivit original https://blog.cloudflare.com/cache-reserve-open-beta/

Earlier this year, we introduced Cache Reserve. Cache Reserve helps users serve content from Cloudflare’s cache for longer by using R2’s persistent data storage. Serving content from Cloudflare’s cache benefits website operators by reducing their bills for egress fees from origins, while also benefiting website visitors by having content load faster.

Cache Reserve has been in closed beta for a few months while we’ve collected feedback from our initial users and continued to develop the product. After several rounds of iterating on this feedback, today we’re extremely excited to announce that Cache Reserve is graduating to open beta – users will now be able to test it and integrate it into their content delivery strategy without any additional waiting.

If you want to see the benefits of Cache Reserve for yourself and give us some feedback, you can go to the Cloudflare dashboard, navigate to the Caching section, and enable Cache Reserve by pushing one button.

How does Cache Reserve fit into the larger picture?

Content served from Cloudflare’s cache begins its journey at an origin server, where the content is hosted. When a request reaches the origin, the origin compiles the content needed for the response and sends it back to the visitor.

The distance between the visitor and the origin can affect the performance of the asset as it may travel a long distance for the response. This is also where the user is charged a fee to move the content from where it’s stored on the origin to the visitor requesting the content. These fees, known as “bandwidth” or “egress” fees, are familiar monthly line items on the invoices for users that host their content on cloud providers.

Cloudflare’s CDN sits between the origin and visitor and evaluates the origin’s response to see if it can be cached. If it can be added to Cloudflare’s cache, then the next time a request comes in for that content, Cloudflare can respond with the cached asset, which means there’s no need to send the request to the origin– reducing egress fees for our customers. We also cache content in data centers close to the visitor to improve the performance and cut down on the transit time for a response.

To help assets remain cached for longer, a few years ago we introduced Tiered Cache which organizes all of our 250+ global data centers into a hierarchy of lower-tiers (generally closer to visitors) and upper-tiers (generally closer to origins). When a request for content cannot be served from a lower-tier’s cache, the upper-tier is checked before going to the origin for a fresh copy of the content. Organizing our data centers into tiers helps us cache content in the right places for longer by putting multiple caches between the visitor’s request and the origin.

Why do cache misses occur?

Misses occur when Cloudflare cannot serve the content from cache and must go back to the origin to retrieve a fresh copy. This can happen when a customer sets the cache-control time to signify when the content is out of date (stale) and needs to be revalidated. The other element at play – how long the network wants content to remain cached – is a bit more complicated and can fluctuate depending on eviction criteria.

CDNs must consider whether they need to evict content early to optimize storage of other assets when cache space is full. At Cloudflare, we prioritize eviction based on how recently a piece of cached content was requested by using an algorithm called “least recently used” or LRU. This means that even if cache-control signifies that a piece of content should be cached for many days, we may still need to evict it earlier (if it is least-requested in that cache) to cache more popular content.
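
To make the eviction idea concrete, here’s a toy LRU cache in TypeScript. It illustrates the policy described above, not Cloudflare’s actual implementation:

```typescript
// A toy LRU cache: when full, evict whichever entry was used least recently.
class LruCache<K, V> {
  // Map preserves insertion order, so the first key is the least recently used.
  private entries = new Map<K, V>();

  constructor(private capacity: number) {}

  get(key: K): V | undefined {
    const value = this.entries.get(key);
    if (value !== undefined) {
      // Re-insert to mark this key as most recently used.
      this.entries.delete(key);
      this.entries.set(key, value);
    }
    return value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key); // refresh position if the key already exists
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // Evict the least recently used entry, even if its TTL hasn't expired.
      const lru = this.entries.keys().next().value as K;
      this.entries.delete(lru);
    }
  }
}
```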

This works well for most customers and website visitors, but is often a point of confusion for people wondering why content is unexpectedly displaying a miss. If eviction did not happen then content would need to be cached in data centers that were further away from visitors requesting that data, harming the performance of the asset and injecting inefficiencies into how Cloudflare’s network operates.

Some customers, however, have large libraries of content that may not be requested for long periods of time. Using the traditional cache, these assets would likely be evicted and, if requested again, served from the origin. Keeping assets in cache requires that they remain popular on the Internet which is hard given what’s popular or current is constantly changing. Evicting content that becomes cold means additional origin egress for the customer if that content needs to be pulled repeatedly from the origin.

Enter Cache Reserve
This is where Cache Reserve shines. Cache Reserve serves as the ultimate upper-tier data center for content that might otherwise be evicted from cache. Once admitted to Cache Reserve, content can be stored for a much longer period of time: 30 days by default. If another request comes in during that period, it can be extended for another 30 days (and so on) or until cache-control signifies that we should no longer serve that content from cache. Cache Reserve serves as a safety net to backstop all cacheable content, so customers don’t have to worry about unwanted cache eviction and origin egress fees.

How does Cache Reserve save egress?

The promise of Cache Reserve is that hit ratios will increase and egress fees from origins will decrease for long tail content that is rarely requested and may be evicted from cache.

However, there are additional egress savings built into the product. For example, objects are written to Cache Reserve on misses. This means that when we fetch content from the origin on a cache miss, we use it to respond to the request while simultaneously writing the asset to Cache Reserve, so customers won’t incur origin egress for serving that asset for a long time.

Cache Reserve is designed to be used with tiered cache enabled for maximum origin shielding. When there is a cache miss in both the lower and upper tiers, Cache Reserve is checked and if there is a hit, the response will be cached in both the lower and upper tier on its way back to the visitor without the origin needing to see the request or serve any additional data.
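
Putting the last few paragraphs together, the read path can be sketched roughly as follows. The tier maps and function names here are hypothetical simplifications of the flow described above, not Cloudflare internals.

// Hypothetical sketch of the lookup order described above.
type Asset = { body: Uint8Array };

async function serve(
  key: string,
  lowerTier: Map<string, Asset>,
  upperTier: Map<string, Asset>,
  cacheReserve: Map<string, Asset>,
  fetchFromOrigin: (key: string) => Promise<Asset>,
): Promise<Asset> {
  const cached = lowerTier.get(key) ?? upperTier.get(key);
  if (cached) return cached;

  const reserved = cacheReserve.get(key);
  if (reserved) {
    // Hit in Cache Reserve: refill both tiers on the way back to the
    // visitor; the origin never sees the request or serves any data.
    upperTier.set(key, reserved);
    lowerTier.set(key, reserved);
    return reserved;
  }

  // Miss everywhere: fetch once from the origin and also write the
  // response to Cache Reserve so future requests stay off the origin.
  const fresh = await fetchFromOrigin(key);
  cacheReserve.set(key, fresh);
  upperTier.set(key, fresh);
  lowerTier.set(key, fresh);
  return fresh;
}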

Cache Reserve accomplishes these origin egress savings for a low price, based on R2 costs. For more information on Cache Reserve prices and operations, please see the documentation here.

Scaling Cache Reserve on Cloudflare’s developer platform

When we first announced Cache Reserve, the response was overwhelming. Over 20,000 users wanted access to the beta, and we quickly made several interesting discoveries about how people wanted to use Cache Reserve.

The first big challenge we found was that users hated egress fees as much as we do and wanted to make sure that as much content as possible was in Cache Reserve. During the closed beta we saw usage above 8,000 PUT operations per second sustained, and objects served at a rate of over 3,000 GETs per second. We were also caching around 600TB for some of our large customers. We knew that we wanted to open the product up to anyone who wanted to use it, and in order to scale to meet this demand we needed to make several changes quickly. So we turned to Cloudflare’s developer platform.

Cache Reserve stores data on R2 using its S3-compatible API. Under the hood, R2 handles all the complexity of an object storage system using our performant and scalable developer primitives: Workers and Durable Objects. We decided to use developer platform tools because they would allow us to implement different scaling strategies quickly. The advantage of building on the Cloudflare developer platform is that we were easily able to experiment with how best to distribute the high load we were seeing, all while shielding users from the complexity of how Cache Reserve works.

With the single press of a button, Cache Reserve performs these functions:

  • On a cache miss, Pingora (our new L7 proxy) reaches out to the origin for the content and writes the response to R2. This happens while the content continues its trip back to the visitor (thereby avoiding needless latency).
  • Inside R2, a Worker writes the content to R2’s persistent data storage while also keeping track of the important metadata that Pingora sends about the object (like origin headers, freshness values, and retention information) using Durable Objects storage.
  • When the content is next requested, Pingora looks up where the data is stored in R2 by computing the cache key. The cache key’s hash determines both the object name in R2 and which bucket it was written to, as each zone’s assets are sharded across multiple buckets to distribute load (a simplified sketch of this follows the list).
  • Once found, Pingora attaches the relevant metadata and sends the content from R2 to the nearest upper-tier to be cached, then to the lower-tier and finally back to the visitor.
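
As a rough illustration of that sharding step, the sketch below hashes a cache key to derive both an object name and one of N buckets. The FNV-1a hash and the shard count are arbitrary choices for the example, not Cloudflare’s actual scheme.

// Illustrative only: derive an object name and a bucket shard from a
// cache key, so one zone's assets spread across multiple R2 buckets.
const BUCKETS_PER_ZONE = 8; // arbitrary shard count for the example

function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

function locate(cacheKey: string): { object: string; bucket: string } {
  const hash = fnv1a(cacheKey);
  return {
    object: hash.toString(16),                       // object name in R2
    bucket: `zone-shard-${hash % BUCKETS_PER_ZONE}`, // which bucket holds it
  };
}

// locate("example.com/images/hero.png") -> { object: "...", bucket: "zone-shard-3" }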

This is magic! None of the above needs to be managed by the user. By bringing together R2, Workers, Durable Objects, Pingora, and Tiered Cache we were able to quickly build and make changes to Cache Reserve to scale as needed…

What’s next for Cache Reserve

In addition to the work we’ve done to scale Cache Reserve, opening the product up paves the way for more features and integrations across Cloudflare. We plan on putting additional analytics and metrics in the hands of Cache Reserve users, so they know precisely what’s in Cache Reserve and how much egress it’s saving them. We also plan on building out more complex integrations with R2 so that customers who want to begin managing their storage are able to easily make that transition. Finally, we’re going to be looking into providing more options for customers to control precisely what is eligible for Cache Reserve. These features represent just the beginning of how customers will control and customize their cache on Cloudflare.

What’s some of the feedback been so far?

As a long-time Cloudflare customer, we were eager to deploy Cache Reserve to provide cost savings and improved performance for our end users. Ensuring our application always performs optimally for our global partners and delivery riders is a primary focus of Delivery Hero. With Cache Reserve, our cache hit ratio improved by 5%, enabling us to scale back our infrastructure and simplify what is needed to operate our global site, providing additional cost savings.
Wai Hang Tang, Director of Engineering at Delivery Hero

Anthology uses Cloudflare’s global cache to drastically improve the performance of content for our end users at schools and universities. By pushing a single button to enable Cache Reserve, we were able to provide a great experience for teachers and students and reduce our daily egress traffic by two-thirds.
Paul Pearcy, Senior Staff Engineer at Anthology

At Enjoei we’re always looking for ways to help make our end-user sites faster and more efficient. By using Cloudflare Cache Reserve, we were able to drastically improve our cache hit ratio by more than 10%, which reduced our origin egress costs. Cache Reserve also improved the performance for many of our merchants’ sites in South America, which improved their SEO and discoverability across the Internet (Google, Criteo, Facebook, TikTok), and it took no time to set it up.
Elomar Correia, Head of DevOps SRE | Enterprise Solutions Architect at Enjoei

In the live events industry, the size and demand for our cacheable content can be extremely volatile, which causes unpredictable swings in our egress fees. Additionally, keeping data as close to our users as possible is critical for customer experience in the high traffic and low bandwidth scenarios our products are used in, such as conventions and music festivals. Cache Reserve helps us mitigate both of these problems with minimal impact on our engineering teams, giving us more predictable costs and lower latency than existing solutions.
Jarrett Hawrylak, VP of Engineering | Enterprise Ticketing at Patron Technology

How can I use it today?

As of today, Cache Reserve is in open beta, meaning that it’s available to anyone who wants to use it.

To use Cache Reserve:

  • Simply go to the Caching tile in the dashboard.
  • Navigate to the Cache Reserve page and push the enable data sync button (or purchase button).

Enterprise Customers can work with their Cloudflare Account team to access Cache Reserve.

Customers can ensure Cache Reserve is working by looking at the baseline metrics regarding how much data is cached and how many operations we’ve seen in the Cache Reserve section of the dashboard. Specific requests served by Cache Reserve are available by using Logpush v2 and finding HTTP requests with the field “CacheReserveUsed.”
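
As a quick sketch of consuming those logs, the snippet below tallies Cache Reserve responses from a Logpush export. It assumes a newline-delimited JSON file with a boolean CacheReserveUsed field, as described above; the file name is a placeholder.

import { readFileSync } from "node:fs";

// Count how many logged HTTP requests were served by Cache Reserve.
// Assumes an NDJSON Logpush export containing a CacheReserveUsed field.
const lines = readFileSync("logpush-export.ndjson", "utf8")
  .split("\n")
  .filter((line) => line.length > 0);

const served = lines.filter(
  (line) => (JSON.parse(line) as { CacheReserveUsed?: boolean }).CacheReserveUsed,
).length;

console.log(`${served} of ${lines.length} requests served from Cache Reserve`);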

We will continue to make sure that we are quickly triaging the feedback you give us and making improvements to help ensure Cache Reserve is easy to use, massively beneficial, and your choice for reducing egress fees for cached content.

Try it out

We’ve been so excited to get Cache Reserve in more people’s hands. There will be more exciting developments to Cache Reserve as we continue to invest in giving you all the tools you need to build your perfect cache.

Try Cache Reserve today and let us know what you think.

Automatic (secure) transmission: taking the pain out of origin connection security

Post Syndicated from Alex Krivit original https://blog.cloudflare.com/securing-origin-connectivity/

In 2014, Cloudflare set out to encrypt the Internet by introducing Universal SSL. It made getting an SSL/TLS certificate free and easy at a time when doing so was neither free, nor easy. Overnight millions of websites had a secure connection between the user’s browser and Cloudflare.

But getting the connection encrypted from Cloudflare to the customer’s origin server was more complex. Since Cloudflare and all browsers supported SSL/TLS, the connection between the browser and Cloudflare could be instantly secured. But back in 2014 configuring an origin server with an SSL/TLS certificate was complex, expensive, and sometimes not even possible.

And so we relied on users to configure the best security level for their origin server. Later we added a service that detects and recommends the highest level of security for the connection between Cloudflare and the origin server. We also introduced free origin server certificates for customers who didn’t want to get a certificate elsewhere.

Today, we’re going even further. Cloudflare will shortly find the most secure connection possible to our customers’ origin servers and use it, automatically. Doing this correctly, at scale, while not breaking a customer’s service is very complicated. This blog post explains how we are automatically achieving the highest level of security possible for those customers who don’t want to spend time configuring their SSL/TLS setup manually.

Why configuring origin SSL automatically is so hard

When we announced Universal SSL, we knew the backend security of the connection between Cloudflare and the origin was a different and harder problem to solve.

In order to configure the tightest security, customers had to procure a certificate from a third party and upload it to their origin. Then they had to indicate to Cloudflare that we should use this certificate to verify the identity of the server while also indicating the connection security capabilities of their origin. This could be an expensive and tedious process. To help alleviate this high setup cost, in 2015 Cloudflare launched a beta Origin CA service in which we provided free limited-function certificates to customer origin servers. We also provided guidance on how to correctly configure and upload the certificates, so that secure connections between Cloudflare and a customer’s origin could be established quickly and easily.

What we discovered, though, is that while this service was useful to customers, it still required a lot of configuration. We didn’t see the same sweeping change we saw with Universal SSL because customers still had to fight with their origins in order to upload certificates and test to make sure that they had configured everything correctly. And when you throw things like load balancers into the mix or servers mapped to different subdomains, handling server-side SSL/TLS gets even more complicated.

Around the same time as that announcement, Let’s Encrypt and other services began offering certificates as a public CA for free, making TLS easier and paving the way for widespread adoption. Let’s Encrypt and Cloudflare had come to the same conclusion: by offering certificates for free, simplifying server configuration for the user, and working to streamline certificate renewal, they could make a tangible impact on the overall security of the web.

The announcements of free and easy to configure certificates correlated with an increase in attention on origin-facing security. Cloudflare customers began requesting more documentation to configure origin-facing certificates and SSL/TLS communication that were performant and intuitive. In response, in 2016 we announced the GA of origin certificate authority to provide cheap and easy origin certificates along with guidance on how to best configure backend security for any website.

The increased customer demand and attention helped pave the way for additional features that focused on backend security on Cloudflare. For example, authenticated origin pull ensures that only HTTPS requests from Cloudflare will receive a response from your origin, preventing the origin from responding to requests that come from outside of Cloudflare. Another option, Cloudflare Tunnel, can be set up to run on the origin servers, proactively establishing secure and private tunnels to the nearest Cloudflare data center. This configuration allows customers to completely lock down their origin servers to only receive requests routed through our network. For customers unable to lock down their origins using this method, we still encourage adopting the strongest possible security when configuring how Cloudflare should connect to an origin server.

Cloudflare currently offers five options for SSL/TLS configurability that we use when communicating with origins:

  • In Off mode, as you might expect, traffic from browsers to Cloudflare and from Cloudflare to origins is not encrypted and will use plain text HTTP.
  • In Flexible mode, traffic from browsers to Cloudflare can be encrypted via HTTPS, but traffic from Cloudflare to the site’s origin server is not. This is a common selection for origins that cannot support TLS, even though we recommend upgrading this origin configuration wherever possible. A guide for upgrading can be found here.
  • In Full mode, Cloudflare follows whatever is happening with the browser request and uses that same option to connect to the origin. For example, if the browser uses HTTP to connect to Cloudflare, we’ll establish a connection with the origin over HTTP. If the browser uses HTTPS, we’ll use HTTPS to communicate with the origin; however we will not validate the certificate on the origin to prove the identity and trustworthiness of the server.
  • In Full (strict) mode, traffic follows the same pattern as in Full mode; however, Full (strict) mode adds validation of the origin server’s certificate. The origin certificate can either be issued by a public CA like Let’s Encrypt or by Cloudflare Origin CA.
  • In Strict mode, traffic from the browser to Cloudflare that is HTTP or HTTPS will always be connected to the origin over HTTPS with a validation of the origin server’s certificate.
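
For reference, the zone-wide SSL/TLS mode can be changed through the Cloudflare v4 API’s zone settings endpoint, as in the sketch below. The zone ID and token are placeholders, and the values shown ("off", "flexible", "full", "strict") map onto the first four modes above; how the enterprise-only Strict mode is addressed via the API may differ.

// Sketch: set a zone's SSL/TLS mode via the Cloudflare v4 API.
// ZONE_ID and API_TOKEN are placeholders for your own values.
const ZONE_ID = "your-zone-id";
const API_TOKEN = "your-api-token";

async function setSslMode(mode: "off" | "flexible" | "full" | "strict") {
  const res = await fetch(
    `https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/settings/ssl`,
    {
      method: "PATCH",
      headers: {
        Authorization: `Bearer ${API_TOKEN}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ value: mode }),
    },
  );
  if (!res.ok) throw new Error(`SSL mode update failed: ${res.status}`);
}

await setSslMode("strict"); // Full (strict): always HTTPS to origin, certificate validated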

What we have found in a lot of cases is that when customers initially signed up for Cloudflare, the origin they were using could not support the most advanced versions of encryption, resulting in origin-facing communication using unencrypted HTTP. These default values persisted over time, even though the origins have since become more capable. We think the time is ripe to re-evaluate the entire concept of default SSL/TLS levels.

That’s why we will reduce the configuration burden for origin-facing security by automatically managing this on behalf of our customers. Cloudflare will provide a zero configuration option for how we will communicate with origins: we will simply look at an origin and use the most-secure option available to communicate with it.

Re-evaluating default SSL/TLS modes is only the beginning. Not only will we automatically upgrade sites to their best security setting, we will also open up all SSL/TLS modes to all plan levels. Historically, Strict mode was reserved for enterprise customers only. This was because we released this mode in 2014 when few people had origins that were able to communicate over SSL/TLS, and we were nervous about customers breaking their configurations. But this is 2022, and we think that Strict mode should be available to anyone who wants to use it. So we will be opening it up to everyone with the launch of the automatic upgrades.

How will automatic upgrading work?

To upgrade the origin-facing security of websites, we first need to determine the highest security level the origin can use. To make this determination, we will use the SSL/TLS Recommender tool that we released a year ago.

The recommender performs a series of requests from Cloudflare to the customer’s origin(s) to determine if the backend communication can be upgraded beyond what is currently configured. The recommender accomplishes this by:

  • Crawling the website to collect links on different pages of the site. For websites with large numbers of links, the recommender will only examine a subset. Similarly, for sites where the crawl turns up an insufficient number of links, we augment our results with a sample of links from recent visitor requests to the zone. All of this is to get a representative sample of where requests are going in order to know how responses are served from the origin.
  • The crawler uses the user agent Cloudflare-SSLDetector and has been added to Cloudflare’s list of known “good bots”.
  • Next, the recommender downloads the content of each link over both HTTP and HTTPS. The recommender makes only idempotent GET requests when scanning origin servers to avoid modifying server resource state.
  • Following this, the recommender runs a content similarity algorithm to determine if the content collected over HTTP and HTTPS matches.
  • If the content that is downloaded over HTTP matches the content downloaded over HTTPS, then it’s known that we can upgrade the security of the website without negative consequences.
  • If the website is already configured to Full mode, we will perform a certificate validation (without the additional need for crawling the site) to determine whether it can be updated to Full (strict) mode or higher.

If it can be determined that the customer’s origin is able to be upgraded without breaking, we will upgrade the origin-facing security automatically.
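
The core of that check can be sketched as: fetch the same sampled link over both schemes with idempotent GETs and compare the responses. The exact-equality comparison below is a deliberate stand-in for the production content similarity algorithm, and the function names are invented for the example.

// Simplified sketch of the recommender's core check: does a URL serve
// the same content over HTTP and HTTPS? Production uses a content
// similarity algorithm; byte equality here is a stand-in.
async function canUpgrade(host: string, path: string): Promise<boolean> {
  const [insecure, secure] = await Promise.all([
    fetch(`http://${host}${path}`, { method: "GET" }),
    fetch(`https://${host}${path}`, { method: "GET" }),
  ]);
  if (!secure.ok) return false; // origin can't serve this link over HTTPS
  const [a, b] = await Promise.all([insecure.text(), secure.text()]);
  return a === b; // matching content means upgrading is safe
}

// A site is upgradeable only if every sampled link passes the check.
async function siteUpgradeable(host: string, links: string[]): Promise<boolean> {
  const results = await Promise.all(links.map((l) => canUpgrade(host, l)));
  return results.every(Boolean);
}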

But that’s not all. Not only are we removing the configuration burden for services on Cloudflare, but we’re also providing more precise security settings by moving from per-zone SSL/TLS settings to per-origin SSL/TLS settings.

The current implementation of the backend SSL/TLS service is related to an entire website, which works well for those with a single origin. For those that have more complex setups however, this can mean that origin-facing security is defined by the lowest capable origin serving a part of the traffic for that service. For example, if a website uses img.example.com and api.example.com, and these subdomains are served by different origins that have different security capabilities, we would not want to limit the SSL/TLS capabilities of both subdomains to the least secure origin. By using our new service, we will be able to set per-origin security more precisely to allow us to maximize the security posture of each origin.

The goal of this is to maximize the origin-facing security of everything on Cloudflare. However, if any origin that we attempt to scan blocks the SSL recommender, is non-functional, or opts out of this service, we will not complete the scans and will not be able to upgrade security. Details on how to opt out will be provided via email announcements soon.

Opting out

There are a number of reasons why someone might want to configure a lower-than-optimal security setting for their website. One common reason customers provide is a fear that having higher security settings will negatively impact the performance of their site. Others may want to set a suboptimal security setting for testing purposes or to debug some behavior. Whatever the reason, we will provide the tools needed to continue to configure the SSL/TLS mode you want, even if that’s different from what we think is the best.

When is this going to happen?

We will begin to roll this change out before the end of the year. If you read this and want to make sure you’re at the highest level of backend security already, we recommend Full (strict) or Strict mode. If you prefer to wait for us to automatically upgrade your origin security for you, please keep your eyes peeled to your inbox for the date we will begin rolling out this change for your group.

At Cloudflare, we believe that the Internet needs to be secure and private. If you’d like to help us achieve that, we’re hiring across the engineering organization.

Introducing Cache Rules: precision caching at your fingertips

Post Syndicated from Alex Krivit original https://blog.cloudflare.com/introducing-cache-rules/

Ten years ago, in 2012, we released a product that put “a powerful new set of tools” in the hands of Cloudflare customers, allowing website owners to control how Cloudflare would cache, apply security controls, manipulate headers, implement redirects, and more on any page of their website. This product is called Page Rules and since its introduction, it has grown substantially in terms of popularity and functionality.

Page Rules are a common choice for customers that want to have fine-grained control over how Cloudflare should cache their content. There are more than 3.5 million caching Page Rules currently deployed that help websites customize their content. We have spent the last ten years learning how customers use those rules to cache content, and it’s clear the time is ripe for evolving rules-based caching on Cloudflare. This evolution will allow for greater flexibility in caching different types of content through additional rule configurability, while providing more visibility into when and how different rules interact across Cloudflare’s ecosystem.

Today, we’ve announced that Page Rules will be re-imagined into four product-specific rule sets: Origin Rules, Cache Rules, Configuration Rules, and Redirect Rules.

In this blog we’re going to discuss Cache Rules, and how we’re applying ten years of product iteration and learning from Page Rules to give you the tools and options to best optimize your cache.

Activating Page Rules, then and now

Adding a Page Rule is very simple: users either make an API call or navigate to the dashboard, enter a full or wildcard URL pattern (e.g. example.com/images/scr1.png or example.com/images/scr*), and tell us which actions to perform when we see that pattern. For example, a Page Rule could tell browsers to keep a copy of the response longer via “Browser Cache TTL”, or tell our cache to do so via “Edge Cache TTL”. Low effort, high impact. All this is accomplished without fighting origin configuration or writing a single line of code.

Under the hood, a lot is happening to make that rule scale: we turn every rule condition into regexes, matching them against the tens of millions of requests per second across 275+ data centers globally. The compute necessary to process and apply new values on the fly across the globe is immense and corresponds directly to the number of rules we are able to offer to users. By moving cache actions from Page Rules to Cache Rules we can allow for users to not only set more rules, but also to trigger these rules more precisely.

More than a URL

Users of Page Rules are limited to specific URLs or URL patterns to define how browsers or Cloudflare cache their websites’ files. Cache Rules allow users to set caching behavior on additional criteria, such as HTTP request headers or the requested file type. Users can also continue to match on the requested URL, as in our Page Rules example earlier. With Cache Rules, users can now define this behavior on one or more of the available fields.

For example, if a user wanted to specify cache behavior for all image/png content-types, it’s now as simple as pushing a few buttons in the UI or writing a small expression in the API. Cache Rules give users precise control over when and how Cloudflare and browsers cache their content. Cache Rules allow rules to be triggered on request header values that can be simply defined, like:

any(http.request.headers["content-type"][*] == "image/png")

This triggers the Cache Rule to be applied to all image/png media types. Additionally, users may also leverage other request headers like cookie values, user-agents, or hostnames.

As a plus, these matching criteria can be stacked and configured with operators like AND and OR, providing additional simplicity in building complex rules from many discrete blocks, e.g. if you would like to target both image/png AND image/jpeg.
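
For instance, a single expression targeting both media types from the earlier example might look like the following (illustrative; see the Cache Rules documentation for the exact fields and operators available):

any(http.request.headers["content-type"][*] == "image/png") or any(http.request.headers["content-type"][*] == "image/jpeg")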

For the full list of fields and conditionals you can apply Cache Rules to, please refer to the Cache Rules documentation.

Visibility into how and when Rules are applied

Our current offerings of Page Rules, Workers, and Transform Rules can all manipulate caching functionality for our users’ content. Often, there is some trial and error required to make sure that the confluence of several rules and/or Workers are behaving in an expected manner.

As part of upgrading Page Rules we have separated it into four new products:

  1. Origin Rules
  2. Cache Rules
  3. Configuration Rules
  4. Redirect Rules

This gives users a better understanding of how and when different parts of the Cloudflare stack are activated, reducing spin-up and debug time. We will also be providing additional visibility in the dashboard for when rules are activated as they go through Cloudflare.

These rule sets are evaluated in a defined order, and our users may take advantage of this strict precedence by chaining the results of one product into another. For example, the output of URL rewrites in Transform Rules will feed into the actions of Cache Rules, and the output of Cache Rules will feed into IP Access Rules, and so on.

In the future, we plan to increase this visibility further to allow for inputs and outputs across the rules products to be observed so that the modifications made on our network can be observed before the rule is even deployed.

Cache Rules. What are they? Are they improved? Let’s find out!

To start, Cache Rules will have all the caching functionality currently available in Page Rules. Users will be able to:

  • Tell Cloudflare to cache an asset or not,
  • Alter how long Cloudflare should cache an asset,
  • Alter how long a browser should cache an asset,
  • Define a custom cache key for an asset,
  • Configure how Cloudflare serves stale, revalidates, or otherwise uses header values to direct cache freshness and content continuity,

And so much more.

Cache Rules are intuitive and work similarly to our other ruleset engine-based products announced today: API or UI conditionals for URL or request headers are evaluated, and if matching, Cloudflare and browser caching options are configured on behalf of the user. For all the different options available, see our Cache Rules documentation.

Under the hood, Cache Rules evaluate rules in a more targeted way, so that additional rules can be supported per user and across the whole engine. What this means for our users is that by consuming less CPU for rule evaluations, we’re able to support more rules per user. For specifics on how many additional Cache Rules you’ll be able to use, please see the Future of Rules blog.

How can you use Cache Rules today?

Cache Rules are available today in beta and can be configured via the API, Terraform, or UI in the Caching tab of the dashboard. We welcome you to try the functionality and provide us feedback for how they are working or what additional features you’d like to see via community posts, or however else you generally get our attention 🙂.

If you have Page Rules implemented for caching on the same path, Cache Rules will take precedence by design. For our more patient users, we plan on releasing a one-click migration tool for Page Rules in the near future.

What’s in store for the future of Cache Rules?

In addition to granular control and increased visibility, the new rules products also open the door to more complex features: recommending rules that help customers achieve better cache hit ratios and reduce their egress costs, adding additional caching actions and visibility so you can see precisely how Cache Rules will alter the headers Cloudflare uses to cache content, and allowing customers to run experiments with different rule configurations and see the outcome firsthand. These possibilities represent the tip of the iceberg for the next iteration of how customers will use rules on Cloudflare.

Try it out!

We look forward to you trying Cache Rules and providing feedback on what you’d like to see us build next.

IDC MarketScape positions Cloudflare as a Leader among worldwide Commercial CDN providers

Post Syndicated from Vivek Ganti original https://blog.cloudflare.com/idc-marketscape-cdn-leader-2022/

We are thrilled to announce that Cloudflare has been positioned in the Leaders category in the IDC MarketScape: Worldwide Commercial CDN 2022 Vendor Assessment (doc #US47652821, March 2022).

You can download a complimentary copy here.

The IDC MarketScape evaluated 10 CDN vendors based on their current capabilities and future strategies for delivering Commercial CDN services. Cloudflare is recognized as a Leader.

At Cloudflare, we release products at a dizzying pace. When we talk to our customers, we hear again and again that they appreciate Cloudflare for our relentless innovation. In 2021 alone, over the course of seven Innovation Weeks, we launched a diverse set of products and services that made our customers’ experiences on the Internet even faster, more secure, more reliable, and more private.

We leverage economies of scale and network effects to innovate at a fast pace. Of course, there’s more to our secret sauce than our pace of innovation. In the report, IDC notes that Cloudflare is “a highly innovative vendor and continues to invest in its competencies to support advanced technologies such as virtualization, serverless, AI/ML, IoT, HTTP3, 5G and (mobile) edge computing.” In addition, IDC also recognizes Cloudflare for its “integrated SASE offering (that) is appealing to global enterprise customers.”

Built for the modern Internet

Building fast scalable applications on the modern Internet requires more than just caching static content on servers around the world. Developers need to be able to build applications without worrying about underlying infrastructure. A few years ago, we set out to revolutionize the way applications are built, so developers didn’t have to worry about scale, speed, or even compliance. Our goal was to let them build the code, while we handle the rest. Our serverless platform, Cloudflare Workers, aimed to be the easiest, most powerful, and most customizable platform for developers to build and deploy their applications.

Workers was designed from the ground up for an edge-first serverless model. Since Cloudflare started with a distributed edge network, rather than trying to push compute from large centralized data centers out into the edge, working under those constraints forced us to innovate.

Today, Workers serves hundreds of thousands of developers, ranging from hobbyists to enterprises all over the world, and handles millions of requests per second.

According to the IDC MarketScape: “The Cloudflare Workers developer platform, based on an isolate serverless architecture, is highly customizable and provides customers with a shortened time to market which is crucial in this digitally led market.”

Data you care about, at your fingertips

Our customers today have access to extensive analytics on the dashboard and via the API around network performance, firewall actions, cache ratios, and more. We provide analytics based on raw events, which means that we go beyond simple metrics and provide powerful filtering and analysis capabilities on high-dimensionality data.

And our insights are actionable. For example, customers who are looking to optimize cache performance can analyze specific URLs and see not just hits and misses but content that is expired or revalidated (indicating a short TTL). All events, both directly in the console and in the logs, are available within 30 seconds or less.

The IDC MarketScape notes that the “self-serve portal and capabilities that include dashboards with detailed analytics as well as actionable content delivery and security analytics are complemented by a comprehensive enhanced services suite for enterprise grade customers.”

The only unified SASE solution in the market

Cloudflare’s SASE offering, Cloudflare One, continues to grow and provides a unified and comprehensive solution to our customers. With our SASE offering, we aim to be the network that any business can plug into to deliver the fastest, most reliable, and most secure experiences to their customers. Cloudflare One combines network connectivity services with Zero Trust security services on our purpose-built global network.

Cloudflare Access and Gateway services natively work together to secure connectivity for any user to any application and Internet destination. To enhance threat and data protection, Cloudflare Browser Isolation and CASB services natively work across both Access and Gateway to fully control data in transit, at rest, and in use.

The old model of the corporate network has been made obsolete by mobile, SaaS, and the public cloud. We believe Zero Trust networking is the future, and we are proud to be enabling that future. The IDC MarketScape notes: “Cloudflare’s enterprise security Zero Trust services stack is extensive and meets secure access requirements of the distributed workforce. Its data localization suite and integrated SASE offering is appealing to global enterprise customers.”

A one-stop, truly global solution

Many global companies looking to do business in China are often restricted in their operations due to the country’s unique regulatory, political, and trade policies.

Core to Cloudflare’s mission of helping build a better Internet is making it easy for our customers to improve the performance, security, and reliability of their digital properties, no matter where in the world they might be, and this includes mainland China. Our partnership with JD Cloud & AI allows international businesses to grow their online presence in China without having to worry about managing separate tools with separate vendors for security and performance in China.

Just last year, we made advancements to allow customers to serve their DNS in mainland China. This means DNS queries are answered directly from one of the JD Cloud Points of Presence (PoPs), leading to faster response times and improved reliability. This is in addition to providing DDoS protection, WAF, serverless compute, SSL/TLS, and caching services from more than 35 locations in mainland China.

Here’s what the IDC MarketScape noted about Cloudflare’s China network: “Cloudflare’s strategic partnership with JD Cloud enables the vendor to provide its customers cached content in-country at any of their China data centers from origins outside of mainland China and provide the same Internet performance, security, and reliability experience in China as the rest of the world.”

A unified network that is fast, secure, reliable, customizable, and global

One of the earliest architectural decisions we made was to run the same software stack of our services across our ever-growing fleet of servers and data centers. So whether it is content caching, serverless compute, zero trust functionality, or our other performance, security, or reliability services, we run them from all of our physical points of presence. This also translates into faster performance and robust security policies for our customers, all managed from the same dashboard or APIs. This strategy has been a key enabler for us to expand our customer base significantly over the years. Today, Cloudflare’s network spans 250 cities across 100+ countries and has millions of customers, of which more than 140,000 are paying customers.

In the IDC MarketScape: Worldwide Commercial CDN 2022 Vendor Assessment, IDC notes, “[Cloudflare’s] clear strategy to invest in new technology but also expand its network as well as its sales machine across these new territories has resulted in a tremendous growth curve in the past years.”

To that, we’d humbly like to say that we are just getting started.

Stay tuned for more product and feature announcements on our blog. If you’re interested in contributing to Cloudflare’s mission, we’d love to hear from you.

Introducing: Smarter Tiered Cache Topology Generation

Post Syndicated from Alex Krivit original https://blog.cloudflare.com/introducing-smarter-tiered-cache-topology-generation/

Caching is a magic trick. Instead of a customer’s origin responding to every request, Cloudflare’s 200+ data centers around the world respond with content that is cached geographically close to visitors. This dramatically improves the load performance for web pages while decreasing the bandwidth costs by having Cloudflare respond to a request with cached content.

However, if content is not in cache, Cloudflare data centers must contact the origin server to receive the content. This isn’t as fast as delivering content from cache. It also places load on an origin server, and is more costly compared to serving directly from cache. These issues can be amplified depending on the geographic distribution of a website’s visitors, the number of data centers contacting the origin, and the available origin resources for responding to requests.

To decrease the number of times our network of data centers communicates with an origin, we organize data centers into tiers so that only upper-tier data centers can request content from an origin, and they then spread content to lower tiers. This means content loads faster for visitors, is cheaper to serve, and consumes fewer origin resources.

Today, I’m thrilled to announce a fundamental improvement to Argo Tiered Cache we’re calling Smart Tiered Cache Topology. When enabled, Argo Tiered Cache will now dynamically select the single best upper tier for each of your website’s origins while providing tiered cache analytics showing how your custom topology is performing.

Smarter Tiered Cache Topology Generation

Tiered Cache is part of Argo, a constellation of products that analyzes and optimizes routing decisions across the global Internet in real-time by processing information from every Cloudflare request to determine which routes to an origin are fast, which are slow, and what the optimum path from visitor to content is at any given moment. Previously, Argo Tiered Cache would use a static collection of upper-tier data centers to communicate with the origin. With the improvements we’re announcing today, Tiered Cache can now dynamically find the single best upper tier for an origin using Argo performance and routing data. When Argo is enabled and a request for particular content is made, we collect latency data for each request to pick the optimal path. Using this latency data, we can determine how well any upper-tier data center is connected with an origin and can empirically select the best data center with the lowest latency to be the upper tier for an origin.

Argo Tiered Cache

Taking one step back, tiered caching is a practice where Cloudflare’s network of global data centers are subdivided into a hierarchy of upper tiers and lower tiers. In order to control bandwidth and number of connections between an origin and Cloudflare, only upper tiers are permitted to request content from an origin and must propagate that information to the lower tiers. In this way, Cloudflare data centers first talk to each other to find content before asking the origin. This practice improves bandwidth efficiency by limiting the number of data centers that can ask the origin for content, reduces origin load, and makes websites more cost-effective to operate. Argo Tiered Cache customers only pay for data transfer between the client and edge, and we take care of the rest. Tiered caching also allows for improved performance for visitors, because distances and links traversed between Cloudflare data centers are generally shorter and faster than the links between data centers and origins.

Previously, when Argo Tiered Cache was enabled for a website, several of Cloudflare’s largest and most-connected data centers were determined to be upper tiers and could pull content from an origin on a cache MISS. While utilizing a topology consisting of numerous upper-tier data centers may be globally performant, we found that cost-sensitive customers generally wanted to find the single best upper tier for their origin to ensure efficient data transfer of their content to Cloudflare’s network. We built Smart Tiered Cache Topology for this reason.

How to enable Smart Tiered Cache Topology

When you enable Argo Tiered Cache, Cloudflare now by default concentrates connections to origin servers so they come from a single data center. This is done without needing to work with our Customer Success or Solutions Engineering organization to custom configure the best single upper tier. Argo customers can generate this topology by:

  • Logging into your Cloudflare account.
  • Navigating to the Traffic tab in the dashboard.
  • Ensuring you have Argo enabled.
  • From there, Non-Enterprise Argo customers will automatically be enrolled in Smart Tiered Cache Topology without needing to make any additional changes.

Enterprise customers can select the type of topology they’d like to generate.

Self-serve Argo customers are automatically enrolled in Smart Tiered Cache Topology

Enterprise customers can determine the tiered cache topology that works best for them.

More data, fewer problems

Once enabled, in addition to performance and cost improvements, Smart Tiered Cache Topology also delivers summary analytics on how the upper tiers are performing, so that you can monitor the cost and performance benefits your website is receiving. These analytics are available in the Cache tab of the dashboard in the Tiered Cache section. The “Primary Data Center” and “Secondary Data Center” fields show you which data centers were determined to be the best upper tier and failover for your origin. “Cached Hits” and “Hit Ratio” show you the proportion of requests that were served by the upper tier and how many needed to be forwarded on to the origin for a response. “Bytes Saved” indicates the total transfer from the upper-tier data center to the lower tiers, showing the total bandwidth saved by having Cloudflare’s lower-tier data centers ask the upper tiers for the content instead of the origin.

Smart Tiered Cache Topology works with Cloudflare’s existing products to deliver you a seamless, easy, and performant experience that saves you money and provides you useful information about how your upper tiers are working with your origins. Smart Tiered Cache Topology stands on the shoulders of some of the most resilient and useful products at Cloudflare to provide even more benefits to webmasters.

If you’re interested in seeing how Argo and Smart Tiered Cache Topology can benefit your web property, please log in to your Cloudflare account and find more information in the Traffic tab of the dashboard here.

Tiered Cache Smart Topology

Post Syndicated from Brian Bradley original https://blog.cloudflare.com/tiered-cache-smart-topology/

A few years ago, we released Argo to help make the Internet faster and more efficient. Argo observes network conditions and finds the optimal route across the Internet for origin server requests, avoiding congestion along the way.

Tiered Cache is an Argo feature that reduces the number of data centers responsible for requesting assets from the origin. With Tiered Cache active, a request in South Africa won’t go directly to an origin in North America, but will instead look in a large, nearby data center to see if the data requested is cached there first. The number and location of the data centers used by Tiered Cache are controlled by a piece of configuration called the topology. By default, we use a generic topology for every customer that strikes a balance between cache hit ratios and latency that is suitable for most users.

Today we’re introducing Smart Topology, which maximizes cache hit ratios by building on Argo’s internal infrastructure to identify the single best data center for making requests to the origin.

Standard Cache

The standard method for caching assets is to let each data center be a reverse proxy for the origin server. In this scheme, a miss in any data center causes a request to the origin for an asset. A request to the origin for one asset could be made as many times as there are data centers.

A cache miss in any data center will result in a request being sent to the origin server even if the asset is cached in some other data center. This is because the data centers are completely oblivious of each other.

Theoretically, a request for the asset would have to be sent to every data center in order to reduce the cache misses to the minimum possible. However, sending every request to every data center is not practical.

The minimum possible cache hit latency is achieved if the asset is moved into the nearest cache before the request for it is made, but this kind of prediction is generally not possible. Instead, a good heuristic is to move the asset into the nearest cache after the first cache miss.

However, the asset has to be copied from somewhere and it isn’t possible to know where in the network it might be without querying each data center.

To avoid querying each data center, a copy of the asset has to be stored in a known location after the first cache miss so it is available to other data centers. This is precisely what Tiered Cache does.

Tiered Cache

Tiered Cache improves cache hit ratios by allowing some data centers to serve as caches for others, before the latter has to make a request to the origin. With Tiered Cache, certain data centers are reverse proxies to the origin for other data centers.

If the proxied data centers make requests for the same asset, the asset will already be cached in the proxying data center and can be retrieved from there rather than from the origin. Fewer requests to the origin are made overall.

Custom Topology

In Tiered Cache, the topology describes which data center should serve as a proxy for others.

For customers, devising an optimal topology is a challenge requiring optimization and continuous maintenance. The best topology is a configuration based on information that is privately held by the customer and other information held only by Cloudflare.

For instance, knowing the desired balance of latency versus cache hit ratio is information only the customer has, but how to best make use of the Internet is something we would know. Enterprise customers usually have dedicated infrastructure teams that work with our solutions engineers to manually optimize and maintain their tiered cache topology.

Not every customer would want to personalize their topology. For this reason a generic topology exists.

Generic Topology

The generic topology is designed to achieve good latency and cache efficiency for any origin, regardless of location. A balance is struck between two constraints: cache efficiency and latency.

The generic topology has multiple proxying data centers that are distributed around the world in order to ensure that requests that result in a cache miss do not take a very long detour before going to the origin. There is a balance between the number of proxying data centers and the cache hit ratio, because the proxying data centers are oblivious to each other.

If a proxying data center is taken offline, the proxied data centers either use a fallback (if the fallback is online) or revert to behaving like Tiered Cache is disabled.

To achieve the best balance for general usage, the generic topology instructs the smaller data centers to be proxied by the larger data centers in the same geographic region.

Smart Topology

Smart Topology assumes the origin is in one place and automatically configures itself to be optimal once the customer flips a switch in the dashboard. In order to do this, Cloudflare needs to be able to determine which data center has the lowest latency to the origin without making the customer tell Cloudflare where the origin is.

Methods for Latency Determination

There are a few ways to determine which data center has the lowest latency with respect to the origin.

IP geolocation
Physical distance can be used as an approximation for latency, but Smart Topology was not built this way for a couple of reasons. First, even the best commercial IP geo database doesn’t have the required coverage and accuracy. Second, even with perfect accuracy, physical distance is a questionable approximation of Internet latency.

Probing
Latency to an IP address can be determined exactly by probing that address. The probe can just be the time required to perform the TCP handshake. Each data center probes the origin so that the latencies can be directly measured and the minimum can be found. Except for edge cases involving Anycast and TCP termination, we can assume that the latency to an IP address is the same as the latency to the origin server behind that IP address.
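
As a sketch of that idea, the TCP handshake time can be measured from Node.js as below. The host and port are placeholders, and a real prober would collect many samples and use medians, as the next section describes.

import { connect } from "node:net";

// Measure the time to complete a TCP handshake with the origin's IP.
function probe(host: string, port = 443): Promise<number> {
  return new Promise((resolve, reject) => {
    const started = process.hrtime.bigint();
    const socket = connect({ host, port }, () => {
      // Connection established: elapsed time approximates the latency
      // from this data center to the address being probed.
      const elapsedMs = Number(process.hrtime.bigint() - started) / 1e6;
      socket.destroy();
      resolve(elapsedMs);
    });
    socket.on("error", reject);
    socket.setTimeout(5000, () => {
      socket.destroy();
      reject(new Error("probe timed out"));
    });
  });
}

console.log(`${await probe("origin.example.com")} ms`);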

Topology Selection Algorithm

The goal of the topology selection algorithm is to minimize cache misses and latency. The topology chooses a single proxying data center in order to maximize the cache hit ratio. The proxying data center is chosen to be close to the origin so that the latencies of cache misses in the proxied data centers are not much worse than they would be with tiered cache turned off.

The choice should eventually become stable. Stability is important because each time the choice changes, cache misses in proxied data centers are likely to cause cache misses in the new proxying data center. Capacity is important because when a data center goes offline, it can cause a large number of cache misses. Minimizing latency to the origin is important to ensure that the network is used efficiently.

The data center selection algorithm is rather like a leaderboard of the fastest data center for each origin. As data is collected, a faster data center can knock others off a given origin’s leaderboard. This competition is based on the 24 hour median latency and is held each hour. Only a subset of data centers deemed large enough are permitted to compete.

Eventually, the choice for proxying data centers becomes stable. Over time, data centers produce competing records for each origin and less competitive records in the leaderboard are replaced as necessary. Thus, latencies for any origin on the leaderboard can only monotonically decrease. There are always physical limits in the real world, so eventually the ideal data center will set a record that is too good to beat.

Also, the leaderboard actually includes both the lowest latency data center and the second lowest latency data center. The second lowest latency data center serves as a fallback if the preferred data center is taken offline for maintenance.
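
A toy version of the leaderboard update might look like this. The types and function are invented for illustration; the real competition runs hourly over 24-hour median latencies, as described above.

// Toy leaderboard: keep the two lowest-latency data centers per origin.
type Entry = { dataCenter: string; medianLatencyMs: number };
type Leaderboard = Map<string, { preferred: Entry; fallback?: Entry }>;

// Held hourly: a data center submits its 24h median latency for an
// origin and can knock the current holders off the board.
function compete(board: Leaderboard, origin: string, entry: Entry): void {
  const current = board.get(origin);
  if (!current) {
    board.set(origin, { preferred: entry });
  } else if (entry.medianLatencyMs < current.preferred.medianLatencyMs) {
    // New record: the previous preferred data center becomes the fallback.
    board.set(origin, { preferred: entry, fallback: current.preferred });
  } else if (
    !current.fallback ||
    entry.medianLatencyMs < current.fallback.medianLatencyMs
  ) {
    board.set(origin, { ...current, fallback: entry });
  }
  // Latencies on the board only ever decrease, so the choice stabilizes.
}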

Anycast Networks
We are measuring the latency to the origin IP address and assuming that it represents the latency to the origin server, but this can break down in certain cases. A few cloud providers other than Cloudflare also use Anycast technology to provide their services. In Anycast, multiple machines can share an IP address regardless of where they are connected to the Internet, and the Internet will typically route packets destined for that address to the closest machine. If an Anycast network is used to proxy an origin server, then the apparent latency to the IP address for the origin server is actually the latency to the edge of the Anycast network rather than the latency to the origin server. The real latency to the origin server cannot be determined by probing.

The algorithm would fail to select the single best proxying data center if the latencies are not representative of the actual latency between data center and origin. Selecting the wrong data center would adversely affect latencies for requests to the origin, and could be expensive.

For instance, imagine a cloud provider provides an IP address that actually routes to multiple data centers all over the world. Packets are routed through private infrastructure to the correct destination once they enter the network. The lowest latency data center to this Anycast IP address could potentially even be on a different continent than the actual origin server. Therefore, the apparent latency cannot actually be trusted as a true measure of actual latency to the origin.

The data center selection algorithm assumes that the origin is in a single geographic location and can be probed to determine latency from each data center. These networks break one or both of these assumptions, so a procedure had to be developed in order to detect them. First, it is assumed that the IP appears in a single geographic location and is not proxied by such a network. The latency to the origin is bounded by the speed of light through fiber. Although the distance between any data center and the origin server is not known, the distances between data centers are known by Cloudflare.

Imagine putting the origin server as a pitstop on the journey between two data centers. Then, the theoretical minimum possible pair of latencies between the origin server and those two data centers can be computed. We have the latency probe data from both of these data centers to the origin, so we can check whether the observed latency is lower than what is physically possible.

The original assumption was that the origin IP address identifies an origin server that is in one location and the latency to that IP address is the latency to the origin server. If the observed latencies are faster than light then clearly the assumption is false. Smart Topology falls back to the generic topology when the original assumption does not hold. To be extra sure, we check this constraint on a bunch of data centers around the world and fall back if there is even a single physically impossible observation.
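
Concretely, the impossibility test is a triangle inequality check, sketched below. The speed-of-light-in-fiber constant (roughly 200 km per millisecond, about two thirds of c) and the helper name are assumptions for the example.

// Speed of light in fiber is about 200,000 km/s, i.e. ~200 km per ms.
const FIBER_KM_PER_MS = 200; // assumption for the sketch

// If the origin is truly a single machine, the path A -> origin -> B
// can't be shorter than the known distance between data centers A and B:
//   d(A, origin) + d(origin, B) >= d(A, B)   (triangle inequality)
// Round-trip latencies measure twice the one-way distances, so:
function physicallyPossible(
  rttAMs: number,       // measured RTT from data center A to the origin IP
  rttBMs: number,       // measured RTT from data center B to the origin IP
  distanceABKm: number, // known fiber distance between A and B
): boolean {
  const impliedPathKm = ((rttAMs + rttBMs) / 2) * FIBER_KM_PER_MS;
  return impliedPathKm >= distanceABKm;
}

// A single impossible observation across any pair of probing data centers
// flags the IP as Anycast-fronted, and Smart Topology falls back to the
// generic topology.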

The Big Picture

When Smart Topology is enabled many Cloudflare systems work together to ensure the correct data center is eventually used to request assets from the origin.

When the customer enables Tiered Cache Smart Topology, one of a few things can happen from the perspective of the origin. If a proxying data center has already been assigned to the CIDR block that encompasses the origin IP, the preferred or fallback data center is used to request assets from the origin. Otherwise, the generic topology is used to determine which proxying data centers to use to pull assets from the origin. The latency to the proxying data center should only decrease as the choice for proxying data center is updated over time.

Conclusion

Developing this technology offered many opportunities to exercise great engineering and build an impactful product. It was not done in a vacuum; we built on infrastructure that Cloudflare had already created, moving along that exponential gradient of using existing progress to make more progress. The framework opens a lot of doors to future progress, too; for instance, in the future, we can explore ways to select the ideal proxying data center even for origins behind Anycast networks that hide the true latency to the origin.

An introduction to three-phase power and PDUs

Post Syndicated from Rob Dinh original https://blog.cloudflare.com/an-introduction-to-three-phase-power-and-pdus/

An introduction to three-phase power and PDUs

Our fleet of over 200 locations comprises various generations of servers and routers. And with the ever-changing landscape of services and computing demands, it’s imperative that we manage power in our data centers right. This blog is a brief Electrical Engineering 101 session covering how power distribution units (PDUs) work, along with some good practices for how we use them. It appears to me that we could all use a bit more knowledge on this topic, and more love and appreciation for something that’s critical but usually taken for granted, like hot showers and opposable thumbs.

A PDU is a device used in data centers to distribute power to multiple rack-mounted machines. It’s an industrial-grade power strip, typically designed to supply the average power consumption of about seven US households. Advanced models have monitoring features and can be accessed via SSH or a web GUI to turn power outlets on and off. How we choose a PDU depends on which country the data center is in and what it provides in terms of voltage, phase, and plug type.

An introduction to three-phase power and PDUs

For each of our racks, all of our dual power-supply (PSU) servers have each PSU cabled to one of the two vertically mounted PDUs. As shown in the picture above, one PDU feeds a server’s PSU via a red cable, and the other PDU feeds that server’s other PSU via a blue cable. This gives us power redundancy, maximizing our service uptime; if one of the PDUs or server PSUs fails, the redundant power feed keeps the server alive.

Faraday’s Law and Ohm’s Law

Like most high-voltage applications, PDUs and servers are designed to use AC power, meaning voltage and current aren’t constant: they’re sine waves whose magnitude alternates between positive and negative at a certain frequency. For example, a 100V feed is not constantly at 100V; it bounces between 100V and -100V like a sine wave. One complete sine wave cycle is one phase of 360 degrees, and running at 50Hz means there are 50 cycles per second.

The sine wave can be explained by Faraday’s Law and by looking at how an AC power generator works. Faraday’s Law tells us that a current is induced to flow by a changing magnetic field. The figure below illustrates a simple generator with a permanent magnet rotating at constant speed next to a coil sitting in the magnet’s magnetic field. Magnetic force is strongest at the North and South ends of the magnet, so as the magnet rotates near the coil, the current flowing in the coil fluctuates. One complete rotation of the magnet represents one phase. As the North end approaches the coil, current increases from zero. Once the North end leaves, current decreases back to zero. The South end then approaches, and the current “increases” again in the opposite direction. Finally, the South end leaves, returning the current to zero. The current alternates direction every half cycle, hence the name Alternating Current.

An introduction to three-phase power and PDUs

Current and voltage in AC power fluctuate in phase, or “in tandem”, with each other. So, by the power equation P = V x I, power will always be positive. Notice on the graph below that AC power (watts) has two peaks per cycle. But for practical purposes, we’d like to use a constant power value. We do that by converting AC values into their constant “DC” equivalents using the root-mean-square (RMS) value, which for a sine wave is the peak value divided by √2. For example, in the US, our conditions are 208V 24A at 60Hz; when we look at spec sheets, all of these values can be assumed to be RMS values. When we say we’re fully utilizing a PDU’s max capacity of 5kW, the instantaneous power consumption of our machines actually bounces between 0 and 10kW (twice the 5kW average), averaging out to a constant 5kW.

An introduction to three-phase power and PDUs
An introduction to three-phase power and PDUs
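As a quick numerical sanity check of that √2 relationship (a sketch, not anything from a spec sheet), we can sample one cycle of a sine wave and compare its RMS value to peak/√2:

import math

PEAK_V = 208 * math.sqrt(2)   # peak of a 208V (RMS) feed, about 294V
SAMPLES = 100_000

# One full 360-degree cycle of the voltage sine wave.
cycle = [PEAK_V * math.sin(2 * math.pi * n / SAMPLES) for n in range(SAMPLES)]

# Root-mean-square: square the samples, take the mean, then the root.
rms = math.sqrt(sum(v * v for v in cycle) / SAMPLES)

print(round(rms, 1))                      # ~208.0
print(round(PEAK_V / math.sqrt(2), 1))    # ~208.0, same value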

It’s also critical to figure out the total power our servers will need in a rack so that it falls under the PDU’s design max power capacity. For our US example, a PDU is typically 5kW (208 volts x 24 amps), so we budget 5kW and fit as many machines as we can under that. If we need more machines and the total power would go above 5kW, we’d need to provision another power source. That could mean another set of PDUs and racks that we may not fully use depending on demand; i.e., more underutilized cost. All we can do is abide by P = V x I.

However, there is a way to increase the max power capacity economically: a 3-phase PDU. Compared to single phase, its max capacity is √3, or about 1.7, times higher. A 3-phase PDU with the same US specs above has a capacity of 8.6kW (5kW x √3), allowing us to power more machines from the same source. A 3-phase setup means thicker cables and a bigger plug, but despite being more expensive than a 1-phase PDU, its value is higher compared to two 1-phase rack setups for these reasons:

  • It’s more cost-effective, because there are fewer hardware resources to buy
  • Say the computing demand adds up to 215kW of hardware: we would need 25 3-phase racks compared to 43 1-phase racks (the arithmetic is sketched just below this list).
  • Each rack needs two PDUs for power redundancy. Using the example above, we would need 50 3-phase PDUs compared to 86 1-phase PDUs to power 215kW worth of hardware.
  • That also means a smaller rack footprint and fewer power sources provided and billed by the data center, saving us up to √3, or 1.7, times in opex.
  • It’s more resilient, because there are more circuit breakers in a 3-phase PDU — one more than in a 1-phase. For example, a 48-outlet 1-phase PDU would be split into two circuits of 24 outlets, while a 3-phase PDU would be split into three circuits of 16 outlets. If a breaker tripped, we’d lose 16 outlets with a 3-phase PDU instead of 24.
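The arithmetic behind those rack and PDU counts is simple enough to sketch out (same assumed figures as above):

import math

HARDWARE_KW = 215
ONE_PHASE_KW = 5.0                             # 208V x 24A
THREE_PHASE_KW = ONE_PHASE_KW * math.sqrt(3)   # about 8.6kW

racks_1ph = math.ceil(HARDWARE_KW / ONE_PHASE_KW)     # 43 racks
racks_3ph = math.ceil(HARDWARE_KW / THREE_PHASE_KW)   # 25 racks

# Two PDUs per rack for power redundancy.
print(racks_1ph, 2 * racks_1ph)   # 43 racks, 86 PDUs
print(racks_3ph, 2 * racks_3ph)   # 25 racks, 50 PDUs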
An introduction to three-phase power and PDUs

The PDU shown above is a 3-phase model with 48 outlets. We can see three pairs of circuit breakers for the three branches, which are intertwined with each other and colored white, grey, and black. Industry demands today pressure engineers to maximize compute performance while minimizing physical footprint, making the 3-phase PDU a widely used part of operating a data center.

What is 3-phase?

A 3-phase AC generator has three coils instead of one, spaced 120 degrees apart inside the cylindrical core, as shown in the figure below. Just like in the 1-phase generator, current flow is induced by the rotation of the magnet, creating power from each coil sequentially at every one-third of the magnet’s rotation cycle. In other words, we’re generating three 1-phase power feeds offset by 120 degrees.

An introduction to three-phase power and PDUs

A 3-phase feed is set up by joining any of its three coils into line pairs. L1, L2, and L3 coils are live wires with each on their own phase carrying their own phase voltage and phase current. Two phases joining together form one line carrying a common line voltage and line current. L1 and L2 phase voltages create the L1/L2 line voltage. L2 and L3 phase voltages create the L2/L3 line voltage. L1 and L3 phase voltages create the L1/L3 line voltage.

An introduction to three-phase power and PDUs

Let’s take a moment to clarify the terminology. Some other sources may refer to line voltage (or current) as line-to-line or phase-to-phase voltage (or current). It can get confusing, because line voltage is the same as phase voltage in 1-phase circuits, as there’s only one phase. Also, the magnitude of the line voltage is equal to the magnitude of the phase voltage in 3-phase Delta circuits, while the magnitude of the line current is equal to the magnitude of the phase current in 3-phase Wye circuits.

Conversely, the line current equals the phase current times √3 in Delta circuits. In Wye circuits, the line voltage equals the phase voltage times √3.

In Delta circuits:
Vline = Vphase
Iline = √3 x Iphase

In Wye circuits:
Vline = √3 x Vphase
Iline = Iphase
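Those four relationships are easy to encode. Here’s a small helper (a sketch) that converts phase values to line values for either circuit type:

import math

SQRT3 = math.sqrt(3)

def line_values(circuit, v_phase, i_phase):
    """Return (V_line, I_line) given the phase voltage and current."""
    if circuit == "delta":
        return v_phase, SQRT3 * i_phase   # Vline = Vphase, Iline = √3 x Iphase
    if circuit == "wye":
        return SQRT3 * v_phase, i_phase   # Vline = √3 x Vphase, Iline = Iphase
    raise ValueError("circuit must be 'delta' or 'wye'")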

Delta and Wye circuits are the two ways three wires can be joined together. This happens both at the power source with three coils and at the PDU end with three branches of outlets. Note that the generator and the PDU don’t need to match each other’s circuit type.

An introduction to three-phase power and PDUs

On PDUs, these phases join when we plug servers into the outlets. So we conceptually take the coil wirings above and replace the coils with resistors to represent servers. Below is a simplified wiring diagram of a 3-phase Delta PDU showing the three line pairs as three modular branches. Each branch carries two phase currents and one common voltage drop.

An introduction to three-phase power and PDUs

And this one below is of a 3-phase Wye PDU. Note that Wye circuits have an additional line, known as the neutral line, where all three phases meet at one point. Here each branch carries one phase and the neutral line, and therefore one common current. The neutral line isn’t counted as one of the phases.

An introduction to three-phase power and PDUs

Thanks to the neutral line, a Wye PDU can offer a second voltage source that is √3 times lower for smaller devices, like laptops or monitors. Common voltage pairs for Wye PDUs are 230V/400V, or 120V/208V in North America.
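The arithmetic behind those pairs is just the Wye relationship in reverse:

Vphase = Vline / √3
400V / √3 ≈ 230V
208V / √3 ≈ 120V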

Where does the √3 come from?

Why are we multiplying by √3? As the name implies, we are adding phasors. Phasors are complex numbers representing sine wave functions. Adding phasors is like adding vectors. Say your GPS tells you to walk 1 mile East (vector a), then 1 mile North (vector b). You walked 2 miles, but you only moved 1.4 miles NE from your original location (vector a+b). That 1.4 miles of “work” is what we want.

Let’s apply this to L1 and L2 in a Delta circuit. Combining phases L1 and L2 gives us the L1/L2 line, whose voltage is the difference between the two phase voltages. We assume the two coils are identical, and let α represent the voltage magnitude of each phase. The two phases are offset by 120 degrees, as designed in the 3-phase power generator:

|L1| = |L2| = α
L1 = |L1|∠0° = α∠0°
L2 = |L2|∠-120° = α∠-120°

Using vector arithmetic to solve for L1/L2 (subtracting, since the line voltage is the potential difference between the two phases):

L1/L2 = L1 – L2

An introduction to three-phase power and PDUs
An introduction to three-phase power and PDUs

Convert L1/L2 into polar form:

An introduction to three-phase power and PDUs
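Carrying that conversion through (a reconstruction of the algebra shown in the figure, assuming two identical phases):

L1/L2 = α∠0° – α∠-120°
      = α x (1 – (-1/2 – j√3/2))
      = α x (3/2 + j√3/2)
|L1/L2| = α x √((3/2)² + (√3/2)²) = √3 x α
∠(L1/L2) = arctan((√3/2) / (3/2)) = 30°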

Since voltage is a scalar, we’re only interested in the “work”:

|L1/L2| = √3α

The same α also applies to L3. This means that for any of the three line pairs, we multiply the phase voltage by √3 to calculate the line voltage.

Vline = √3 x Vphase

Now, with the three phase powers being equal, we can add them all to get the overall effective power. The derivation below works for both Delta and Wye circuits, using the line/phase relationships above (in Delta, Iphase = Iline/√3; in Wye, Vphase = Vline/√3; either way, one √3 cancels):

Poverall = 3 x Pphase
Poverall = 3 x (Vphase x Iphase)
Poverall = √3 x Vline x Iline

Using the US example, Vline is 208V and Iline is 24A. That puts the overall 3-phase power at 8646W (√3 x 208V x 24A), or 8.6kW. Therein lies the biggest advantage of 3-phase systems: by adding two more sets of coils and wires (ignoring the neutral wire), we get a generator that produces √3, or about 1.7, times more power!
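Plugging the numbers in (a quick sketch using the US feed values):

import math

V_LINE = 208.0   # volts
I_LINE = 24.0    # amps

single_phase_w = V_LINE * I_LINE                  # 4992W, about 5kW
three_phase_w = math.sqrt(3) * V_LINE * I_LINE    # 8646W, about 8.6kW

print(round(three_phase_w / single_phase_w, 2))   # 1.73, i.e. √3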

Dealing with 3-phase

The derivation in the section above assumes that the magnitudes of all three phases are equal, but in practice that’s almost never the case. We rarely have servers and switches evenly distributed across all three branches of a PDU. Each machine may have a different load and different specs, so power draw can differ wildly, potentially causing a dramatic phase imbalance, and a heavily imbalanced setup can hinder the PDU’s available capacity.

A perfectly balanced and fully utilized PDU at 8.6kW means that each of its three branches carries 2.88kW of power consumed by machines. Laid out simply, it’s spread 2.88 + 2.88 + 2.88: the best-case scenario. If we were to take 1kW worth of machines out of one branch, spreading power as 2.88 + 1.88 + 2.88, imbalance is introduced and the PDU is underutilized, but we’re fine. However, if we were to put that 1kW back on another branch — like 3.88 + 1.88 + 2.88 — the PDU is over capacity, even though the sum is still 8.6kW. In fact, it would be over capacity even if we added just 500W instead of 1kW to the wrong branch, reaching 3.38 + 1.88 + 2.88 (8.1kW).

That’s because an 8.6kW PDU is spec’d for a maximum of 24A on each phase current. Overloading one of the branches can push a phase current over 24A. Theoretically, we could load a single branch until its current reaches 24A and leave the other two branches unused, but that would effectively turn it into a 1-phase PDU, losing the benefit of the √3 multiplier. In reality, each branch has fuses rated below 24A (usually 20A) to ensure we never get that high and cause overcurrent issues. Therefore, the same 8.6kW PDU would have one of its branches trip at 4.2kW (208V x 20A).

Loading up one branch is the easiest way to overload the PDU. Being heavily imbalanced significantly lowers PDU capacity and increases risk of failure. To help minimize that, we must:

  • Ensure that total power consumption of all machines is under the PDU’s max power capacity
  • Try to be as phase-balanced as possible by spreading cabling evenly across the three branches
  • Ensure that the sum of phase currents from powered machines at each branch is under the fuse rating at the circuit breaker.

This spreadsheet from Raritan is very useful when designing racks.
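Those three rules can also be sanity-checked programmatically. Below is a rough sketch for a Delta PDU, using this post’s US figures and assuming purely resistive loads (real tools, like the spreadsheet above, are more thorough):

import math

V_LINE = 208.0       # line voltage, volts
PHASE_LIMIT_A = 24   # max phase current per the PDU spec
FUSE_A = 20          # per-branch fuse rating

def check_delta_pdu(p12, p23, p31):
    """p12/p23/p31: watts of machines plugged into each branch."""
    i12, i23, i31 = (p / V_LINE for p in (p12, p23, p31))
    for name, amps in (("L1/L2", i12), ("L2/L3", i23), ("L3/L1", i31)):
        assert amps <= FUSE_A, f"{name} branch exceeds its {FUSE_A}A fuse"
    # Each phase current is the phasor difference of the two branch
    # currents meeting at that line, 120 degrees apart, which works
    # out to sqrt(a^2 + b^2 + a*b) for resistive loads.
    def phase(a, b):
        return math.sqrt(a * a + b * b + a * b)
    for name, amps in (("L1", phase(i12, i31)),
                       ("L2", phase(i12, i23)),
                       ("L3", phase(i23, i31))):
        assert amps <= PHASE_LIMIT_A, f"{name} exceeds {PHASE_LIMIT_A}A"

check_delta_pdu(2880, 2880, 2880)   # balanced 8.6kW: each phase is ~24A, OK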

For the sake of simplicity, let’s ignore other machines like switches. Our latest 2U4N servers are rated at 1800W, which means we can fit a maximum of four of these 2U4N chassis (8600W / 1800W = 4.7 chassis). Rounding up to five would push rack-level power consumption to 9kW, so that’s a no-no.

Splitting four chassis across three branches evenly is impossible, forcing one of the branches to carry two chassis. That leads to non-ideal phase balancing:

An introduction to three-phase power and PDUs

Keeping phase currents under 24A, there’s only 1.1A (24A – 22.9A) of headroom on L1 or L2 before the PDU gets overloaded. Say we want to add as many machines as we can under the PDU’s power capacity. One solution is to add up to 242W on the L1/L2 branch, at which point both the L1 and L2 currents reach their 24A limit.

An introduction to three-phase power and PDUs

Alternatively, we can add up to 298W on the L2/L3 branch until L2 current reaches 24A. Note we can also add another 298W on the L3/L1 branch until L1 current reaches 24A.

An introduction to three-phase power and PDUs

In the examples above, we can see that various solutions are possible. Adding two 298W machines, one each on L2/L3 and L3/L1, is the most phase-balanced solution given the parameters. Even so, the PDU’s capacity isn’t fully utilized, topping out at 7.8kW.

Dealing with an 1800W server is not ideal, because whichever branch we choose to power it from, the phase balance swings significantly. Thankfully, our Gen X servers take up less space and are more power efficient. The smaller footprint gives us more flexibility and fine-grained control over our racks in many of our diverse data centers. Assuming each 1U server is 450W (as if we physically split the 1800W 2U4N into four 1U nodes, each with its own power supply), we’re now able to fit 18 nodes. That’s two more nodes than the four-2U4N setup:

An introduction to three-phase power and PDUs
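A quick sanity check of that layout (a sketch assuming 450W per node, spread six nodes per branch on the same Delta PDU as above):

NODE_W = 450
NODES = 18

total_w = NODE_W * NODES        # 8100W, under the 8646W capacity
branch_w = total_w / 3          # 2700W per branch, evenly spread

branch_amps = branch_w / 208              # ~13.0A, under the 20A fuse
phase_amps = branch_amps * 3 ** 0.5       # ~22.5A, under the 24A limit
print(total_w, round(branch_amps, 1), round(phase_amps, 1))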

Adding two more servers here means we’ve increased our value by 12.5%. While a full Total Cost of Ownership calculation would consider more factors, this is still a great way to show we can be smarter with asset costs.

Cloudflare provides the back-end services so that our customers can enjoy the performance, reliability, security, and global scale of our edge network. Meanwhile, we manage all of our hardware in over 100 countries, with various power standards and compliance requirements, and ensure that our physical infrastructure is as reliable as it can be.

There’s no Cloudflare without hardware, and there’s no Cloudflare without power. Want to know more? Watch this Cloudflare TV segment about power: https://cloudflare.tv/event/7E359EDpCZ6mHahMYjEgQl.