Tag Archives: Cloud Storage

Shooting for the Clouds: How One Photo Storage Service Moved Beyond Physical Devices

Post Syndicated from Barry Kaufman original https://www.backblaze.com/blog/shooting-for-the-clouds-how-one-photo-storage-service-moved-beyond-physical-devices/

The sheer number of creative and unique ways our customers and partners utilize Backblaze B2 Cloud Storage never ceases to amaze us. Whether it’s pairing our storage with a streaming platform to deliver seamless video or protecting research data that is saving lives, we applaud their ingenuity. From time to time, we like to put the spotlight on one of these inspired customers, which brings us to the company we’re highlighting today: Monument, a photo management service with a strong focus on security and privacy.


Situation: The Monument story started with a physical device where customers could securely save photos, but they saw the winds shifting to the cloud. They wanted to offer users the flexibility and automation that the cloud provides while maintaining their focus on privacy and security.

Solution: Monument launched their cloud-based offering, Monument Cloud, with Backblaze as its storage backbone. User photos are encrypted and stored in Backblaze B2 Cloud Storage, and are accessible via the Monument Cloud app.

Result: Monument Cloud eliminates the need for users to maintain a physical device at their homes or offices. Users just install the Monument Cloud app on their devices and their photos and videos are automatically backed up, fully encrypted, organized, and shareable.

What Is Monument?

Monument was founded in 2016 by a group of engineers and designers who wanted an easy way to back up and organize their photos without giving up their privacy and security. Since smartphones saturated the market, the average person’s digital photo archive has grown exponentially. The average user has around 2,100 photos on their smartphone at any given time, and that’s not even counting the photos stashed away on various old laptops, hard drives, USBs, and devices.

Photo management services like Google Photos stepped in to help folks corral all of those memories. But, most photo management services are a black box—you don’t know how they’re using your data or your images. Monument wanted to give folks the same functionality as something like iCloud or Google Photos while also keeping their private data private.

“There are plenty of photo storage solutions right now, but they come with limitations and fail to offer transparency about their privacy policies: how photos are being used or processed,” said Monument’s co-founder Ercan Erciyes. “At Monument, we reimagined how we store and access our photos and provided a clutter-free experience while keeping users in the center, not their personal data.”

They launched their first generation product in 2017—a physical storage device with advanced AI software that helps users manage photo libraries between devices and organize photos by faces, scenery, and other properties. The hardware side was fueled by two rounds of Kickstarter funding, each helping create new versions of the company’s smart storage device powered by a neural processing unit (NPU) that lived on-device and allowed access from anywhere.

An Eye for Secure Photo Storage

That emphasis on privacy fueled the software side of Monument’s offering, an AI-driven approach that allows easy searchability of photos without processing any of the metadata on Monument’s end. Advanced image recognition couples with slick de-duplication features for an experience that catalogs photos without exposing photographers’ data to algorithms that influence their choices. No ads, no profiling, no creepy trackers, and Monument doesn’t use or sell customers’ personal data.

“We were getting a lot of questions along the lines of, ‘What happens if my house catches fire?’ or ‘What if there is physical damage to the device?’ so we could see there was a lot of interest in a cloud solution.”

—Ercan Erciyes, Co-Founder, Monument Labs, Inc.

The Gathering Cloud

With the rise of cloud storage, Monument saw their typical consumer shifting away from on-prem solutions. “We were getting a lot of questions along the lines of, ‘What happens if my house catches fire?’ or ‘What if there is physical damage to the device?’ so we could see there was a lot of interest in a cloud solution,” said Ercan. “Plus there were a lot of users that didn’t want a physical device in their home.”

Their answer: Offer the same privacy-first service through a comprehensive cloud solution.

Using Free Credits Wisely

Launching a cloud-based storage service built around their philosophy of privacy and security was a clear necessity for the company’s future. To kick off their move to the cloud, Monument used free startup credits from AWS. But they knew free credits wouldn’t last forever. Rather than using the credits to build a minimum viable product as fast as humanly possible, they took a very measured approach. “The credits are sweet,” Ercan said, “but you need to pay attention to your long-term vision. You need to have a backup plan, so to speak.” (We think so, too.)

Ercan ran the numbers with success in mind and realized they’d ultimately lose money if they built the infrastructure for Monument Cloud on AWS. He also didn’t want to accumulate tech debt and become locked in to AWS.

They ended up using the credits to develop the AI model, but not to build their infrastructure. For that they turned to specialized cloud providers.

Integrating Backblaze B2 Cloud Storage

Monument created a lean tech stack that incorporated Backblaze B2 for long-term encrypted storage. They run their AI software on Vultr, a Backblaze compute partner that offers free egress between the two services. They also use another specialized cloud provider to store the thumbnails that are displayed in the Monument Cloud app. The cloud service has quickly become the company’s flagship offering, drawing 25,000 active users.

Group Photos: Serving New Customers

With infrastructure that will scale without cutting into their margins, Monument is poised to serve an increasing number of customers who care about what happens to their personal data. More and more, customers are seeking out alternatives to big name cloud providers, using services like DuckDuckGo instead of Google Search or WhatsApp instead of garden variety text messaging apps. With a distributed, multi-cloud system, they can serve these types of customers with a cloud option while keeping data privacy front and center. And the customers that gravitate to this value proposition are wide-ranging.

Of course, the first ones you might think of would be prolific photo takers or even amateur photographers, but Ercan pointed out some surprising use cases for their technology. “We are seeing a lot of different use cases coming up from schools, real estate companies, and even elder care systems,” he said. With Monument’s new cloud solution, classrooms are exploring new online frontiers in education, and families scattered around the world are able to share photos with their elderly relatives.

A Monument to Security

Challenging monster brands like Google is no small task as a small team of just five people. Monument does it by keeping a laser focus on their core values and their customers’ needs. “If you keep the user’s needs in the center, building a solution doesn’t require an army of engineers,” Ercan said. Without having to worry about how to use customer data to build algorithms that keep advertisers happy, Monument can focus on serving their customers what they actually need—a photo management solution that just works.

Monument Co-founders Semih Hazar (left) and Ercan Erciyes (right)

Monument and Backblaze

Whether you’re the family photographer, the office party chronicler, or you just have a convoluted system of hard drives stickered and slotted onto a shelf somewhere that you’d like to get rid of, first and foremost: Make sure you’re availing yourself of the very reasonable storage available from Backblaze for archiving or backing up your data.

After you’re done with that: Check out Monument.

The post Shooting for the Clouds: How One Photo Storage Service Moved Beyond Physical Devices appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze Drive Stats for 2022

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/backblaze-drive-stats-for-2022/

As of December 31, 2022, we had 235,608 drives under management. Of that number, there were 4,299 boot drives and 231,309 data drives. This report will focus on our data drives. We’ll review the hard drive failure rates for 2022, compare those rates to previous years, and present the lifetime failure statistics for all the hard drive models active in our data center as of the end of 2022. Along the way, we’ll share our observations and insights on the data presented and, as always, we look forward to you doing the same in the comments section at the end of the post.

2022 Hard Drive Failure Rates

At the end of 2022, Backblaze was monitoring 231,309 hard drives used to store data. For our evaluation, we removed 388 drives from consideration because they were either used for testing or belonged to drive models for which we did not have at least 60 drives. This leaves us with 230,921 hard drives to analyze for this report.

Observations and Notes

One Zero for the Year

In 2022, only one drive model had zero failures, the 8TB Seagate (model: ST8000NM000A). That “zero” does come with some caveats: We have only 79 of these drives in service and the model has a limited number of drive days: 22,839. These drives are used as spares to replace 8TB drives that have failed.

What About the Old Guys?

  • The 6TB Seagate (model: ST6000DX000) drive is the oldest in our fleet with an average age of 92.5 months. In 2021, it had an annualized failure rate (AFR) of just 0.11%, but has slipped a bit to 0.68% for 2022. A very respectable number any time, but especially after nearly eight years of duty.
  • The 4TB Toshiba (model: MD04ABA400V) drives have an average age of 91.3 months. In 2021, this drive had an AFR of 2.04% and that jumped to 3.13% for 2022, which included three drive failures. Given the limited number of drives and drive days for this model, had there been only two drive failures in 2022, the AFR would have been 2.08%, or nearly the same as 2021.
  • Both of these drive models have a relatively small number of drive days, so confidence in the AFR numbers is debatable. That said, both drives have performed well over their lifespan.
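For readers who want to check figures like these, here is a minimal sketch of the AFR arithmetic, using the failures-per-drive-year formula Backblaze describes in its Drive Stats reports. The function name is ours. The Toshiba drive-day count is not quoted in this post, so it is back-calculated from the published 3.13% and should be treated as an approximation.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return AFR as a percentage: failures per drive-year of service."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    return failures / (drive_days / 365) * 100


# 8TB Seagate ST8000NM000A: zero failures over 22,839 drive days (from the post).
seagate_afr = annualized_failure_rate(0, 22_839)

# 4TB Toshiba: drive days are NOT given in the post, so estimate them
# from the published figure of three failures at a 3.13% AFR.
toshiba_drive_days = round(3 * 365 / 0.0313)
toshiba_afr_three = annualized_failure_rate(3, toshiba_drive_days)
toshiba_afr_two = annualized_failure_rate(2, toshiba_drive_days)

print(f"Seagate 8TB AFR: {seagate_afr:.2f}%")
print(f"Toshiba 4TB AFR with 3 failures: {toshiba_afr_three:.2f}%")
print(f"Toshiba 4TB AFR with 2 failures: {toshiba_afr_two:.2f}%")
```

With so few drive days, a single failure moves the Toshiba AFR by roughly a full percentage point, which is why the confidence caveat above matters.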

New Models

In 2022, we added five new models while retiring zero, giving us a total of 29 different models we are tracking. Here are the five new models:

  1. HUH728080ALE604–8TB
  2. ST8000NM000A–8TB
  3. ST16000NM002J–16TB
  4. MG08ACA16TA–16TB
  5. WUH721816ALE6L4–16TB

The two 8TB drive models are being used to replace failed 8TB drives. The three 16TB drive models are additive to the inventory.

Comparing Drive Stats for 2020, 2021, and 2022

The chart below compares the AFR for each of the last three years. The data for each year is inclusive of that year only and the operational drive models present at the end of each year.

Drive Failure Was Up in 2022

After a slight increase in AFR from 2020 to 2021, there was a more notable increase in AFR in 2022 from 1.01% in 2021 to 1.37%. What happened? In our Q2 2022 and Q3 2022 quarterly Drive Stats reports, we noted an increase in the overall AFR from the previous quarter and attributed it to the aging fleet of drives. But, is that really the case? Let’s take a look at some of the factors at play that could cause the rise in AFR for 2022. We’ll start with drive size.

Drive Size and Drive Failure

The chart below compares 2021 and 2022 AFR for our large drives (which we’ve defined as 12TB, 14TB, and 16TB drives) to our smaller drives (which we’ve defined as 4TB, 6TB, 8TB, and 10TB drives).

With the exception of the 16TB drives, every drive size had an increase in their AFR from 2021 to 2022. In the case of the small drives, the increase was pronounced, and at 2.12% is well above the 1.37% AFR for 2022 for all drives.

In addition, while the small drive cohort represents only 28.7% of the drive days in 2022, they account for 44.5% of the drive failures. Our smaller drives are failing more often, but they are also older, so let’s take a closer look at that.
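The two percentages above are enough to reproduce the small-drive AFR. Because AFR is failures divided by drive-years, a cohort's AFR equals the fleet AFR scaled by the cohort's share of failures over its share of drive days. The sketch below works through that; the function name is ours, and the large-drive result is an implication of the published shares rather than a figure quoted in the post.

```python
def cohort_afr(fleet_afr: float, failure_share: float, drive_day_share: float) -> float:
    """Scale the fleet AFR by a cohort's failure share relative to its drive-day share."""
    if not 0 < drive_day_share <= 1:
        raise ValueError("drive_day_share must be in (0, 1]")
    return fleet_afr * failure_share / drive_day_share


FLEET_AFR_2022 = 1.37  # percent, all drives, from the post

# Small drives: 44.5% of failures on 28.7% of drive days (from the post).
small_afr = cohort_afr(FLEET_AFR_2022, 0.445, 0.287)

# Large drives get the remaining shares (derived here, not quoted in the post).
large_afr = cohort_afr(FLEET_AFR_2022, 1 - 0.445, 1 - 0.287)

print(f"Small drives: {small_afr:.2f}% AFR")
print(f"Large drives: {large_afr:.2f}% AFR")
```

The small-drive result lands on the 2.12% cited above, so the shares and the cohort AFR are consistent with each other.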

Drive Age and Drive Failure

When examining the correlation of drive age to drive failure we should start with our previous look at the hard drive failure bathtub curve. There we concluded that drives generally fail more often as they age. To see if that matters here, we’ll start with the table below which shows the average age of each drive model of drives by size.

With the exception of the 8TB Seagate (model: ST8000NM000A), which we recently purchased as replacements for failed 8TB drives, the drives fall neatly into our two groups noted above—10TB and below and 12TB and up.

Now let’s group the individual drive models into cohorts defined by drive size. But before we do, we should remember that the 6TB and 10TB drive models have a relatively small number of drives and drive days in comparison to the remaining drive groups. In addition, the 6TB and 10TB drive cohorts consist of one drive model, while the other drive groups have at least four different drive models. Still, leaving them out seems incomplete, so we’ve included tables with and without the 6TB and 10TB drive cohorts.

Each table shows the relationship for each drive size, between the average age of the drives and their associated AFR. The chart on the right (V2) clearly shows that the older drives, when grouped by size, fail more often. This increase as a drive model ages follows the bathtub curve we spoke of earlier.

So, What Caused the Increase in Drive Failure and Does it Matter?

The aging of our fleet of hard drives does appear to be the most logical reason for the increased AFR in 2022. We could dig in further, but that is probably moot at this point. You see, we spent 2022 building out our presence in two new data centers, the Nautilus facility in Stockton, California and the CoreSite facility in Reston, Virginia. In 2023, our focus is expected to be on replacing our older drives with 16TB and larger hard drives. The 4TB drives and yes, even our O.G. 6TB Seagate drives could go. We’ll keep you posted.

Drive Failures by Manufacturer

We’ve looked at drive failure by drive age and drive size, so it’s only right to look at drive failure by manufacturer. Below we have plotted the quarterly AFR over the last three years by manufacturer.

Starting in Q1 of 2021 and continuing to the end of 2022, we can see that the rise in the overall AFR over that time seems to be driven by Seagate and, to a lesser degree, Toshiba, although HGST contributes heavily to the Q1 2022 rise. In the case of Seagate, this makes sense as most of our Seagate drives are significantly older than any of the other manufacturers’ drives.

Before you throw your Seagate and Toshiba drives in the trash, you might want to consider the lifecycle cost of a given hard drive model versus its failure rate. We looked at this in our Q3 2022 Drive Stats report, and outlined the trade-offs between drive cost and failure rates. For example, in general, Seagate drives are less expensive and their failure rates are typically higher in our environment. But, their failure rates are typically not high enough to make them less cost effective over their lifetime. You could make a good case that for us, many Seagate drive models are just as cost effective as more expensive drives. It helps that our B2 Cloud Storage platform is built with drive failure in mind, but we’ll admit that fewer drive failures is never a bad thing.

Lifetime Hard Drive Stats

The table below is the lifetime AFR of all the drive models in production as of December 31, 2022.

The current lifetime AFR is 1.39%, which is down from a year ago (1.40%) and also down from last quarter (1.41%). The lifetime AFR is less prone to rapid changes due to temporary fluctuations in drive failures and is a good indicator of a drive model’s AFR. But it takes a fair number of observations (in our case, drive days) to be confident in that number. To that end, the table below shows only those drive models which have accumulated one million drive days or more in their lifetime. We’ve ordered the list by drive days.

Finally, we are going to open up a bit here and share the results of the 388 drives we removed from our analysis because they were test drives or belonged to drive models with fewer than 60 drives. These drives are divided amongst 20 different drive models and the table below lists those drive models which were operational in our data centers as of December 31, 2022. Big caveat here: these are just test drives and so on, so be gentle. We usually ignore them in the reports, so this is their chance to shine, or not. We look forward to seeing your comments.

There are many reasons why these drives got to this point in their Backblaze career, but we’ll save those stories for another time. At this point, we’re just sharing to be forthright about the data, but there are certainly tales to be told. Stay tuned.

Our Annual Drive Stats Webinar

Join me on Tuesday, February 7 at 10 a.m. PT to review the results of the 2022 report. You’ll get a look behind the scenes at the data and the process we use to create the annual report.

Sign Up for the Webinar

The Hard Drive Stats Data

The complete data set used to create the tables and charts in this report is available on our Hard Drive Test Data page. You can download and use this data for free for your own purpose. All we ask are three things: 1) you cite Backblaze as the source if you use the data, 2) you accept that you are solely responsible for how you use the data, and 3) you do not sell this data itself to anyone; it is free.

If you just want the data used to create the tables and charts in this blog post you can download the ZIP file containing the CSV files for each chart.
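If you want to compute failure rates from the raw data yourself, a sketch along these lines may help. It assumes the daily snapshot layout, with one row per drive per day and `model` and `failure` columns; check the column names against the files you download. The sample rows and model names are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def afr_by_model(rows):
    """Compute AFR (%) per model from daily snapshot rows.

    Each row is assumed to represent one drive on one day, with a 'model'
    column and a 'failure' column that is '1' on the day a drive fails.
    """
    drive_days = defaultdict(int)
    failures = defaultdict(int)
    for row in rows:
        drive_days[row["model"]] += 1
        failures[row["model"]] += int(row["failure"])
    return {
        model: failures[model] / (days / 365) * 100
        for model, days in drive_days.items()
    }


# Tiny made-up sample in the assumed layout; real files are far larger.
SAMPLE = """date,serial_number,model,capacity_bytes,failure
2022-01-01,SN001,MODEL-A,8001563222016,0
2022-01-01,SN002,MODEL-A,8001563222016,0
2022-01-02,SN001,MODEL-A,8001563222016,0
2022-01-02,SN002,MODEL-A,8001563222016,1
2022-01-01,SN003,MODEL-B,16000900608000,0
"""

result = afr_by_model(csv.DictReader(io.StringIO(SAMPLE)))
for model, afr in sorted(result.items()):
    print(f"{model}: {afr:.1f}% AFR")
```

One failure in four drive days produces an absurdly high AFR for MODEL-A, which is a small-scale version of why the report only trusts models with plenty of drive days. To run this against a real file, pass `csv.DictReader(open(path, newline=""))` instead of the in-memory sample.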

Good luck and let us know if you find anything interesting.

Want More Insights?

Check out our take on Hard Drive Cost per Gigabyte and Hard Drive Life Expectancy.

Interested in the SSD Data?

Read the most recent SSD edition of our Drive Stats Report.

Simplify Data Protection with Backblaze and Commvault

Post Syndicated from Jennifer Newman original https://www.backblaze.com/blog/simplify-data-protection-with-backblaze-and-commvault/

The most effective backups are the ones you never have to think about. It’s that simple. For anyone in charge of data protection (IT Admins, IT Directors, CTOs and CIOs, managed service providers, and others), driving to that level of simplicity is always the goal. A new partnership between Backblaze and Commvault brings you one step closer to achieving that goal.

Now, Commvault customers can select Backblaze B2 as a cloud storage destination for their Commvault backups and data management needs. Read on to learn more about the partnership.

What Is Commvault?

Commvault is a global leader in data management. Their Intelligent Data Services help organizations transform how they protect, store, and use data. They offer a simple, unified Data Management Platform that spans all of a company’s data, no matter where it lives—on-premises, or in a hybrid or multi-cloud environment—or how it’s structured—in legacy applications, databases, virtual machines, or in containers.

How Does This Partnership Benefit Joint Customers?

Joint customers gain access to easy, affordable cloud storage that integrates with Commvault’s software. The partnership benefits joint customers in a few key ways:

  • Quick setup: Get started with a seamless integration.
  • Easy administration: Manage data in one platform.
  • Better backups: Protect your data from ransomware risks, equipment failure, damage, theft, and human error.
  • Faster recoveries: Restore your environment quickly in the event of a disaster.
  • Affordable storage: Backblaze is ⅕ the cost of major cloud providers.

Take Advantage of Capacity-Based Pricing with Backblaze B2 Reserve

Joint customers who prefer predictable cloud spend rather than consumption-based pricing can take advantage of Backblaze B2 Reserve. The Backblaze B2 Reserve offering is capacity-based, starting at 20TB, with key features, including:

  • Free egress up to the amount of storage purchased per month.
  • Free transaction calls.
  • Enhanced migration services.
  • No delete penalties.
  • Upgraded Tera support.

Customers can purchase B2 Reserve through our channel partners. If you’re interested in participating or just want to learn more, contact our Sales team.

If you’re a channel partner and Commvault is in your suite of offerings, we’d love to engage with you. Register on our Partner Portal to get started with offering Backblaze B2 as a backup target.

Customer Spotlight: How Pittsburg State Protects Data in Tornado Alley

Pittsburg State University, located in the heart of Tornado Alley in Kansas, took steps to protect their data by deploying private cloud infrastructure via Commvault Distributed Storage. They established two nodes on-premises and a third across the state for geographic separation, but they wanted another layer of protection. They added Backblaze B2 Cloud Storage giving them peace of mind that their data would be better protected from threats like ransomware. Since Backblaze is integrated with Commvault, Commvault de-duplicates the data, then sends a copy to Backblaze nightly.

“Backblaze B2 had the capability we lacked. I bolted it onto our system, so now I have off-site backup that is safe and well-protected from a regional disaster in Kansas.”
—Tim Pearson, Director for IT Infrastructure and Security, Pittsburg State University

Getting Started with Backblaze B2 and Commvault

Ready to simplify your Commvault backup storage? Check out our Commvault Quickstart Guide for a walkthrough on how to set up Backblaze B2 as your Commvault cloud storage target.

Build a Cloud Storage App in 30 Minutes

Post Syndicated from Pat Patterson original https://www.backblaze.com/blog/build-a-cloud-storage-app-in-30-minutes/

The working title for this developer tutorial was originally the “Polyglot Quickstart.” It made complete sense to me—it’s a “multilingual” guide that shows developers how to get started with Backblaze B2 using different programming languages—Java, Python, and the command line interface (CLI). But the folks on our publishing and technical documentation teams wisely advised against such an arcane moniker.

Editor’s Note

Full disclosure, I had to look up the word polyglot. Thanks, Merriam-Webster, for the assist.

Polyglot, adjective.
1a: speaking or writing several languages: multilingual
1b: composed of numerous linguistic groups; a polyglot population
2: containing matter in several languages; a polyglot sign
3: composed of elements from different languages
4: widely diverse (as in ethnic or cultural origins); a polyglot cuisine

Fortunately for you, readers, and you, Google algorithms, we landed on the much easier to understand Backblaze B2 Developer Quick-Start Guide, and we’re launching it today. Read on to learn all about it.

Start Building Applications on Backblaze B2 in 30 Minutes or Less

Yes, you heard that correctly. Whether or not you already have experience working with cloud object storage, this tutorial will get you started building applications that use Backblaze B2 Cloud Storage in 30 minutes or less. You’ll learn how scripts and applications can interact with Backblaze B2 via the AWS SDKs and CLI and the Backblaze S3-compatible API.

The tutorial covers how to:

  • Sign up for a Backblaze B2 account.
  • Create a public bucket, upload and view files, and create an application key using the Backblaze B2 web console.
  • Interact with the Backblaze B2 Storage Cloud using Java, Python, and the CLI: listing the contents of buckets, creating new buckets, and uploading files to buckets.
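To give a flavor of the Python portion, here is a rough sketch of those three operations using boto3, the AWS SDK for Python, against the Backblaze S3-compatible API. It is not taken from the guide itself: the function names are ours, and the region, bucket, and key values you pass in are placeholders for your own.

```python
def endpoint_for(region: str) -> str:
    """Build the S3-compatible endpoint URL for a Backblaze B2 region."""
    return f"https://s3.{region}.backblazeb2.com"


def make_client(region: str, key_id: str, application_key: str):
    """Create a boto3 S3 client pointed at Backblaze B2 (needs `pip install boto3`)."""
    import boto3  # imported lazily so endpoint_for works without boto3 installed

    return boto3.client(
        "s3",
        endpoint_url=endpoint_for(region),
        aws_access_key_id=key_id,
        aws_secret_access_key=application_key,
    )


def quickstart(client, bucket: str, local_path: str, key: str) -> list[str]:
    """List buckets, create one, upload a file, and return the keys in the new bucket."""
    for existing in client.list_buckets()["Buckets"]:
        print("Found bucket:", existing["Name"])
    client.create_bucket(Bucket=bucket)
    client.upload_file(local_path, bucket, key)
    listing = client.list_objects_v2(Bucket=bucket)
    return [obj["Key"] for obj in listing.get("Contents", [])]
```

A call might look like `quickstart(make_client("us-west-004", key_id, application_key), "my-new-bucket", "photo.jpg", "photo.jpg")`, where the region comes from your bucket's endpoint and the credentials come from the application key you created in the web console.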

This first release of the tutorial covers Java, Python, and the CLI. We’ll add more programming languages in the future. Right now we’re looking at JavaScript, C#, and Go. Let us know in the comments if there’s another language we should cover!

➔ Check Out the Guide

What Else Can You Do?

If you already have experience with Amazon S3, the Quick-Start Guide shows how to use the tools and techniques you already know with Backblaze B2. You’ll be able to quickly build new applications and modify existing ones to interact with the Backblaze Storage Cloud. If you’re new to cloud object storage, on the other hand, this is the ideal way to get started.

Watch this space for future tutorials on topics such as:

  • Downloading files from a private bucket programmatically.
  • Uploading large files by splitting them into chunks.
  • Creating pre-signed URLs so that users can access private files securely.
  • Deleting versions, files and buckets.
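As a taste of the large-file topic, the sketch below plans how a file could be split into parts before upload. It is only an illustration, not the method the future tutorial will use: the function name and the 100 MiB default are our choices, and the 5 MiB floor comes from the S3 multipart upload rules, so check the Backblaze B2 documentation for its current limits.

```python
MIN_PART_SIZE = 5 * 1024 * 1024  # 5 MiB floor for every part except the last


def plan_parts(file_size: int, part_size: int = 100 * 1024 * 1024):
    """Return (part_number, offset, length) tuples covering file_size bytes."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    parts = []
    offset = 0
    part_number = 1  # S3 part numbers start at 1
    while offset < file_size:
        length = min(part_size, file_size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts


# A 250 MiB file splits into two full 100 MiB parts and one 50 MiB remainder.
plan = plan_parts(250 * 1024 * 1024)
for number, offset, length in plan:
    print(f"part {number}: bytes {offset} to {offset + length - 1}")
```

Each tuple tells an uploader where to seek in the file and how many bytes to read for that part. Only the final part is allowed to be smaller than the minimum.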

Want More?

Have questions about any of the above? Curious about how to use Backblaze B2 with your specific application? Already a wiz at this and ready to do more? Here’s how you can get in touch and get involved:

  • Sign up for Backblaze’s virtual user group.
  • Find us at Developer Week.
  • Let us know in the comments which programming languages we should add to the Quick-Start Guide.

Ransomware Takeaways Q4 2022

Post Syndicated from Jeremy Milk original https://www.backblaze.com/blog/ransomware-takeaways-q4-2022/

It may seem like ransomware is not in the news as much as it was in 2021 and the first part of 2022. Back then, major attacks and record-breaking ransom demands dominated headlines, and governments took action to make life more difficult for cybercriminals. But the spotlight is never a good place to be when you’re trying to defraud companies to the tune of millions of dollars. So, while you might be hearing about it less, that doesn’t mean that the threat of cybercrime is negligible. Exactly the opposite—the lack of media attention makes potential victims lower their guard, leaving vulnerabilities that cybercriminals love to exploit.

Staying up-to-date on the latest ransomware news keeps you informed of potential threats. And, keeping the latest threats fresh in your mind means you’ll be ready if and when cybercriminals turn their sights in your direction. We all hope that never happens, but it’s wise to be prepared in case it does. To arm you with the latest, here are some of the biggest developments in ransomware that we observed in Q4 2022.

This post is a part of our ongoing series on ransomware. Take a look at our other posts for more information on how businesses can defend themselves against a ransomware attack, and more.

And, don’t forget that we offer a thorough walkthrough of ways to prepare yourself and your business for ransomware attacks—free to download below.

➔ Download The Complete Guide to Ransomware

1. Many Ransomware Attacks Go Unreported in the Media

One possible reason you don’t hear about ransomware attacks is that they simply don’t get reported in the news. A study released in late 2022 by Jumpsec found that 86% of ransomware attacks go unreported in typical media sources in the UK. The attacks that do get covered are typically ones where the victims are legally required to disclose the attacks due to personally identifiable information (PII) being compromised. While public disclosure is uncommon, keep in mind that reporting requirements (that is, the legal requirement to disclose to the authorities) in the UK, U.S., and elsewhere are becoming more stringent. For example, in 2022, President Biden signed a bill into law that requires operators of critical infrastructure to disclose cyberattacks to the government within 72 hours.

Key Takeaway

It may seem like there’s no real incentive to disclose a cyberattack publicly. Why serve the greater good at the expense of your reputation, right? But, some organizations have found that being open and honest positions them ahead of the game. Chip Daniels, head of government affairs at SolarWinds, shared the positive response the company has received about their transparency: “I meet with somebody for the first time, they’ll say, ‘I just want to tell you, you guys are the gold standard on how you should respond to a cyber incident.’” Being seen as the “gold standard” isn’t a bad place to land after an attack.

2. Hospitals and Schools Continued to Be Targeted

Sadly, it’s not the first time we’ve reported on the threat to hospitals and schools. It was highlighted in our very first Ransomware Takeaways report. In Q4 2022, cybercriminals showed no sign of letting up as CommonSpirit Health, a Chicago-based health provider with more than 700 care sites and 142 hospitals in 21 states, suffered a major attack that made patient records vulnerable. And earlier in the year, over Labor Day weekend, one of the largest school districts in the country, the Los Angeles Unified School District, was attacked as well.

Key Takeaway

Nonprofit and public sector institutions need budget-friendly options for implementing ransomware protection that work with their existing purchasing programs. Through government IT aggregators like Carahsoft, public sector decision makers can purchase affordable, capacity-based cloud storage to support their recovery objectives.

3. Ransomware Attacks Take a Psychological Toll

In news that should come as a surprise to no one who’s been through a ransomware incident, cyberattacks take a psychological toll, and new research from cybersecurity company Northwave released in Q4 2022 quantifies it. They measured the mental impacts of ransomware attacks at three points in time: within the first week, month, and year after an attack. At a month out, 75% reported having negative thoughts, and at one year, 14% reported symptoms of trauma requiring professional help.

Key Takeaway

Companies involved in a ransomware attack can take action to minimize negative effects on employees’ mental health. Northwave recommends having regular check-ins and breaks during the first phase, making space for rest and recovery time in the second phase, and creating an open environment in the third phase, where employees can talk about what happened and decompress.

4. Some Ransomware Is Badly Made, and All the More Dangerous

Researchers analyzed the Cryptonite ransomware strain, which first appeared in October 2022, and found that its “barebones” functionality makes it even more of a threat—there’s no way to recover encrypted files. Researchers point out that it’s likely not an intentional feature, but simply poor design.

Key Takeaway

Since the software is broken to the point where decryption is impossible, there’s absolutely no reason to pay the ransom if you fall victim to a Cryptonite attack. Instead, it makes sense to spend some time creating a disaster recovery plan so you can resume normal business operations as soon as possible. Researchers also report that phishing seems to be the most common attack vector for this ransomware strain, so it’s a good idea to ramp up your cybersecurity training.

5. A Vast Majority of Ransomware Attacks Attempted to Infect Backups

In November, Veeam released their 2022 Ransomware Trends report, a study of more than 3,000 organizations across 28 countries. Among their key findings: 95% of ransomware attacks attempted to infect backups. Of those attacks that targeted backups, 38% of respondents had some backup repositories impacted, and 30% had all of their backup repositories impacted.

Key Takeaway

One word: immutability. Protecting backups with Object Lock costs nothing to implement and prevents backups from being modified or encrypted by ransomware. With backups that can’t be altered, recoveries are much easier and more reliable.
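As a rough illustration of what immutability looks like in practice, here is a minimal Python sketch that builds the parameters for an S3-style upload with a retention period. The bucket name, object key, and 30-day retention are hypothetical, and the sketch assumes a bucket that already has Object Lock enabled. `ObjectLockMode` and `ObjectLockRetainUntilDate` are the parameter names used by boto3's `put_object` call.

```python
from datetime import datetime, timedelta, timezone


def locked_upload_params(bucket, key, body, retain_days, now=None):
    """Build keyword arguments for an S3-compatible put_object call
    that asks the service to keep the object immutable for retain_days."""
    now = now or datetime.now(timezone.utc)
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        # COMPLIANCE mode: the retention period cannot be shortened.
        "ObjectLockMode": "COMPLIANCE",
        "ObjectLockRetainUntilDate": now + timedelta(days=retain_days),
    }


# Hypothetical bucket and key, for illustration only.
params = locked_upload_params(
    "acme-backups",
    "db/2022-12-31.dump",
    b"backup bytes go here",
    retain_days=30,
    now=datetime(2022, 12, 31, tzinfo=timezone.utc),
)
print(params["ObjectLockRetainUntilDate"].isoformat())

# With boto3 installed and credentials configured, the upload itself
# would look something like this (endpoint shown is an example region):
#   s3 = boto3.client("s3", endpoint_url="https://s3.us-west-001.backblazeb2.com")
#   s3.put_object(**params)
```

Until the retention date passes, attempts to overwrite or delete that object version should be rejected, which is what keeps ransomware from encrypting the backup in place.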

Closing Thoughts

While you may not be hearing about as many high profile ransomware attacks as you once were, make no mistake that they’re still happening. Just know that there are steps you can take to keep your company from becoming the next victim, including protecting data with Object Lock, applying security best practices, and creating a disaster recovery plan.

The post Ransomware Takeaways Q4 2022 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

How to Serve Data From a Private Bucket with a Cloudflare Worker

Post Syndicated from Pat Patterson original https://www.backblaze.com/blog/how-to-serve-data-from-a-private-bucket-with-a-cloudflare-worker/

Customers storing data in Backblaze B2 Cloud Storage enjoy zero-cost downloads via our Content Delivery Network (CDN) partners: Cloudflare, Fastly, and Bunny.net. Configuring a CDN to proxy access to a Backblaze B2 Bucket is straightforward and improves the user experience, since the CDN caches data close to end-users. Ensuring that end-users can only access content via the CDN, and not directly from the bucket, requires a little more effort. A new technical article, Cloudflare Workers for Backblaze B2, provides the steps to serve content from Backblaze B2 via your own Cloudflare Worker.

In this blog post, I’ll explain why you might want to prevent direct downloads from your Backblaze B2 Bucket, and how you can use a Cloudflare Worker to do so.

Why Prevent Direct Downloads?

As mentioned above, Backblaze’s partnerships with CDN providers allow our customers to deliver content to end users with zero costs for data egress from Backblaze to the CDN. To illustrate why you might want to serve data to your end users exclusively through the CDN, let’s imagine you’re creating a website and storing its images in a Backblaze B2 Bucket named acme-images, with public-read access.

For the initial version, you build web pages with direct links to the images of the form https://acme-images.s3.us-west-001.backblazeb2.com/logos/acme.png. As users browse your site, their browsers will download images directly from Backblaze B2. Everything works just fine for users near the Backblaze data center hosting your bucket, but the further a user is from that data center, the longer it will take each image to appear on screen. No matter how fast the network connection, there’s no getting around the speed of light!

Aside from the degraded user experience, there are costs associated with end users downloading data directly from Backblaze. The first GB of data downloaded each day is free, then we charge $0.01 for each subsequent GB. Depending on your provider’s pricing plan, adding a CDN to your architecture can both reduce download costs and improve the user experience, as the CDN will transfer data through its own network and cache content close to end users. Another detail to note when comparing costs is that Backblaze and Cloudflare’s Bandwidth Alliance means that data flows from Backblaze to Cloudflare free of download charges, unlike data flowing from, for example, Amazon S3 to Cloudflare.
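To make the pricing concrete, here is a small Python sketch of the direct-download charge described above: the first GB each day is free, then $0.01 per GB. The traffic figure of 500GB per day is a made-up example, and the function names are illustrative rather than part of any Backblaze tool.

```python
def daily_download_cost(gb_downloaded, free_gb=1.0, price_per_gb=0.01):
    """Cost in dollars of one day's direct downloads from the bucket."""
    billable_gb = max(0.0, gb_downloaded - free_gb)
    return round(billable_gb * price_per_gb, 2)


def monthly_download_cost(gb_per_day, days=30):
    """Cost in dollars for a month of steady daily traffic."""
    return round(daily_download_cost(gb_per_day) * days, 2)


# Hypothetical site serving 500GB of images per day straight from the bucket.
print(daily_download_cost(500))
print(monthly_download_cost(500))

# Anything at or under the free daily allowance costs nothing.
print(daily_download_cost(0.5))
```

If the same traffic flows through Cloudflare instead, the Backblaze-to-CDN leg carries no download charge, so the figure above is roughly what direct links would cost compared with the CDN route.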

Typically, you need to set up a custom domain, say images.acme.com, that resolves to an IP address at the CDN. You then configure one or more origin servers or backends at the CDN with your Backblaze B2 Buckets’ S3 endpoints. In this example, we’ll use a single bucket, with endpoint acme-images.s3.us-west-001.backblazeb2.com, but you might use Cloud Replication to replicate content between buckets in multiple regions for greater resilience.

Now, after you update the image links in your web pages to the form https://images.acme.com/logos/acme.png, your users will enjoy an improved experience, and your operating costs will be reduced.

As you might have guessed, however, there is one chink in the armor. Clients can still download images directly from the Backblaze B2 Bucket, incurring charges on your Backblaze account. For example, users might have bookmarked or shared links to images in the bucket, or browsers or web crawlers might have cached those links.

The solution is to make the bucket private and create an edge function: a small piece of code running on the CDN infrastructure at the images.acme.com endpoint, with the ability to securely access the bucket.

Both Cloudflare and Fastly offer edge computing platforms; in this blog post, I’ll focus on Cloudflare Workers and cover Fastly Compute@Edge at a later date.

Proxying Backblaze B2 Downloads With a Cloudflare Worker

The blog post Use a Cloudflare Worker to Send Notifications on Backblaze B2 Events provides a brief introduction to Cloudflare Workers; here I’ll focus on how the Worker accesses the Backblaze B2 Bucket.

API clients, such as Workers, downloading data from a private Backblaze B2 Bucket via the Backblaze S3 Compatible API must digitally sign each request with a Backblaze Application Key ID (access key ID in AWS parlance) and Application Key (secret access key). On receiving a signed request, the Backblaze B2 service verifies the identity of the sender (authentication) and that the request was not changed in transit (integrity) before returning the requested data.

So when the Worker receives an unsigned HTTP request from an end user’s browser, it must sign it, forward it to Backblaze B2, and return the response to the browser. Here are the steps in more detail:

  1. A user views a web page in their browser.
  2. The user’s browser requests an image from the Cloudflare Worker.
  3. The Worker makes a copy of the incoming request, changing the target host in the copy to the bucket endpoint, and signs the copy with its application key and key ID.
  4. The Worker sends the signed request to Backblaze B2.
  5. Backblaze B2 validates the signature, and processes the request.
  6. Backblaze B2 returns the image to the Worker.
  7. The Worker forwards the image to the user’s browser.

These steps are illustrated in the diagram below.

The signing process imposes minimal overhead, since GET requests have no payload. The Worker need not even read the incoming response payload into memory, instead returning the response from Backblaze B2 to the Cloudflare Workers framework to be streamed directly to the user’s browser.
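To show what "signing the copy" involves, here is a minimal Python sketch of signing a GET request, assuming AWS Signature Version 4, the scheme S3-compatible APIs expect. It is not the Worker's actual code (a Worker runs JavaScript). The key ID and application key are placeholders, and the function name is hypothetical.

```python
import hashlib
import hmac
from datetime import datetime, timezone
from urllib.parse import quote

# SHA-256 of an empty body: GET requests carry no payload.
EMPTY_PAYLOAD_HASH = hashlib.sha256(b"").hexdigest()


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def sign_get_request(host, path, region, key_id, application_key, now):
    """Return the headers for a SigV4-signed GET of https://{host}{path}.

    `now` must be a UTC datetime.
    """
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = amz_date[:8]
    headers = {
        "host": host,
        "x-amz-content-sha256": EMPTY_PAYLOAD_HASH,
        "x-amz-date": amz_date,
    }
    signed_headers = ";".join(sorted(headers))
    canonical_headers = "".join(
        f"{name}:{headers[name]}\n" for name in sorted(headers)
    )
    canonical_request = "\n".join([
        "GET",
        quote(path, safe="/-_.~"),
        "",  # no query string in this sketch
        canonical_headers,
        signed_headers,
        EMPTY_PAYLOAD_HASH,
    ])
    scope = f"{date_stamp}/{region}/s3/aws4_request"
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])
    # Derive the signing key from the secret, one HMAC step per scope part.
    key = ("AWS4" + application_key).encode("utf-8")
    for part in (date_stamp, region, "s3", "aws4_request"):
        key = _hmac_sha256(key, part)
    signature = hmac.new(
        key, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    headers["authorization"] = (
        f"AWS4-HMAC-SHA256 Credential={key_id}/{scope}, "
        f"SignedHeaders={signed_headers}, Signature={signature}"
    )
    return headers


# Placeholder credentials; the host is the example bucket endpoint from this post.
signed = sign_get_request(
    "acme-images.s3.us-west-001.backblazeb2.com",
    "/logos/acme.png",
    "us-west-001",
    "EXAMPLE_KEY_ID",
    "EXAMPLE_APPLICATION_KEY",
    datetime(2023, 1, 1, tzinfo=timezone.utc),
)
print(signed["authorization"])
```

Only the headers and an empty-payload hash feed the signature, which is why signing a GET adds so little work: nothing in the response body needs to be read or hashed.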

Now that you understand the use case, head over to our newly published technical article, Cloudflare Workers for Backblaze B2, and follow the steps to serve content from Backblaze B2 via your own Cloudflare Worker.

Put the Proxy to Work!

The Cloudflare Worker for Backblaze B2 can be used as-is to ensure that clients download files from one or more Backblaze B2 Buckets via Cloudflare, rather than directly from Backblaze B2. At the same time, it can be readily adapted for different requirements. For example, the Worker could verify that clients pass a shared secret in an HTTP header, or route requests to buckets in different data centers depending on the location of the edge server. The possibilities are endless.
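The two adaptations mentioned above can be sketched in a few lines. This is Python for illustration only, not Worker code. The header name `x-shared-secret`, the continent codes, and the second bucket endpoint are all hypothetical.

```python
import hmac

# Hypothetical mapping from the edge server's continent to a bucket endpoint.
ENDPOINTS = {
    "NA": "acme-images.s3.us-west-001.backblazeb2.com",
    "EU": "acme-images-eu.s3.eu-central-003.backblazeb2.com",
}
DEFAULT_ENDPOINT = ENDPOINTS["NA"]


def has_valid_secret(headers, expected_secret, header_name="x-shared-secret"):
    """Check the client's shared secret with a constant-time comparison.

    `headers` is assumed to be a dict with lower-cased header names.
    """
    supplied = headers.get(header_name, "")
    return hmac.compare_digest(
        supplied.encode("utf-8"), expected_secret.encode("utf-8")
    )


def pick_endpoint(continent):
    """Route to the bucket nearest the edge server, falling back to a default."""
    return ENDPOINTS.get(continent, DEFAULT_ENDPOINT)


request_headers = {"x-shared-secret": "s3cr3t"}
print(has_valid_secret(request_headers, "s3cr3t"))
print(has_valid_secret({}, "s3cr3t"))
print(pick_endpoint("EU"))
print(pick_endpoint("AS"))
```

The constant-time comparison avoids leaking how many leading characters of the secret a client guessed correctly.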

How will you put the Cloudflare Worker for Backblaze B2 to work? Sign up for a Backblaze B2 account and get started!

The post How to Serve Data From a Private Bucket with a Cloudflare Worker appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Stable Diffusion and Backblaze: Create a Masterpiece from a Bucket of Your Own Images

Post Syndicated from Troy Liljedahl original https://www.backblaze.com/blog/stable-diffusion-and-backblaze-create-a-masterpiece-from-a-bucket-of-your-own-images/

AI is really having a moment. There’s DALL-E, Lensa, ChatGPT. Your social media feed is probably full of new avatars and AI-generated haiku. Naturally, we at Backblaze were intrigued by this brave new world of AI-generated content. The technology has been wildly popular, but is not without controversy, raising questions about intellectual property, copyright law, artist disenfranchisement, possible displacement of jobs, and general fear over the rise of the machines. On the other side of that coin, there’s definitely a place for AI in the future of work and life. So, I wanted to experiment with it.

Let’s start with Stable Diffusion.

Stable Diffusion is one of the new text-to-image technologies popping up all over the internet that allows users to input words and phrases and get back amazing pictures created by its deep learning model. What makes Stable Diffusion so interesting is that it has been open sourced to allow anyone to create their own models for text-to-image generation.

Today, I’ll walk through how you can do just that using Backblaze B2 Cloud Storage.

Kicking the Stable Diffusion Tires

After playing with an online instance of Stable Diffusion, I sought out content on some more ways to use the AI tool. I found several examples of how to use Stable Diffusion with your own images like this one and this one. The most common use case for this was taking advantage of AI to create art from a model based on your own face. Sounds cool, right? But what if I also had a bunch more pictures in Backblaze B2 Cloud Storage? Could I do the same thing to create art, graphics, branded images, and more, from my content in the cloud? The answer is a resounding YES.

Use Cases for Stable Diffusion

For me, this was a fun experiment, but we see a number of different ways this setup could be used by both individuals and businesses. I started with about 20 images or so as fodder for Stable Diffusion’s algorithm. But, that’s just the beginning.

Let’s say you’re a marketing team at a small company. You could use Stable Diffusion’s paid version and get access to hundreds of thousands of random images from Google, but you really only care about analyzing and generating photos that are relevant to your business. So, you run Stable Diffusion in a cloud compute instance and have it analyze a Backblaze B2 Bucket where you store your own library of images, which you’ve probably been collecting for years. Set up that way, you have your own customized AI engine that analyzes and generates only images that are pertinent to your needs, rather than a bunch of images you don’t care about.

In this experiment, I used Google Colab, which worked well for my needs. But for a real implementation, you could use a Backblaze cloud compute partner like Vultr. Egress between Backblaze and Vultr is free, so the analysis won’t cost you anything beyond what it costs to use the two services.

This could be hugely useful for marketing teams, but we also see the value for individuals or businesses who want to keep their data private but still take advantage of AI technology. This way, you aren’t serving up images on public sites.

So, how does it all work? Let’s get into it.

Getting Started with Stable Diffusion and Backblaze B2

What you’ll need:

  • A Backblaze B2 account. You can sign up for free here.
  • A Google account.
  • A smartphone to take pictures if you don’t have 20 or so pictures of whatever subject you want to use lying around.
  • Whatever software tool you’d like to use to mount Backblaze B2 as a drive on your computer. I use Rclone in this example but any cloud drive software will work.

The first thing you’ll need to do is create an account at Hugging Face. Hugging Face is the home of the modern AI community and is where Stable Diffusion lives. In your Hugging Face account, navigate to your Account Settings and go to Access Tokens—we’ll need one of these to allow our environment to use the Stable Diffusion engines.

Now as to the environment, this can be on your own computer, in a virtual machine (VM), really wherever. My favorite (and free) method I found was a Google Colab notebook created by GitHub user TheLastBen that makes the process so incredibly simple that anyone can do this. The Colab notebook also takes advantage of DreamBooth, a Google Research project that provides for incredible detail on the art and images created by a diffusion model. In short, this is the easiest way to get really good looking AI art. You can get started with the Colab notebook here.

In the Colab notebook there are a ton of different options and a great step-by-step guide that explains them, but I’ll walk you through the basic settings to get going:

  1. First, hit the Play button next to Dependencies.
  2. Once that’s done, copy your User Access Token from your Hugging Face account.
  3. In the Model Download section, paste that User Access Token into the Huggingface_Token field.
  4. Click the Play button for Model Download.
  5. You’ll see the script run below all the fields here. You can proceed when you see “DONE!”
  6. Finally, in the Dreambooth section, provide a name in the Session_Name field. This will be the name of the session that gets saved in your Google Drive. That name can be reused later to skip these steps next time.

Training the Stable Diffusion Model

Now the pictures: You’ll want at least 20 pictures or so for your AI model to analyze in order to avoid creating a bunch of generic person art or nightmare fuel. So bust out your phone and take some selfies! If you have a friend who can take two or three full-body pictures of you, that will help as well. A few optional tips:

  • Use different expressions and angles.
  • Use different backgrounds if you can.
  • Use a square or 1:1 ratio setting. Stable Diffusion’s default image size is 512 x 512 pixels, so using square images makes your input more similar to your desired output.
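If your photos aren't already square, you can crop them before uploading. The Python sketch below only computes the crop box for a centered square. The function name is illustrative, and the 4032 x 3024 dimensions are just an example of a typical phone photo.

```python
def center_square_box(width, height):
    """Return (left, upper, right, lower) for the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


# A landscape phone photo, for example.
box = center_square_box(4032, 3024)
print(box)

# With the Pillow library installed, the box could be applied like this:
#   from PIL import Image
#   Image.open("photo.jpg").crop(box).resize((512, 512)).save("square.jpg")
```

Cropping to a square first means the later resize to 512 x 512 doesn't stretch the subject.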

If you’re an iPhone user, you will need to take one extra step here to save your files in JPEG format. You can find a guide for that in this article.

As you save your photos, make sure the file names include the name you’re going to use when generating your AI art. For example, my photos were all named troy (1).jpg, troy (2).jpg, troy (3).jpg, etc. This is important so that the AI understands what to call you (or your subject) when generating your images.
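Renaming 20 files by hand gets tedious, so here is a small Python sketch that plans the renames using the naming pattern above. The input file names are hypothetical camera names, and the function only returns the old and new names, so nothing on disk is touched until you apply the plan yourself.

```python
from pathlib import PurePath


def rename_plan(filenames, subject):
    """Pair each file with a new name like 'troy (1).jpg', in sorted order."""
    plan = []
    for index, name in enumerate(sorted(filenames), start=1):
        suffix = PurePath(name).suffix.lower()
        plan.append((name, f"{subject} ({index}){suffix}"))
    return plan


# Hypothetical camera file names.
for old, new in rename_plan(["IMG_0003.JPG", "IMG_0001.jpg", "IMG_0002.jpg"], "troy"):
    print(old, "->", new)

# To apply the plan inside a folder, something like this would work:
#   from pathlib import Path
#   for old, new in plan:
#       Path(folder, old).rename(Path(folder, new))
```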

Once you have your photos, it’s time to upload them to a Bucket in Backblaze B2 Cloud Storage. You can easily do this in the Backblaze mobile app or on the Backblaze website.

With your selfies safely in Backblaze B2, make them accessible on your computer using a tool such as Rclone mount. If you haven’t set it up yet, you can check out our guide on how to set up and configure Rclone mount.

You might be wondering why you should upload the photos to a Backblaze B2 Bucket and then mount the Bucket so that you can access it locally, rather than just saving the files to a local folder.

The answer is simple. In this example, we’re working with a few images representing a single subject, so you likely won’t have issues working from your local drive. As you further experiment with more subjects and more images of each subject, you’ll likely outgrow your local drive. Backblaze B2 Cloud Storage scales infinitely so you won’t have to worry about running out of space.

Now, back to the Colab notebook, hit the play button on Instance Images and click the button that shows up to Choose Files. In the pop up, choose your mounted instance of your B2 Bucket and select the photos.

Once they are uploaded, skip the Concept Images section and click the play button for Training. If you’ve done everything right, you should see some ASCII art like this:

Depending on how many photos you selected, this can take some time. So grab a coffee, go for a walk, listen to a podcast, or perhaps all three.

Creating Your Own AI-Generated Masterpieces

Once complete, click the Play button under Test the Trained Model. This will launch a temporary instance of Stable Diffusion with your new custom model in Gradio, which is an open-source Python library for running machine learning apps. Click the Gradio link generated and we’re ready to start making some AI art.

Again, there are a ton of options and configurations but all you really need to do at this point is enter some text into the Prompt box and click the big Generate button.

Creating prompts for AI art is quickly becoming its own art form. There are tons of resources out there to inspire you, but here are a few prompts I used along with the resulting art.

Pro Tip: You may need to click the Generate button a few times if something looks off. This is totally normal: each generation starts from different random noise, so the results vary from one click to the next.

Prompt: “Photo of troy digital painting”

Prompt: “Photo of troy person digital painting”

Prompt: “Photo of troy person digital painting asymmetrical headshot smiling”

And finally for something fun…

Prompt: “photo of troy person hand-drawn cartoon”

It even has an artist signature! Although I’m not sure who fRny Y is?

So, there you have it. Your very own AI engine, customized to generate versions of your face (or your library of images).

Good luck to all the budding AI artists out there. If you give this a try, we’d love to see your images on social media. You can find us @backblaze on Twitter, Facebook, and LinkedIn. I look forward to seeing what you all create!

The post Stable Diffusion and Backblaze: Create a Masterpiece from a Bucket of Your Own Images appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Amazon Sunsets Cloud Drive

Post Syndicated from Stephanie Doyle original https://www.backblaze.com/blog/amazon-sunsets-cloud-drive/

Another one bites the dust. Amazon announced they’re putting Amazon Cloud Drive in the rearview to focus on Amazon Photos in a phased deprecation through December 2023. Today, we’ll dig into what this means for folks with data on Amazon Cloud Drive, especially those with files other than photos and videos.

Dear Amazon Drive User

When Amazon dropped the news, they explained the phased approach they would take to deprecating Amazon Drive. They’re not totally eliminating Drive—yet. Here’s what they’ve done so far, and what they plan to do moving forward:

  • October 31, 2022: Amazon removed the Drive app from iOS and Android app stores. The app doesn’t get bug fixes and security updates anymore.
  • January 31, 2023: Uploading to the Amazon Drive website will be cut off. You will have read-only access to your files.
  • December 31, 2023: Amazon Drive will no longer be supported and access to files will be cut off. Every file stored on Amazon Drive, except photo or video files, needs a new home. Users can access photo and video files on Amazon Photos.

Now, users face two options for what to do with files stored on Amazon Drive:

  1. Follow instructions to download Amazon Photos for iOS and Android devices. And, use the Amazon Drive website to download and store all other files locally or with another service.
  2. Transfer your entire library of photos, videos, and other data to another service.

Looking for an Amazon Cloud Drive Alternative?

Shameless plug: If you used Amazon Cloud Drive to store anything other than photos and you need a new place to keep your data, give Backblaze B2 Cloud Storage a try. The first 10GB are free, and our storage is priced at a flat rate of $5/TB/month ($0.005/GB/month) after that. And if you’re a business customer, we also offer the choice of capacity-based pricing with Backblaze B2 Reserve.

A Quick History of Amazon Cloud Drive

In 2014, Amazon offered free, unlimited photo storage on Amazon Cloud Drive as a loyalty perk for Prime members. The following year, they rolled out a subscription-based offering to store other types of files in addition to photos—video, documents, etc.—on Cloud Drive.

Then, in 2017, they capped the free tier at 5GB. This was just one of many in a string of cloud storage providers ending a free offering and forcing users to pay or move.

All Amazon account holders—regardless of whether they paid for Prime or not—got 5GB for photos and other file types free of charge. If you wanted or needed more storage than that, you had to sign up for the subscription-based offering starting at $11.99 per year for 100GB of storage, and prices went up from there.

You might consider this the beginning of the end for Amazon Cloud Drive.

Why Say Goodbye?

When tech companies deprecate a feature—as Amazon has done with Drive—it’s for any number of reasons:

  1. To combine one feature with another.
  2. To rectify naming inconsistencies.
  3. When a newer version makes supporting the older one impossible or impractical.
  4. To avoid flaws in a necessary feature.
  5. When a better alternative replaced the feature.
  6. To simplify the system as a whole.

Amazon’s reason for deprecating Drive? To provide a dedicated solution for photos and videos. The company stated, “We are taking the opportunity to more fully focus our efforts on Amazon Photos to provide customers a dedicated solution for photos and video storage.” Unfortunately, that leaves folks who store anything else high and dry.

Where Do We Go From Here?

The bottom line: Amazon Drive customers must park emails, documents, spreadsheets, PDFs, and text files somewhere else. If you’re an Amazon Drive customer looking to move your files out before you lose access, we invite you to try Backblaze B2. The first 10GB is on us.

How to Get Started with Backblaze B2

  1. If you’re not a customer, first sign up for B2 Cloud Storage.
  2. If you’re already a customer, enable B2 Cloud Storage in your “My Settings” tab. You can follow our Quick Start Guide for more detailed instructions.
  3. Download your data from Amazon Drive.
  4. Upload your data to Backblaze B2. Many customers choose to do so directly through the web interface, while others prefer to use integrated transfer solutions like Cyberduck, which is free and open-source, or Panic’s Transmit for Macs.
  5. Sit back and relax knowing your data is safely stored in the Backblaze B2 Storage Cloud.

The post Amazon Sunsets Cloud Drive appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

A Behind the Scenes Look at Our US East Data Center

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/a-behind-the-scenes-look-at-our-us-east-data-center/

In the last couple of years, Backblaze has taken residence in three new data center facilities. In the West, we added the CyrusOne facility in Chandler, Arizona and the Nautilus Data Center in Stockton, California. In the East, we just added the CoreSite facility in Reston, Virginia as the anchor for our new U.S. East data region. Each of these data centers will house over an exabyte of customer data. A number like an exabyte of storage is nice, and we’ll share a bunch more numbers as we go. But what I really wanted to share are the stories behind the scenes as we moved into the CoreSite facility in Reston, which we call IAD 1.

The process of turning a big empty room into a data center location is complex and requires intense coordination, an adaptable project plan, and folks who can think on their feet. Let’s face it, no project is perfect, and building out a data center is no different. But, once in a while, it’s fun to peek behind the curtain to see how the actual work gets done. Let’s take a look.

Big Boxes, Tight Spaces, and Construction

There were over 700 boxes of various shapes and sizes to be received at the IAD site. This included server cabinets, storage servers, support servers, networking equipment, and other odds and ends. The cabinet and storage server boxes were particularly large, and only four of these boxes at a time could fit into the service elevator used to move everything from the loading dock to the Backblaze facility on the fourth floor.

Cabinets and servers are normally received almost daily, but the IAD data center was undergoing construction which limited where and how trucks could unload their cargo. One poor fellow spent the better part of three hours trying to manipulate his semi into the loading dock. The construction also limited the amount of cargo that could stay in the dock area to basically zero. When a truck arrived and started unloading Backblaze goods, our process was: load four boxes, go up four floors, unload four boxes, go down four floors, and repeat until the truck was empty.

Each time a truck arrived, the race was on, and there were a lot of trucks as everything was scheduled to arrive within a 30-day window. Why 30 days? We wanted to install the cabinets all at once, so we could run the networking once and not have to piecemeal it in. To speed things up, we enlisted the CoreSite staff to assemble and install the cabinets, while we ran the networking and installed the servers and other gear.

Taking the Long Way Home

The Backblaze presence is spread across two buildings. One building houses the data center itself and includes the maintenance area. The Backblaze office is located in another building. It’s not ideal, but sometimes things work out that way. But, what it means is this: If you don’t know where you are going you’re either not supposed to be there or you are new. Data centers are not known for having a lot of “you are here” signs, nor are there a lot of folks around to ask if you are lost.

Being new can turn a five-minute walk from the office to the data center floor into a 30-minute, expletive-laden stroll through unmarked halls and dead-end corridors, complete with visions of serial killers behind every door. Lucky for us, CoreSite does background checks on all their employees.

Let’s Talk Boxes

While Backblaze was responsible for getting the boxes to the new space and unboxing the contents, CoreSite helped out by providing us with some temporary storage space as we built out our facility. Given our aggressive schedule, things occasionally got messy (as seen below).

Shortly after this photo was taken, we learned the site had “the crusher”, which takes boxes or garbage or whatever and, well, crushes such things into dumpster-suitable or recycling-suitable packages. While not as fun as a Megabot, the crusher ensured that we didn’t lose any employees under an avalanche of boxes.

How many boxes? Well, there were 126 boxes containing cabinets, one per box. Each cabinet was assembled and installed by the CoreSite folks. There were hundreds of smaller boxes containing networking servers, conduits, networking cables, and, of course, thousands of various types of cable ties used by the Backblaze cabling ninjas, as seen below.

The cabinets are 52U tall, and 120 of them will be used to house 1,440 storage servers which will make up 72 Backblaze Vaults. Each vault consists of twenty storage servers. Each storage server has 60 16TB drives, which totals 960TB of raw storage per server. Doing the math, the IAD data center will have over 1.3EB of raw storage. Subtracting formatting and parity, the capacity is still over 1EB. Of course, over time we expect to use larger hard drives in all of our data centers as the cost per gigabyte for hard drives continues to decrease.
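For anyone who wants to check the math, here it is spelled out as a short Python sketch. All inputs come from this post. The only assumptions are decimal units (1EB = 1,000,000TB) and the constant names, which are made up for illustration.

```python
CABINETS = 120
SERVERS_PER_CABINET = 12      # 4U servers stacked 12 high
SERVERS_PER_VAULT = 20
DRIVES_PER_SERVER = 60
DRIVE_TB = 16
TB_PER_EB = 1_000_000         # decimal units assumed

servers = CABINETS * SERVERS_PER_CABINET
vaults = servers // SERVERS_PER_VAULT
raw_tb_per_server = DRIVES_PER_SERVER * DRIVE_TB
raw_eb = servers * raw_tb_per_server / TB_PER_EB

# 16 of every 20 shards hold data in the 16/4 scheme; formatting
# overhead is not modeled here.
after_parity_eb = raw_eb * 16 / 20

print(f"{servers} servers in {vaults} vaults")
print(f"{raw_tb_per_server}TB raw per server, {raw_eb:.2f}EB raw in total")
print(f"{after_parity_eb:.2f}EB after parity")
```

The result after parity is about 1.1EB, which leaves some room for formatting overhead while staying above the 1EB mark.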

A Note on Parity

The IAD data center uses our own open-source Reed-Solomon erasure coding in a 16/4 data/parity scheme for storing data. This is our new normal when using 16TB drives and above, versus the 17/3 scheme used with smaller drives. This helps lessen the time it takes to recover from a failed drive in our farm.
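To compare the two schemes side by side, the Python sketch below computes what each one costs in space and how many lost shards it tolerates. It relies on a general property of Reed-Solomon coding: any set of shards equal in number to the data shards is enough to rebuild a file. The function name is illustrative, and the sketch does not model recovery time.

```python
def scheme_stats(data_shards, parity_shards):
    """Summarize a data/parity erasure coding scheme."""
    total = data_shards + parity_shards
    return {
        "total_shards": total,
        # Any `data_shards` of the `total` shards can rebuild the file.
        "tolerated_losses": parity_shards,
        "storage_efficiency": data_shards / total,
        "raw_bytes_per_stored_byte": total / data_shards,
    }


for data, parity in [(16, 4), (17, 3)]:
    stats = scheme_stats(data, parity)
    print(
        f"{data}/{parity}: survives {stats['tolerated_losses']} lost shards, "
        f"{stats['storage_efficiency']:.0%} efficient, "
        f"{stats['raw_bytes_per_stored_byte']:.3f}x raw storage"
    )
```

Both schemes spread a file across 20 shards, matching the 20 servers in a vault. The 16/4 scheme trades a little usable capacity for one more shard of protection.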

My Kingdom for a Storage Pod

Not to be a villain here, but there are no Backblaze Storage Pods in the IAD 1 data center. All 100 of the storage servers used for the initial build out of the IAD data center are the Supermicro models we detailed in the recent Storage Pod Story blog post. You can see from the photo below that each of the five vaults is racked and waiting for its hard drives to be installed. Maybe one day Supermicro will make us some pretty red bezels.

An Old Friend to the Rescue

The 52U Enconnex cabinets are 94.49 inches (2400 mm) tall. The 4U Supermicro storage servers will eventually be stacked 12 high in the cabinet, leaving 4U at the top of each for 1U core servers and IPMI (Intelligent Platform Management Interface) switches. Lifting a 4U 150lb (68kg) storage server is difficult, but so is lifting a 1U core server to nearly eight feet high. We needed some muscle, and there’s no one better than Guido, our first and most experienced server lift. He was flown in from the Phoenix data center to help the IAD staff get set up, and if he likes the gig he can stay. After all, he earned the right to a choice after nearly 12 years of heavy lifting for Backblaze.


The data center provides fully redundant power to each cabinet. Each separate power source connects to a PDU in each cabinet. In each cabinet, there is a red PDU and a blue PDU with each color representing a power source. Since most of the servers we use in the data center support redundant power, a given server connects to each PDU (red and blue) in their cabinet as shown below.

The PDUs that we used were recommended by the data center and not a brand we had used before. The PDU manufacturer does not make the power cables, but they do recommend a couple of brands. We happened to like our red and blue cables and used them instead. We were surprised to discover they were a bad fit and kept falling out of the PDUs—so much for standards. Amazingly, it just so happens that a company makes PDU plug locks to keep the plugs from falling out. The plug locks also help when someone accidentally bumps into a power plug connected to the PDU while working on some equipment, so there’s that.


As with all data center facilities, security is a prime concern. At the CoreSite facility, Backblaze personnel must pass through a minimum of four checkpoints to get from the parking lot to the Backblaze data center facility or the Backblaze office. Along the way, both badge access and biometric scans are employed, sometimes separately and sometimes together. In addition, Backblaze personnel are limited in where they can go. For example, they are not allowed on the second and third floors of the data center building, only the fourth floor, and then they can only enter our facility. Getting lost while going from the office to the data center floor should make a little more sense now.

Within the Backblaze facility there are cameras that monitor everything inside. There are also cameras used by the CoreSite staff to monitor the common areas such as hallways, the loading dock, and the parking lot. Before you can enter or leave the parking lot, the CoreSite staff require an access badge and visual confirmation. This led to a very interesting dinner one evening for Backblaze and CoreSite personnel…

Huevos Rancheros

Several of the Backblaze staff were temporarily deployed to Reston to set up the IAD data center. One of their favorite places to eat was Ted’s Bulletin, located in Reston near the data center. They serve breakfast all day until they close at 10 p.m. or so. Working into the evening is typical for data center setups, and the gang decided to order from Ted’s and have it delivered via DoorDash so they could keep working.

The Dasher arrived with their order at the back gate of the compound. That’s not a public entrance and the Dasher was told to go around to the public gate. “This is where it says to go,” said the Dasher. He wasn’t even sure where he was; he just followed the GPS. Jack, who placed the DoorDash order, got a call from the Dasher. He was going to leave if someone didn’t meet him at the gate. Not wanting to see his huevos rancheros go to waste, Jack found his way to the back gate, talking to the Dasher all the way so he wouldn’t leave. Jack showed his credentials to the security camera, but they would not open the gate. Why? The Dasher was a visitor at a non-visitor gate, and Jack was not a vehicle that needed to exit. The compromise: the Dasher was allowed to hand Jack the containers of food through a narrow opening in the gate. Jack showed his huevos rancheros and the other delights to security as he passed through the various checkpoints to get to a Backblaze office, and breakfast for dinner was had by all.

Three days later they wanted Ted’s again. They drove.


The Backblaze U.S. East data region is ready to go with five Backblaze Vaults online and accepting data. That’s 100 servers and thousands of connections open for business on day one with more vaults waiting in the wings to be deployed by the end of the year. Many thanks to Jack, Jessie, Zachary, Brent, Rich, Mark, and the supporting cast back at Backblaze HQ in San Mateo for getting IAD 1 up and running. Jessie and Zachary are part of the permanent crew at IAD 1 with more folks joining them over the coming months.

One last shoutout to the IAD crew for having the courage and sense of humor to share their stories with me. Having a Dasher squeezing your huevos through a gate in the dark while security folks watch on a live feed is not something I could ever make up. Thanks again.

Putting the New Region to Work for Your Business

With the addition of the new region, customers have more options for storing data and replicating datasets to separate cloud locations. Even better: Egress is free for Cloud Replication across the Backblaze platform. Go to our website for more information, check out our FAQ, and feel free to contact our Support Team if you have any questions.

The post A Behind the Scenes Look at Our US East Data Center appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze Adds US East Region, Expanding Location Choices and Cloud Replication Options

Post Syndicated from Tonya Comer original https://www.backblaze.com/blog/backblaze-adds-us-east-region-expanding-location-choices-and-cloud-replication-options/

Customers looking for more local availability and data resilience can get both with the opening of the U.S. East data region, now available to current and future Backblaze users. With an expanded data center footprint, customers can easily store replicated datasets to two or more cloud locations for compliance and continuity. Plus, data egress for Cloud Replication is free, so you can copy data at no expense across the Backblaze platform.

Data Regions Deliver Speed, Security, and Scalability

You can now select the U.S. East data region when you’re storing with Backblaze B2 Cloud Storage to:

  • Achieve redundancy in the cloud. Automatically replicate datasets across North America, whether it’s for compliance, protection from cyberattacks, continuity needs, or to keep data closer to users or customers. (We love a redundant backup plan.)
  • Deliver your data faster. Store data closer to end users to improve latency for primary data sets—especially important if you’re an East Coast-based company.
  • Scale sustainably. Increase or decrease your storage requirements as your business expands—no need to invest in additional hardware. And minimize costs associated with managing a data center, including hardware, software, support, and other costs.

To start storing data in U.S. East today, you can choose “Region: US East” when you create a Backblaze account.

Astonishingly Easy Cloud Replication

Backblaze’s multi-region cloud infrastructure allows you to further take advantage of Cloud Replication to improve reliability, accessibility, and overall fault tolerance. Even better: While other cloud providers charge you to replicate your data, there are no egress fees across the Backblaze platform for Cloud Replication.

It’s easy to get started. If you’re an existing customer, all you have to do to implement Cloud Replication is log in to your B2 Cloud Storage account and click on Cloud Replication in the right-hand column. Go to our website for more information, check out our FAQ, and feel free to contact our Support Team if you have any questions.

New Data Region; Same Data Center Standards

Data stored in U.S. East will reside in Backblaze’s newest data center, IAD 1, located in Reston, Virginia. Backblaze has a high standard for our data centers, and this new facility is best-in-class. All Backblaze data centers are SSAE-18/SOC-2 compliant, use biometric security, and have ID checks and area locks that require badge-level access to keep your data safe. In addition to SOC 2 Type 2, this latest data center is ISO 27001, NIST 800-53, and HIPAA compliant.

Cloud Storage That Meets Evolving Needs

The way businesses use and access cloud storage is changing. Rather than relying on local storage, companies are increasingly turning to the cloud to meet their data storage needs, including data protection and redundancy. Opening our U.S. East data region is the next logical step to better serve our customers, now and in the future, as they increasingly adopt cloud-only infrastructures. And for the many customers who continue to store data on-premises, the new region gives them more choices for their backup needs as well.

Look out for Backblaze Evangelist, Andy Klein, to fill you in on all the details of our newest data center in an upcoming blog post, and feel free to comment below if you want to know more.

The post Backblaze Adds US East Region, Expanding Location Choices and Cloud Replication Options appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Widening the Channel: Exertis Broadcast Adds Backblaze B2 Reserve

Post Syndicated from Elton Carneiro original https://www.backblaze.com/blog/widening-the-channel-exertis-broadcast-adds-backblaze-b2-reserve/

We launched our Channel Partner program about seven months ago. In the months since, we’ve rapidly onboarded some great strategic resellers, added new benefits, welcomed more staff to our team, and completed our initial launch of Backblaze B2 Reserve, our capacity-based cloud storage offering that includes download fees, premium support, and our Universal Data Migration service, exclusively for Backblaze resellers—but we’re still just getting started.

We’re very excited to announce another partner today.

Exertis Broadcast + Backblaze

Exertis Broadcast now offers resellers the full value and benefits of our Backblaze B2 Reserve program. This new partnership is doubly exciting to us because a number of our alliance partners, including Quantum, Studio Network Solutions (SNS), and SoDA, already work with Exertis Broadcast. That means the world-class Exertis engineers can bundle a suite of best-of-breed cloud workflow solutions into one seamless package for teams working in media and entertainment, modern data protection, and disaster recovery.

If you’re a reseller looking for a distribution partner that can help your customers with their cloud storage needs, here are a few of the benefits Exertis offers:

  • Sales and Support dedicated to customer success.
  • Engineering Team available to consult on the best products and solutions to fit any needs.
  • Tools and Resources ranging from a state-of-the-art demo center to an innovative video solution builder.
  • Video Production to create cutting-edge content.
  • Marketing Professionals to design effective marketing content to keep you abreast of industry news and events.

To get started, resellers can contact us at [email protected] today.

The Backblaze Channel Partner Program

The Channel Partner program exists to provide easy, transparent, predictable cloud storage solutions to accelerate growth for resellers through the value of our Backblaze B2 Reserve offering.

The program provides benefits ranging from deal registration to joint marketing; rewards like seller incentives and market development funds (coming soon); as well as support including a Partner Portal and sales and marketing staff assistance.

Join Us!

We can’t wait to join with our current and future Channel Partners to deliver tomorrow’s solutions to any customer who can use astonishingly easy cloud storage. (We think that’s pretty much everybody.)

If you’re a reseller, we’d love to hear from you. If you’re a customer interested in benefiting from any of the above, we’d love to connect you with the right Channel Partner team to serve your needs. Either way, the doors are open and we look forward to helping out.

The post Widening the Channel: Exertis Broadcast Adds Backblaze B2 Reserve appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

10 Holiday Security Tips for Your Business

Post Syndicated from Stephanie Doyle original https://www.backblaze.com/blog/10-holiday-security-tips-for-your-business/

The very last thing you want to be dealing with as a business owner over the holiday season is a security breach. Unfortunately, cybercriminals take advantage of the fact that businesses are closed or short staffed. Workers are inundated with end-of-year tasks and generally have increased responsibilities outside of work. On top of that, email traffic is up. All of the distractions make it easier for cyberattackers to exploit security vulnerabilities. The increased risk around the holidays is all the more reason companies should put a little extra effort in to ensure their data is safe.

There are plenty of simple strategies that business owners and managers can enforce to improve business security during the holidays. Cybersecurity threats can occur at any time, so it’s crucial that your business is prepared during busier times of year. Small businesses are no exception. By planning ahead and putting security systems in place, you can protect your organization and enjoy the holiday season instead of scrambling to respond to a security incident. Today, we’ll cover 10 tips to help you strengthen your security stance during the holidays this year.

Disaster Recovery: What You Need to Know About Server Backup

Whether you’re concerned about the rise in ransomware attacks or creating a disaster-proof recovery plan, Backblaze’s new Server Backup Ebook can walk you through best practices to back up your servers. Protect your business critical data with backup strategies for on-premises, cloud-only, and hybrid backups.

➔ Learn How to Protect the Data on Your Servers

Holiday Cybercrime

The main cause of vulnerabilities at this time of year is human error, meaning that many holiday data breaches are completely preventable. A shortage of workers, increased distractions, and reduced focus on security is the perfect opportunity for cybercriminals to strike.

Last year, the FBI urged organizations to be prepared for attacks executed on weekends and holidays when there is less staff available to respond to vulnerabilities and threats. When workers are out of the office, their duties are sometimes given to other employees who may not have the experience necessary to do the job and keep cybersecurity top of mind. And many workers take time off to celebrate with friends, family, and loved ones around the holidays.

Workers are also likely to feel a bit of end-of-year fatigue. Employees are working diligently to tie up loose ends and ensure that everything is in order as the fiscal year ends. However, that means their attention won’t be as focused on cybersecurity, and small details might fall through the cracks.

Because of the festivities, many organizations choose to shut their doors for an extended period of time. But this leaves organizations even more vulnerable to a cyberattack since there is no one online to monitor and detect anomalous activity on the network.

All of these factors converge to create the perfect opportunity for cybercriminals to launch an attack.

10 Security Tips for Your Business This Holiday Season

Keeping your business safe from damaging cyberattacks and organized crime during the holiday season should be your top priority. Here are 10 security tips that every business can follow to help protect their business during the holidays.

1. Update Security Patches

Teams should ensure that their systems are up to date and that any new patches are tested and applied as soon as they are released, no matter how busy the company is at this time. Additionally, personnel should be assigned to monitor alerts remotely when the business is closed or workers are out of the office so that critical patches aren’t delayed.

2. Require Workers to Set Up Their Multifactor Authentication Credentials

Multifactor Authentication (MFA) fatigue happens when workers get tired of logging in and out with an authenticator app, push notification, or text message. During the holidays, workers might be busier than usual, and therefore, more frustrated by MFA requirements. But MFA is crucial for keeping your business safe from ransomware and DDoS attacks. Now is the perfect time for workers to set up their MFA credentials so that systems are extra secure going into the holidays.

3. Conduct Phishing Simulation Training

Another important step that organizations can take to ensure security over the holidays is to conduct phishing simulation training at the beginning of the season, and ideally on a monthly basis. This kind of training gives employees a chance to practice their ability to identify malicious links and attachments without a real threat looming. It’s a good opportunity to teach workers not to share login information with anyone over email and the importance of verifying emails.

4. Review Your Company Security Policy With All of Your Employees

All businesses should review company security policies as the holiday season approaches. Ensure that all employees understand the importance of keeping access credentials private, know how to spot cybercrime, and know what to do if a crime happens. Whether your staff is in-office or remote, all employees should be up to date on security policies and special holiday circumstances.

5. Unplug All Unnecessary Devices

Businesses require many tools and technologies to run effectively, so it’s easy to leave everything running 24/7. However, leaving devices plugged in and powered on makes them a target of opportunity for hackers. This is especially important if a location will be closed for an extended period. When it’s time to shut down for the holidays, unplug anything that isn’t required to keep your business up and running to reduce your overall risk.

6. Adjust Property Access Privileges

You might be surprised to know that physical security is a cybercrime prevention tool as well. Doors and devices should be the most highly protected areas of your space. Before the holidays, be sure to do a thorough review of your business’ access privileges so that no one has more access than is necessary to perform their duties. And before shutting down for a much-needed break, check all exterior doors, windows, and other entry points to ensure they are fully secured. Don’t forget to update any automated systems to keep everything locked down until you return to work.

7. Don’t Advertise That You Will Be Closed

It’s common practice to alert customers when your business will be closed so that you can avoid any inconvenience. However, this practice could put your business at risk during times of the year when the crime rate is elevated, including the holiday season. Instead of posting signage or social media updates declaring that no one will be in the building for a certain period, it’s better to use an automated voice or email response to alert customers of your closing. This way, crime opportunists will be less tempted.

8. Invest In Employee Safety Training

In addition to reviewing company security policies, businesses should also invest in employee safety training. This is also one of the most important things you can do when first hiring an employee. If an intruder breaches the network, your credentials are compromised, or someone accidentally clicks on a malicious link, workers need to know who to notify and what steps to take to mitigate risks.

9. Make Sure You Have a Solid Backup Strategy

The industry standard is the 3-2-1 backup strategy. A 3-2-1 strategy means having at least three total copies of your data, two of which are local but on different media, and at least one off-site copy (in the cloud). You should also have basic cybersecurity protocols, like multi-factor authentication, firewalls, and email encryption, in place.
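To make the rule concrete, here is a minimal Python sketch that checks an inventory of backup copies against the 3-2-1 definition above. The `Copy` class, the `meets_3_2_1` function, and the location and media labels are hypothetical names chosen for illustration; they are not part of any Backblaze tool.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    """One copy of the data (hypothetical inventory record)."""
    location: str  # "local" or "offsite" (illustrative labels)
    media: str     # e.g. "internal-disk", "nas", "cloud"


def meets_3_2_1(copies: list[Copy]) -> bool:
    """Return True if the inventory satisfies the 3-2-1 rule as described above."""
    # Distinct media types among the local copies.
    local_media = {c.media for c in copies if c.location == "local"}
    # At least one copy must live off-site.
    has_offsite = any(c.location == "offsite" for c in copies)
    return len(copies) >= 3 and len(local_media) >= 2 and has_offsite


inventory = [
    Copy("local", "internal-disk"),
    Copy("local", "nas"),
    Copy("offsite", "cloud"),
]
print(meets_3_2_1(inventory))
```

Running a check like this against a real inventory before the holidays is one way to spot a missing off-site copy while there is still time to fix it.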

10. Test Your Disaster Recovery Strategy

If you don’t have a disaster recovery strategy, this is the time to create one. If you do have one, this is also a great time to put it to the test. You should know going into the holidays that you can respond quickly and effectively should your company suffer a security breach.

Protecting Business Data During the Holidays

While it may be impossible to prevent all instances of data theft and cybercrime from happening, there are steps that companies can take to protect themselves. These 10 business security tips will help you discover areas of your business that could use more protection, give you new ideas about preventing threats, and help empower your employees to have a safe and happy holiday season.

The post 10 Holiday Security Tips for Your Business appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Object Lock 101: Protecting Data From Ransomware

Post Syndicated from Molly Clancy original https://www.backblaze.com/blog/object-lock-101-protecting-data-from-ransomware/

Cybercriminals are good at what they do. It’s an unfortunate reality, but one that you should be prepared for if you are in charge of keeping data safe. A study of penetration testing projects from Positive Technologies found that, “In 93% of cases, an external attacker can breach an organization’s network perimeter and gain access to local network resources.”

With this knowledge, smart companies prepare in advance rather than hoping to avoid being attacked. Recovering from a ransomware attack is much easier when you maintain safe, reliable backups—especially if you implement a 3-2-1 backup strategy. But even with a strong backup strategy in place, you’re not fully protected. Anything that’s connected to a compromised network is vulnerable, including backups. Cybercriminals are savvy, and they’ve shown they can target backups to gain leverage and force companies to pay—something that’s increasingly going to put you on the wrong side of the law.

That doesn’t have to be your story. With advances in backup protection like Object Lock, you can add one more layer of defense between cybercriminals and your valuable, irreplaceable data.

In this post, we’ll explain:

  • What Object Lock is.
  • What Object Lock does.
  • Why you should use it.
  • When you should use it.

More On Protecting Your Business From Ransomware Attacks

This post is a part of our ongoing series on ransomware. Take a look at our other posts for more information on how businesses can defend themselves against a ransomware attack, the latest patterns in ransomware attacks, and more.

➔ Download The Complete Guide to Ransomware

What Is Object Lock?

Object Lock is a powerful backup protection tool that prevents a file from being altered or deleted until a given date. When you set the lock, you can specify the length of time an object should be locked. Any attempts to manipulate, encrypt, change, or delete the file will fail during that time. (NOTE: At Backblaze, the Object Lock feature was previously referred to as “File Lock,” and you may see the term from time to time in documentation. They are one and the same.)

Reminder: What Is an Object?

An object is a unit of data that contains all of the bytes that constitute what you would typically think of as a file. That file could be an image, video, document, audio recording, etc. An object also includes metadata so that it can be easily analyzed.

What Does Object Lock Do?

Object Lock allows you to store objects using a Write Once, Read Many (WORM) model, meaning after it’s written, data cannot be modified or deleted for a defined period of time. The files may be accessed, but no one can change them, including the file owner or whoever set the Object Lock.

What Is Object Lock Legal Hold?

Object Lock Legal Hold also prevents a file from being changed or deleted, but the lock does not have a defined retention period—a file is immutable until Object Lock Legal Hold is removed.
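The difference between a retention period and a Legal Hold can be illustrated with a toy model. The Python sketch below is an in-memory simulation only, not how Backblaze implements Object Lock; the `ToyObject` class, its attributes, and `LockedObjectError` are hypothetical names used for illustration.

```python
from datetime import datetime, timedelta, timezone


class LockedObjectError(Exception):
    """Raised when a change or delete hits a locked object."""


class ToyObject:
    """Hypothetical in-memory stand-in for one stored object."""

    def __init__(self, data: bytes):
        self.data = data
        self.deleted = False
        self.retain_until = None  # end of the retention period, if one is set
        self.legal_hold = False   # a legal hold has no end date

    def is_locked(self, now: datetime) -> bool:
        under_retention = self.retain_until is not None and now < self.retain_until
        return under_retention or self.legal_hold

    def read(self) -> bytes:
        # Reads are always allowed: write once, read many.
        return self.data

    def overwrite(self, data: bytes, now: datetime) -> None:
        if self.is_locked(now):
            raise LockedObjectError("object is locked")
        self.data = data

    def delete(self, now: datetime) -> None:
        if self.is_locked(now):
            raise LockedObjectError("object is locked")
        self.deleted = True


now = datetime(2023, 1, 1, tzinfo=timezone.utc)
backup = ToyObject(b"nightly backup")
backup.retain_until = now + timedelta(days=30)  # 30 days is an arbitrary example

try:
    backup.delete(now)
except LockedObjectError as err:
    print("delete refused:", err)
```

In this model a retention lock expires on its own once `retain_until` passes, while `legal_hold` keeps the object locked until someone explicitly clears it, which mirrors the distinction drawn above.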

What Is an Air Gap, and How Does Object Lock Provide One?

Object Lock creates a virtual air gap for your data. The term comes from the world of LTO tape. When backups are written to tape, the tapes are then physically removed from the network, creating a gap of air between backups and production systems. In the event of a ransomware attack, you can just pull the tapes from the previous day to restore systems.

Object Lock does the same thing, but it all happens in the cloud. Instead of physically isolating data, Object Lock virtually isolates the data.

What Is Immutable Data? Is It the Same as Object Lock?

In object storage, immutability is a characteristic of an object that cannot be modified or changed. It is different from Object Lock in that Object Lock is a function offered by object storage providers that allows you to create immutable or unchangeable objects. Immutability is the characteristic you want to achieve, and Object Lock is the way you achieve it.

How Does Object Lock Work With Veeam Ransomware Protection?

Veeam, a backup software provider, offers immutability as a feature to protect your data. The immutability feature in Veeam works hand-in-hand with the Object Lock functionality offered by cloud providers like Backblaze. If you’re using a cloud storage provider to store backups and they support Object Lock (which we think all should, not that we’re biased), you can configure your backup software to save your immutable backups to a storage bucket with Object Lock enabled. As a certified Veeam Ready-Object and Veeam Ready-Object with Immutability partner, utilizing this feature with Backblaze is as simple as checking a box in your settings.

For a step-by-step video on how to back up Veeam to Backblaze B2 Cloud Storage with Object Lock functionality, check out the video below.

Does Object Lock Work With Other Integrations?

Object Lock works with many Backblaze B2 integrations in addition to Veeam, including MSP360, Commvault, Rubrik, and more. You can also enable Object Lock using the Backblaze S3 Compatible API, the B2 Native API, the Backblaze B2 SDKs, and the CLI.
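As a rough sketch of what setting a lock through an S3-style API can look like, the Python below builds the arguments for an upload with a retention date. `ObjectLockMode` and `ObjectLockRetainUntilDate` are parameter names from the S3 PutObject API; the helper name, bucket name, key, and 30-day period are hypothetical, and the choice of compliance mode is an assumption. Check the Backblaze documentation for the modes and settings your bucket supports.

```python
from datetime import datetime, timedelta, timezone


def locked_put_params(bucket: str, key: str, body: bytes,
                      retain_days: int, now: datetime) -> dict:
    """Build keyword arguments for an S3-style PutObject call with a retention date.

    Hypothetical helper: it assumes the target bucket already has Object Lock enabled.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ObjectLockMode": "COMPLIANCE",  # assumption: compliance-style retention
        "ObjectLockRetainUntilDate": now + timedelta(days=retain_days),
    }


params = locked_put_params(
    bucket="example-backups",      # hypothetical bucket name
    key="2023-01-01/backup.tar",   # hypothetical object key
    body=b"backup bytes",
    retain_days=30,
    now=datetime(2023, 1, 1, tzinfo=timezone.utc),
)
print(params["ObjectLockRetainUntilDate"].isoformat())
```

With an S3 client library such as boto3, a dictionary like this could be passed along as `client.put_object(**params)`, with the client pointed at your provider's S3-compatible endpoint.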

Why Should You Use Object Lock?

Using Object Lock to protect your data means no one—not cybercriminals, not ransomware viruses, not even you—can edit or delete your files. If your systems are compromised by ransomware, you can trust that your backup data stored with Object Lock hasn’t been deleted or altered. There’s no added cost to use Object Lock with Backblaze B2 beyond what you would pay to store the data anyway (but other cloud providers charge for Object Lock, so you should be sure to check fees when comparing cloud storage providers).

Finally, data security experts strongly recommend using Object Lock to protect your critical backups. Not only is it recommended, but in some industries Object Lock is necessary to maintain data protection standards required by compliance agencies. One other thing to consider: Many companies are adopting cyber insurance, and often those companies require immutable backups for you to be fully covered.

The question really isn’t, “Why should you use Object Lock?” but rather “Why aren’t you?”

When Should You Use Object Lock?

The immutability achieved by Object Lock is useful for protecting against ransomware, but there are some additional use cases that make it valuable to businesses as well.

What Are the Different Use Cases for Object Lock?

Object Lock comes in handy in a few different use cases:

  1. To replace an LTO tape system: Most folks looking to migrate from tape are concerned about maintaining the security of the air gap that tape provides. With Object Lock you can create a backup that’s just as secure as air-gapped tape without the need for expensive physical infrastructure.
  2. To protect and retain sensitive data: If you work in an industry subject to HIPAA regulations or if you need to retain and protect data for legal reasons, Object Lock allows you to easily set appropriate retention periods for regulatory compliance.
  3. As part of a disaster recovery and business continuity plan: The last thing you want to worry about in the event you are attacked by ransomware is whether your backups are safe. Being able to restore systems from backups stored with Object Lock can help you minimize downtime and interruptions, comply with cybersecurity insurance requirements, and achieve recovery time objectives more easily.

Protecting Your Data With Object Lock

To summarize, here are a few key points to remember about Object Lock:

  • Object Lock creates a virtual air gap using a WORM model.
  • Data that is protected using Object Lock is immutable, meaning it’s unchangeable.
  • With Object Lock enabled, no one can encrypt, tamper with, or delete your locked data.
  • Object Lock can be used to replace tapes, protect sensitive data, and defend against ransomware.

Ransomware attacks can be disruptive, but your story doesn’t have to end with you feeling forced to pay against your better judgment or facing extended downtime. As cybercriminals become bolder and more advanced, creating immutable, air-gapped backups using Object Lock functionality puts a manageable recovery in closer reach.

Have questions about Object Lock functionality and ransomware? Let us know in the comments.

The post Object Lock 101: Protecting Data From Ransomware appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Why Cyberattacks Surge During the Holiday Season

Post Syndicated from Molly Clancy original https://www.backblaze.com/blog/why-cyberattacks-surge-during-the-holiday-season/

The holiday season should be all about spending some much-needed time off with friends and family, not dealing with cyberattacks at work. But the holiday season is the most wonderful time of year for cybercriminals, too. Cyberattacks surge between Thanksgiving and New Year’s. Many businesses and workers may be too busy or distracted to check every security alert or look over every email for suspicious content.

All businesses should be aware of cybersecurity risks during the holiday season, but small and medium sized businesses face different challenges when it comes to cyberattacks compared with large enterprises. Small businesses (with fewer than 500 employees) comprise 99.9% of all businesses in the United States. And microbusinesses, or businesses with four or fewer employees, comprise 91%. Due to their staffing and budget constraints, it is likely they are more vulnerable to cyberattacks than larger organizations.

Let’s take a closer look at why the holidays are so dangerous when it comes to digital security, and how you can prepare your business for a holiday cyberattack and retain your holiday cheer.

Download our Ransomware Guide

There’s never been a better time to strengthen your ransomware defenses. Get our comprehensive guide to defending your business against ransomware this holiday season.

➔ Download The Complete Guide to Ransomware

The Most Vulnerable Time of the Year

So, why do cybercriminals choose the holiday season to perform their most damaging attacks? Here are a few reasons:

1. Companies Are Short-Staffed

Many companies find themselves short-staffed during the peak of the holiday season. Between holiday travel, events, and obligations, it’s easier for things to fall through the cracks. No matter how much you plan to have a full staff, there will always be times when you wish you had more personnel. End-of-year planning, increased order volumes, more time spent performing customer service duties, and technology hiccups keep staff more than busy at this time of year. Not to mention that there’s an added burden on IT professionals during the holidays, who are busy trying to keep office networks and remote access safe and secure, responding to help tickets, and keeping an eye on increased anomalous activity.

2. Workers Are Distracted

When employees are spread thin and juggling numerous duties and holiday obligations, office duties often take a back seat. Employees are looking forward to the holidays just as much as you are, so you can imagine that they might be more inattentive than at less festive times of the year. Workers that are distracted from their normal cybersecurity awareness might miss a clue that an email is coming from an illegitimate source.

Cybersecurity activities include scanning for vulnerabilities, mitigating risks, and looking for bad actors moving through systems. Among the hustle and bustle of the holidays, it might seem like there is no time for cybersecurity, or that it can wait till next year. That’s exactly why cybercriminals will be waiting to launch their attack when you least expect it.

Just a little office gift wrapping.

3. Email Activity Increases

With so many “happy holidays” emails from vendors, internal employees, and even outside addresses, there are plenty of opportunities for a fraudster to plant a malicious link that goes unnoticed. If a worker falls for a scam on a company device, the entire company could be at risk for a malware attack.

Cybersecurity Risks During the Holiday Season

Ransomware is one of the most damaging threats to businesses of all kinds. Last year there was a 30% increase in ransomware attacks targeting companies during the holiday season. When a worker unknowingly clicks on a malicious link or accesses a hijacked website on a company device, the business may become infected with ransomware. Attackers can then hold the organization for ransom by threatening to leak information. The advice is generally to refuse to pay.

Whether your company is in finance, retail, logistics, or any other industry, the first step to getting prepared for the holiday season is to reevaluate your cybersecurity. Ensure that you are ready in case one of these cybersecurity risks hits you this year.

Phishing

Phishing is a popular attack vector that cybercriminals use to gain access to a company’s system. Phishing emails can be very convincing when they impersonate another organization or legitimate person to trick the receiver into divulging crucial login information.

While many people think they would be able to recognize a phishing email, phishing emails are the entry point for 90% of data breaches. Plus, busy workers may not have the time to focus on the minute details of every message they receive this holiday season. Attackers will use that to their advantage.

A phishing email recently received by the author that came from a false sender address.

Distributed Denial of Service (DDoS) Attacks

Another serious threat to business during the holidays is a DDoS attack. This is an especially popular route for cyberattacks at this time of year. Why? Simply put: Because businesses are busy, and attackers are keen to take advantage of that distraction to launch an attack. Cybercriminals use DDoS attacks to overload business systems with so much traffic that none of your applications can function.

Compromised Passwords

The best way for a cybercriminal to gain access to your business websites, accounts, and other mission-critical apps is to obtain compromised credentials. There are many ways that fraudsters can attempt to steal company login credentials with minimal effort. In fact, there have been several well-publicized password-related breaches that made passwords available to anyone who cares to search for that information—people have even created APIs so that you can easily see if you’re affected by those breaches. We humans are also prone to reusing passwords. According to a 2022 report, employees admitted to reusing passwords across an average of 16 different workplace accounts.
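The post does not name a specific breach-lookup API. As one illustration, the Python sketch below assumes a range-style lookup like the publicly documented Pwned Passwords endpoint (`https://api.pwnedpasswords.com/range/<first five hash characters>`), where only the first five characters of the password's SHA-1 hash are sent and the response lists matching hash suffixes with breach counts. The function names are hypothetical, and the network request itself is left out so the example stays self-contained.

```python
import hashlib


def split_sha1(password: str) -> tuple[str, str]:
    """Hash the password and split it into the 5-character prefix and the rest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(range_body: str, suffix: str) -> int:
    """Look for our hash suffix in a response made of 'SUFFIX:COUNT' lines."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0  # suffix not listed: no known breach under this prefix


prefix, suffix = split_sha1("password")
# Only `prefix` would be sent to the lookup service; the full hash never leaves
# your machine. A hand-made response body stands in for the real one here.
sample_body = suffix + ":3\r\n" + "0" * 35 + ":1\r\n"
print(prefix, breach_count(sample_body, suffix))
```

A nonzero count means the password has appeared in a known breach and should be changed everywhere it was reused.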

Protect Your Business This Holiday Season

So, what can you do to minimize your risks as cybercriminals ramp up their attacks? Here are some tips to help protect your business this holiday season:

  • Ensure your anti-virus and/or anti-phishing software scans for vulnerabilities regularly.
  • Discuss phishing email best practices with your staff year-round, but especially during the holiday season.
  • Never click on suspicious links or download email attachments from unknown senders.
  • Turn on safe browsing capabilities in your browser.
  • Back up business data locally and to the cloud.
  • Update your software and apply patches when they are released.
  • Use strong passwords, multi-factor authentication, and a secure password manager to generate and store secure passwords.

Even if you’ve done everything right, there is still a chance that you could be outsmarted by a cybercriminal this holiday season. Every business, no matter how big or small, needs to have an incident response plan in place to help staff identify the breach before it’s too late.

Don’t forget to include thorough training on the specific security protocols that workers need to follow in the event that a cyberattack does occur. If your business becomes the victim of a cyberattack, the sooner you can identify the breach, the better.

And just in case the worst happens, it’s smart to invest in a reliable backup solution. A decentralized approach to data security can help protect your business and safeguard your private information from anyone who wants to take advantage of your company. If your systems do go down and a cybercriminal locks you out of your business applications, you will still have your backup data, which means that you can restore your business data and resume business as usual with as little disruption as possible.

The holiday season is a money-maker for businesses and cybercriminals alike. Make sure that your company is protected so you can focus on the joy of the season instead of giving cybercriminals an easy payday.

The post Why Cyberattacks Surge During the Holiday Season appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Hard Drive Cost Per Gigabyte

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/hard-drive-cost-per-gigabyte/

“A penny for your thoughts” is a common boomer expression (so says my 23 year old). Instead, how about “a penny for a gigabyte of hard disk”? That’s new, tech-y, and, well, it’s almost true.

When Backblaze purchased hard drives back in 2009, we paid over $0.11 per gigabyte. In 2017, when we did a review of the cost of hard drives, the cost per gigabyte had fallen to just below $0.03 per gigabyte. Today, we can get 16TB hard drives for about $0.014 per gigabyte on average. That’s not quite a penny, but we think we’ll get there soon enough. In the meantime, let’s look at our hard drive purchases over the years and see what we can learn about the cost per gigabyte of hard drive storage.

How Many Drives?

Backblaze has purchase records going back to 2009. In that time, we’ve purchased 265,332 hard drives. We even recorded each of the hundreds of hard drives purchased during our drive farming days in 2011 and our crowdsourcing days in 2012. Thank you, Cecilia! The 265,332 number is not precise: some drives were purchased before 2009, some drives were purchased and never put into service, and, occasionally, we received a small number of test drives from manufacturers. Still, the 265,332 number is close. The breakdown in drives purchased by drive size is shown below.

The 16TB drives are the only drive size we are currently purchasing. It is possible we could purchase a small group of spares for one of the other drive sizes. Unlikely at this point, but possible. In addition, the 16TB count does not include 12,000 drives purchased and scheduled for delivery over the next few months.

In 2023 we expect to qualify 18TB, 20TB, and potentially 22TB drives. Of course, we will wait a bit before purchasing in bulk to ensure the qualified drive models are stable over time and the price per gigabyte meets our expectations. Speaking of expectations, look for those drives to show up in the quarterly Drive Stats reports starting about mid-2023.

Drive Type

All of the drives we purchase use Perpendicular Magnetic Recording (PMR), also known as Conventional Magnetic Recording (CMR). We do not use Shingled Magnetic Recording (SMR) drives. SMR drives are sometimes less expensive but are demonstrably slower with random writes and when they are reusing space made available from previous file deletes. In most data backup use cases, where variable length writes and file deletions are the norm, the minimal cost savings of SMR drives is negated by the storage inefficiencies introduced by having to perform multiple writes to store data in tracks where data had been previously deleted.

The Cost Per Gigabyte

As noted earlier, our hard drive cost per gigabyte had fallen to a little over $0.03 by the end of 2017. At that point 8TB drives were our primary drives. Over the next few years we added 12TB, 14TB, and 16TB drives to the mix and the average cost per gigabyte continued to decrease as shown in the chart below.

From 2017 to November 2022, the average cost per gigabyte decreased by 56.36% for all of the drives ($0.033 down to $0.0144). That’s over 9% per year on average across the four drive sizes. To put that data in context, below is the complete chart from 2009 through November 2022 for all drives sizes we have used as data drives during that period.

You can observe the overall down and to the right trend over the period, although the 3TB and 4TB drives make that drop messy. This was due primarily to the Thailand drive crisis which began in the second half of 2011 and continued to affect the market into 2013 before things got back to normal.

Overall, the drop in the average price per gigabyte was from $0.114 in 2009 to just $0.014 as of November 2022. That’s a difference of $0.100 (one thin dime) over the period. That equates to an 87.4% decrease in the average cost per gigabyte since 2009. If we calculate the average decrease per month over that period, we get the cost per gigabyte of the hard drives we use decreasing 0.52% per month since January 2009.
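The percentages in this section can be re-derived from the rounded prices quoted in the text. The short Python check below is a sketch under stated assumptions: the helper name is hypothetical, and the six-year span for the 2017 to 2022 average and the inclusive 167-month count (January 2009 through November 2022) are guesses at how the averages were taken.

```python
def percent_decrease(start: float, end: float) -> float:
    """Percentage drop from start to end, relative to start."""
    return 100.0 * (start - end) / start


# Prices per gigabyte as quoted in the text (rounded).
drop_2017_2022 = percent_decrease(0.033, 0.0144)
drop_2009_2022 = percent_decrease(0.114, 0.0144)

# Assumption: six calendar years (2017 through 2022) for the yearly average.
per_year = drop_2017_2022 / 6

# Assumption: January 2009 through November 2022, counted inclusively.
months = (2022 - 2009) * 12 + 11
per_month = drop_2009_2022 / months

print(f"2017-2022: {drop_2017_2022:.2f}% total, {per_year:.2f}% per year")
print(f"2009-2022: {drop_2009_2022:.1f}% total, {per_month:.2f}% per month")
```

Note that the 87.4% figure only falls out when the unrounded $0.0144 end price is used; with the rounded $0.014 the result is slightly higher.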

During that time drive technology hardly stood still as drive manufacturers crammed more platters in the same basic 3 ½ inch chassis, dramatically increased areal media density, figured out how to use helium inside the drives, started using glass substrates for their platters, and other improvements and innovations. Regardless of what you may think of a given drive manufacturer, that’s pretty awesome for the industry as a whole.

At this point, a fair question would be why the cost we charge for storage hasn’t decreased 87% since 2009? Our friends at IDC (source: Figure 10; IDC Thought Leadership Practice Case Study) have calculated that in 2009, there was about 0.3 zettabytes of data stored on hard drives worldwide, and they estimated that by the end of 2022 there would be 1.8 zettabytes. That’s an increase of 500% for the amount of data stored on hard drives over the period. Let’s just say the global population is storing a lot more data and leave it at that.

Dollars and Sense

Of course, you don’t buy hard drives using percentages, you use dollars (or pesos, or pounds, or euros, and so on). Let’s take a minute to see what price you would have to pay versus what we have paid. We’ve listed the best street price we could find for some of the 12TB, 14TB, and 16TB drives we use.


  1. As Western Digital continues to assimilate the HGST drive business, the model numbers of these drives are changing to WDC standards.
  2. This model is sold as a server-based drive; similar models such as the MG07ACA14TE are less expensive.

Remember, we buy drives in bulk quantity and on contract with guaranteed pricing, delivery dates, and such. It’s not quite the same thing as buying a drive from a shrink wrapped pallet at Costco or from a Cyber Monday deal on Amazon. You may be able to buy a couple of drives at a great price, but when you need 12,000 drives delivered to your front door on a certain date, Amazon Prime isn’t going to cut it. As a result, it may cost a dollar or two more per drive to ensure we have what we need when we need it.

Lessons Learned

  • The cost per gigabyte has continued to fall over the past 13 years we’ve been tracking our drive purchases. This was in spite of the Thailand drive crisis which started in 2011, as well as the Coronavirus and the continuing supply chain problems it caused.
  • Drive manufacturer consolidation hasn’t stopped the cost per gigabyte from decreasing from 2009 through 2022. That said, it is impossible to say what the cost per gigabyte would be without consolidation.
  • On average, the cost per gigabyte of a drive will fall about 0.5% per month over time, slowly at first, then accelerating for some period before bottoming out.
  • In nearly every case, the cost per gigabyte of each new drive size introduced will eventually fall below that of its predecessor. For example, the cost per gigabyte for a 14TB drive will be less than a 12TB, and the cost per gigabyte for the 16TB drive will be less than the 14TB.

Where Is the Bottom?

When we published our 2017 report on this topic, we proclaimed the race to the bottom was over, implying that the cost per gigabyte could not go much (if any) lower. We were wrong.

So where is the bottom? There’s an expression that goes something like, “We have done so much for so long with so little that we can now do practically anything with nothing.” There are probably folks at the drive manufacturers mumbling that expression to themselves on a daily basis as they try to cram more bits in less space on increasingly thin sheets of coated glass racing by at 5,400/7,200/15,000 revolutions per minute.

Getting back to reality, the next milestone we can see is $0.01 per gigabyte for a hard drive—that’s not a sale price, but a stable street price. Let’s go out on a limb and say that we will reach that in mid-2025 with 22TB or 24TB drives. That would mean you could buy a 22TB drive at Costco or on Amazon for about $220, or a 24TB for $240.

Is $0.01 per gigabyte the bottom? At the risk of having boomers shout “How low can you go?” and throw their backs out doing the limbo, we’ll ask: How low can the cost per gigabyte go? Tell us what you think.

The post Hard Drive Cost Per Gigabyte appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Education Unplugged: Google Ends Unlimited Storage for Schools

Post Syndicated from Barry Kaufman original https://www.backblaze.com/blog/education-unplugged-google-ends-unlimited-storage-for-schools/

For schools and universities, data storage is paramount. Staff, administrators, and educators, not to mention students, need a secure place to store files. Add to that the legacy accounts of alumni storing irreplaceable files from their education, and you have a massive need for storage.

For a long time, Google was happy to oblige. In 2006, the company launched Google Apps for Education (later G Suite for Education; now Google Workspace for Education), offering free unlimited storage for qualifying schools and districts. But when they’d reached market penetration (somewhere in the neighborhood of 83% of school districts, according to EdWeek Research Center), they ended the unlimited storage policy many schools had come to rely on.

If you already know about Google’s policy change and are looking for a solution to save your data and your budget, getting started with Backblaze B2 is easy. Otherwise, read on to learn more about the change, what it may mean for you in the long-term, and a Backblaze partnership with Carahsoft that eases purchasing through local, state, and federal buying programs.

Office Hours Are Over—Google Ends Unlimited Storage for Educational Institutions

Google’s policy change took effect in July 2022, and many schools and universities had to find alternative storage solutions or change their internal storage policies to stay within the new limits. Under the terms of the new policy, Google offers a baseline of 100TB of pooled storage shared across all users.

The policy shift was spurred, Google says, because “as we’ve grown to serve more schools and universities each year, storage consumption has also rapidly accelerated. Storage is not being consumed equitably across—nor within—institutions, and school leaders often don’t have the tools they need to manage this.”

For some school districts, colleges, and universities, this policy shift meant having to reach out to alumni with the request that they back up all their own data. It also hit some already-strapped IT budgets particularly hard. Estimates vary, but depending on the size of the school and their data needs, they could be looking at anywhere up to an extra $70,000 a year in storage costs.

That’s a non-negligible fee for a service that has become increasingly vital for schools. We’ve written about how important cloud storage is for schools, but it’s worth reiterating here.

School Is in Session

Not only will a secure cloud storage solution help protect school districts from threats of ransomware, it can also help maintain predictable operating expenses and create opportunities for collaboration through remote learning. In cases like Kansas’ Pittsburg State University, it helped keep data safe from natural disasters that abound in places like Tornado Alley. Pittsburg State implemented Backblaze B2 as their off-site backup in the event of disaster and used Object Lock functionality to safeguard data from ransomware.

Photo Credit: Pittsburg State University

The academic world is still adjusting to Google’s policy change. Stories have emerged of schools simply dropping Google and being forced to move data out of thousands of alumni accounts. A quick-fix solution to avoid Google’s new fee structure, this strategy is being undertaken without a clear answer to the question of how alumni can access their own data after the move. After all, how up to date are those alumni email lists?

A Google Alternative for Schools

School districts, colleges, and universities need to find a new, budget-friendly way forward. If you’re still struggling to find an alternative storage solution now that the bell has rung and Google has dismissed its free storage, Backblaze can help you find a new home on the cloud.

Backblaze B2 offers schools unlimited, pay-as-you-go storage at a fraction of the price of Google, enabling you to continue offering students and alumni the storage space they’ve come to expect. For colleges, universities, and school districts not buying through government purchasing programs, you can sign up for Backblaze B2 directly. We offer 10GB of storage free so that you can see if it works for you, but if you want to do a larger or customized proof of concept, reach out to our Sales team.

Accessing Backblaze Through Your Local, State, or Federal Buying Program

As we revealed during this year’s Educause conference, Backblaze has recently rolled out a partnership with Carahsoft aimed squarely at budget-conscious educational institutions. The partnership brings Backblaze services to educational institutions with a capacity-based pricing model that’s a fraction of the price of traditional cloud providers like Google. And it can be purchased through local, state, or federal buying programs. If you buy IT services for your district through a distributor, this solution could work for you. Visit the partnership announcement to learn more.

The post Education Unplugged: Google Ends Unlimited Storage for Schools appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Automate Your Digital Media Workflows with Backblaze and Telestream

Post Syndicated from Elton Carneiro original https://www.backblaze.com/blog/automate-your-digital-media-workflows-with-backblaze-and-telestream/

Streamlining your digital media workflow can make all the difference when it comes to your productivity—not to mention your budget. For folks in media, entertainment, post-production, corporate, education, government, or content creation, media workflows just got a little easier thanks to a partnership between Backblaze and Telestream.

Now joint customers can store transcoded media files on Backblaze B2 Cloud Storage as their origin store for delivery via Telestream’s Vantage CloudPort product. Read on to learn more about the partnership and what it means for you.

What Does Telestream Do?

Telestream, a Backblaze alliance partner, specializes in products that make it possible to get video content to any audience regardless of how it is created, distributed, or viewed. Throughout the entire digital media lifecycle, from capture to viewing, for consumers through high-end professionals, Telestream products range from desktop components and cross-platform applications to fully automated, enterprise-class digital media transcoding and workflow systems. Telestream enables users in a broad range of business environments to leverage the value of their video content.

How Does This Partnership Benefit Joint Customers?

Content is king, as they say, and being able to efficiently and effectively produce content and make it available to the audiences that are going to consume it is critical. This partnership benefits joint customers in a few key ways:

  • Customers can benefit from cost savings in the cloud: Backblaze is a fraction of the cost of diversified cloud providers.
  • By storing transcoded media files in the cloud, customers can leverage other services like QC in the cloud to ensure quality is up to their high standards.
  • Customers can leave on-premises storage in the past and move to the cloud to leverage the cloud’s infinite scalability and parallelism.
  • Making the move to the cloud also reduces the risk of having a single point of failure on premises.

“Telestream and Backblaze are driven by a shared mission to empower our customers and help them make their businesses more efficient. With this collaboration, we can meet our customers where their content is stored and apply Telestream’s best in class media processing tools.”
—Tim MacGregor, Senior Director, Head of Strategy and Product Development, Telestream Cloud

Getting Started With Backblaze B2 and Telestream

Ready to do more with your data affordably? Check out the Telestream documentation for connecting storage via the generic S3 protocol, and contact our Sales team today to get started.

The post Automate Your Digital Media Workflows with Backblaze and Telestream appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze Launches Comprehensive Partner Program

Post Syndicated from Elton Carneiro original https://www.backblaze.com/blog/backblaze-launches-comprehensive-partner-program/

Support from our partners is part of what makes Backblaze so easy to use for so many folks, and today we’re continuing our efforts to make working with us even easier with the launch of our new Partner Program.

For businesses, it cuts through the complexity and cost that may have stopped them from adopting cloud storage and backup. For partners—including resellers, integrators, managed service providers, and more—it boosts their array of cloud solutions and brings even more value to their clients. The program builds on our long commitment to develop new solutions for partners and help them grow their businesses.

Partner Program Offerings

The program provides new opportunities for four key partner groups: Channel Partners, Technology Partners, Managed Service Providers (MSPs), and Affiliates.

As part of this new program, Channel Partners can take advantage of special capacity-based pricing with B2 Reserve, as well as a self-service resource providing discounts, deal registration, and in-house support. It also offers training and education resources.

Technology Partners can enjoy complimentary solution expertise and joint go-to-market and co-branding opportunities. MSPs will notice the ease of the new admin console and the utility of in-house support, digital assets, training materials, and data sheets, not to mention the recurring 10% commissions on computer backup. And of course, Affiliates, too, can enjoy recurring 10% commissions.

With the launch of the program, Backblaze is doubling down on its commitment to its partners, proving why Backblaze has built its reputation on easy-to-use, affordable cloud storage.

“Ease of use and accessibility can have a significant impact for our partners and their business. We are continuously looking for ways to innovate and develop for our partners. Offering this easy, accessible, and efficient resource will strengthen our relationship with our customers.”
—Nilay Patel, Vice President of Sales, Backblaze

Visit our Partner page to learn more about the Partner Program, visit the Partner Portal, or get started as a new partner.

The post Backblaze Launches Comprehensive Partner Program appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Querying a Decade of Drive Stats Data

Post Syndicated from Pat Patterson original https://www.backblaze.com/blog/querying-a-decade-of-drive-stats-data/

Last week, we published Backblaze Drive Stats for Q3 2022, sharing the metrics we’ve gathered on our fleet of over 230,000 hard drives. In this blog post, I’ll explain how we’re now using the Trino open source SQL query engine to ensure the integrity of Drive Stats data, and how we plan to use Trino in the future to generate the Drive Stats result set for publication.

Converting Zipped CSV Files into Parquet

In his blog post Storing and Querying Analytical Data in Backblaze B2, my colleague Greg Hamer explained how we started using Trino to analyze Drive Stats data earlier this year. We quickly discovered that formatting the data set as Apache Parquet minimized the amount of data that Trino needed to download from Backblaze B2 Cloud Storage to process queries, resulting in a dramatic improvement in query performance over the original CSV-formatted data.

As Greg mentioned in the earlier post, Drive Stats data is published quarterly to Backblaze B2 as a single .zip file containing a CSV file for each day of the quarter. Each CSV file contains a record for each drive that was operational on that day (see this list of the fields in each record).

When Greg and I started working with the Parquet-formatted Drive Stats data, we took a simple, but somewhat inefficient, approach to converting the data from zipped CSV to Parquet:

  • Download the existing zip files to local storage.
  • Unzip them.
  • Run a Python script to read the CSV files and write Parquet-formatted data back to local storage.
  • Upload the Parquet files to Backblaze B2.

We were keen to automate this process, so we reworked the script to use the Python ZipFile module to read the zipped CSV data directly from its Backblaze B2 Bucket and write Parquet back to another bucket. We’ve shared the script in this GitHub gist.
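The core of that approach, reading CSV members straight out of a zip archive without unpacking them to disk, can be sketched with the Python standard library alone. This is not the script from the gist: the function name is hypothetical, the archive is taken from any path or seekable file-like object rather than from a Backblaze B2 Bucket, and the Parquet-writing step (which would need a library such as pyarrow) is left out.

```python
import csv
import io
import zipfile
from typing import Iterator


def iter_zipped_csv_rows(zip_source) -> Iterator[dict]:
    """Yield each CSV row as a dict from every .csv member of a zip archive.

    zip_source may be a filesystem path or a seekable binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for name in sorted(archive.namelist()):
            if not name.lower().endswith(".csv"):
                continue  # skip directories and non-CSV members
            with archive.open(name) as raw:
                # Decode the compressed member as text while streaming it.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                yield from csv.DictReader(text)
```

Because the function is a generator, rows are decompressed and parsed one at a time, so a whole quarter of daily files never has to sit in memory at once.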

After running the script, the drivestats table now contains data up until the end of Q3 2022:

trino:ds> SELECT DISTINCT year, month, day 
FROM drivestats ORDER BY year DESC, month DESC, day DESC LIMIT 1;
year | month | day 
 2022 |     9 |  30 
(1 row)

In the last article, we were working with data running until the end of Q1 2022. On March 31, 2022, the Drive Stats dataset comprised 296 million records, and there were 211,732 drives in operation. Let’s see what the current situation is:

trino:ds> SELECT COUNT(*) FROM drivestats;
(1 row) 

trino:ds> SELECT COUNT(*) FROM drivestats 
    WHERE year = 2022 AND month = 9 AND day = 30;
(1 row)

So, since the end of March, we’ve added 50 million rows to the dataset, and Backblaze is now spinning nearly 231,000 drives—over 19,000 more than at the end of March 2022. Put another way, we’ve added more than 100 drives per day to the Backblaze Cloud Storage Platform in the past six months. Finally, how many exabytes of raw data storage does Backblaze now manage?

trino:ds> SELECT ROUND(SUM(CAST(capacity_bytes AS bigint))/1e+18, 2)
FROM drivestats WHERE year = 2022 AND month = 9 AND day = 30;
(1 row)

Will we cross the three exabyte mark this year? Stay tuned to find out.
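The growth figures quoted in this section are easy to sanity-check. The Python snippet below is a rough sketch: the March count comes from the text, while the September count uses the rounded "nearly 231,000" figure, so the results are approximate.

```python
from datetime import date

drives_march = 211_732      # drives in operation on March 31, 2022 (from the text)
drives_september = 231_000  # "nearly 231,000" in the text; a rounded assumption

# Days between the two snapshots.
days = (date(2022, 9, 30) - date(2022, 3, 31)).days

added = drives_september - drives_march
per_day = added / days

print(f"{added} drives added over {days} days, about {per_day:.0f} per day")
```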

Ensuring the Integrity of Drive Stats Data

As Andy Klein, the Drive Stats supremo, collates each quarter’s data, he looks for instances of healthy drives being removed and then returned to service. This can happen for a variety of operational reasons, but it shows up in the data as the drive having failed, then later revived. This subset of data shows the phenomenon:

trino:ds> SELECT year, month, day, failure FROM drivestats WHERE 
serial_number = 'ZHZ4VLNV' AND year >= 2021 ORDER BY year, month, day;
 year | month | day | failure 
 2021 |    12 |  26 |       0 
 2021 |    12 |  27 |       0 
 2021 |    12 |  28 |       0 
 2021 |    12 |  29 |       1 
 2022 |     1 |   3 |       0 
 2022 |     1 |   4 |       0 
 2022 |     1 |   5 |       0 

This drive appears to have failed on Dec 29, 2021, but was returned to service on Jan 3, 2022.

Since these spurious “failures” would skew the reliability statistics, Andy searches for and removes them from each quarter’s data. However, even Andy can’t see into the future, so, when a drive is taken offline at the end of one quarter and then returned to service in the next quarter, as in the above case, there is a bit of a manual process to find anomalies and clean up past data.

With the entire dataset in a single location, we can now write a SQL query to find drives that were removed, then returned to service, no matter when it occurred. Let’s build that query up in stages.

We start by finding the serial numbers and failure dates for each drive failure:

trino:ds> SELECT serial_number, DATE(FORMAT('%04d-%02d-%02d', year, 
month, day)) AS date 
FROM drivestats 
WHERE failure = 1;
  serial_number  |    date    
 ZHZ3KMX4        | 2021-04-01 
 ZA12RBBM        | 2021-04-01 
 S300Z52X        | 2017-03-01 
 Z3051FWK        | 2017-03-01 
 Z304JQAE        | 2017-03-02 
(17092 rows)

Now we find the most recent record for each drive:

trino:ds> SELECT serial_number, MAX(DATE(FORMAT('%04d-%02d-%02d', 
year, month, day))) AS date
    FROM drivestats 
    GROUP BY serial_number;
  serial_number   |    date    
 ZHZ65F2W         | 2022-09-30 
 ZLW0GQ82         | 2022-09-30 
 ZLW0GQ86         | 2022-09-30 
 Z8A0A057F97G     | 2022-09-30 
 ZHZ62XAR         | 2022-09-30 
(329908 rows)

We then join the two result sets to find spurious failures; that is, failures where the drive was later returned to service. Note the join condition—we select records whose serial numbers match and where the most recent record is later than the failure:

trino:ds> SELECT f.serial_number, f.failure_date
FROM (
    SELECT serial_number, DATE(FORMAT('%04d-%02d-%02d', year, month, 
day)) AS failure_date
    FROM drivestats 
    WHERE failure = 1
) AS f
INNER JOIN (
    SELECT serial_number, MAX(DATE(FORMAT('%04d-%02d-%02d', year, 
month, day))) AS last_date
    FROM drivestats 
    GROUP BY serial_number
) AS l
ON f.serial_number = l.serial_number AND l.last_date > f.failure_date;
  serial_number  | failure_date 
 2003261ED34D    | 2022-06-09 
 W300STQ5        | 2022-06-11 
 ZHZ61JMQ        | 2022-06-17 
 ZHZ4VL2P        | 2022-06-21 
 WD-WX31A2464044 | 2015-06-23 
(864 rows)

As you can see, the current schema makes date comparisons a little awkward, pointing the way to optimizing the schema by adding a DATE-typed column to the existing year, month, and day. This kind of denormalization is common in analytical data.
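For readers who think more easily in code than in joins, the same logic can be sketched in plain Python. This is an illustration rather than part of the Drive Stats tooling: the function name and the tuple layout are hypothetical, and it assumes the records fit in memory.

```python
from datetime import date


def find_spurious_failures(records):
    """Return sorted (serial_number, failure_date) pairs for revived drives.

    records: iterable of (serial_number, year, month, day, failure) tuples.
    A failure is spurious when the drive has any record dated after it.
    """
    last_seen = {}
    failures = []
    for serial, year, month, day, failure in records:
        seen = date(year, month, day)
        # Track the most recent record per drive (the MAX(...) subquery).
        if serial not in last_seen or seen > last_seen[serial]:
            last_seen[serial] = seen
        # Collect every failure record (the WHERE failure = 1 subquery).
        if failure == 1:
            failures.append((serial, seen))
    # The join condition: most recent record is later than the failure.
    return sorted((s, d) for s, d in failures if last_seen[s] > d)
```

Building a real `date` from year, month, and day up front is what makes the comparison a simple `>`, which is the same convenience a DATE-typed column would give the SQL version.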

Calculating the Quarterly Failure Rates

In calculating failure rates per drive model for each quarter, Andy loads the quarter’s data into MySQL and defines a set of views. We additionally define the current_quarter view to restrict the failure rate calculation to data in July, August, and September 2022:

CREATE VIEW current_quarter AS 
    SELECT * FROM drivestats
    WHERE year = 2022 AND month in (7, 8, 9);

CREATE VIEW drive_days AS 
    SELECT model, COUNT(*) AS drive_days 
    FROM current_quarter
    GROUP BY model;

CREATE VIEW failures AS
    SELECT model, COUNT(*) AS failures
    FROM current_quarter
    WHERE failure = 1
    GROUP BY model
    UNION
    SELECT DISTINCT(model), 0 AS failures
    FROM current_quarter
    WHERE model NOT IN
    (
        SELECT model
        FROM current_quarter
        WHERE failure = 1
        GROUP BY model
    );

CREATE VIEW failure_rates AS
    SELECT drive_days.model AS model,
           drive_days.drive_days AS drive_days,
           failures.failures AS failures, 
           100.0 * (1.0 * failures) / (drive_days / 365.0) AS annual_failure_rate
    FROM drive_days, failures
    WHERE drive_days.model = failures.model;

Running the above statements in Trino, then querying the failure_rates view, yields a superset of the data that we published in the Q3 2022 Drive Stats report. The difference is that this result set includes drives that Andy excludes from the Drive Stats report: SSD boot drives, drives that were used for testing purposes, and drive models which did not have at least 60 drives in service:

trino:ds> SELECT * FROM failure_rates ORDER BY model;
        model         | drive_days | failures | annual_failure_rate 
 CT250MX500SSD1       |      32171 |        2 |                2.27 
 DELLBOSS VD          |      33706 |        0 |                0.00 
 HGST HDS5C4040ALE630 |       2389 |        0 |                0.00 
 HGST HDS724040ALE640 |         92 |        0 |                0.00 
 HGST HMS5C4040ALE640 |     341509 |        3 |                0.32 
 WDC WD60EFRX         |        276 |        0 |                0.00 
 WDC WDS250G2B0A      |       3867 |        0 |                0.00 
 WDC WUH721414ALE6L4  |     765990 |        5 |                0.24 
 WDC WUH721816ALE6L0  |     242954 |        0 |                0.00 
 WDC WUH721816ALE6L4  |     308630 |        6 |                0.71 
(74 rows)

Query 20221102_010612_00022_qscbi, FINISHED, 1 node
Splits: 139 total, 139 done (100.00%)
8.63 [82.4M rows, 5.29MB] [9.54M rows/s, 628KB/s]
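The annualized failure rate in the last column follows directly from the other two: failures divided by drive-years of service, expressed as a percentage. The Python sketch below recomputes it for a few rows of the result set; the function name is hypothetical, and the formula mirrors the expression in the failure_rates view.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """AFR as a percentage: failures per drive-year of service, times 100."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    return 100.0 * failures / (drive_days / 365.0)


# (model, drive_days, failures) copied from the result set above.
sample = [
    ("CT250MX500SSD1", 32171, 2),
    ("HGST HMS5C4040ALE640", 341509, 3),
    ("WDC WUH721414ALE6L4", 765990, 5),
    ("WDC WUH721816ALE6L4", 308630, 6),
]

for model, drive_days, failures in sample:
    afr = annualized_failure_rate(failures, drive_days)
    print(f"{model:22s} {afr:.2f}")
```

Small drive-day counts make the rate jumpy: two failures in roughly 88 drive-years already yield an AFR above 2%, which is one reason models with few drives are excluded from the published report.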

Optimizing the Drive Stats Production Process

Now that we have shown that we can derive the required statistics by querying the Parquet-formatted data with Trino, we can streamline the Drive Stats process. Starting with the Q4 2022 report, rather than wrangling each quarter’s data with a mixture of tools on his laptop, Andy will use Trino to both clean up the raw data and produce the Drive Stats result set for publication.

Accessing the Drive Stats Parquet Dataset

When Greg and I started experimenting with Trino, our starting point was Brian Olsen’s Trino Getting Started GitHub repository, in particular, the Hive connector over MinIO file storage tutorial. Since MinIO and Backblaze B2 both have S3-compatible APIs, it was easy to adapt the tutorial’s configuration to target the Drive Stats data in Backblaze B2, and Brian was kind enough to accept my contribution of a new tutorial showing how to use the Hive connector over Backblaze B2 Cloud Storage. This tutorial will get you started using Trino with data stored in Backblaze B2 Buckets, and includes a section on accessing the Drive Stats dataset.

You might be interested to know that Backblaze is sponsoring this year’s Trino Summit, taking place virtually and in person in San Francisco, on November 10. Registration is free; if you do attend, come say hi to Greg and me at the Backblaze booth and see Trino in action, querying data stored in Backblaze B2.

The post Querying a Decade of Drive Stats Data appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze Drive Stats for Q3 2022

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/backblaze-drive-stats-for-q3-2022/

As of the end of Q3 2022, Backblaze was monitoring 230,897 hard drives and SSDs in our data centers around the world. Of that number, 4,200 are boot drives, with 2,778 SSDs and 1,422 HDDs. The SSDs were previously covered in our recently published Midyear SSD Report. Today, we’ll focus on the 226,697 data drives under management as we review their quarterly and lifetime failure rates as of the end of Q3 2022.

We’ll also take a look at the relationship between hard drive failure rates and hard drive cost. Along the way, we’ll share our observations and insights on the data presented, and, as always, we look forward to you doing the same in the comments section at the end of the post.

Q3 2022 Hard Drive Failure Rates

Let’s start with reviewing our data for the Q3 2022 period. In that quarter, we tracked 226,697 hard drives used to store data. For our evaluation, we removed 388 drives from consideration because they were either used for testing purposes or belonged to drive models which did not have at least 60 drives. This leaves us with 226,309 hard drives grouped into 29 different models to analyze.

Notes and Observations on the Q3 2022 Stats

Zero failures for Q3: Three drive models had zero failures this quarter: the 8TB HGST (model: HUH728080ALE604), the 8TB Seagate (model: ST8000NM000A), and the 16TB WDC (model: WUH721816ALE6L0). For the 8TB HGST, that was the second quarter in a row with zero failures. Of the three, only the WDC model has enough lifetime data (drive days) to be comfortable with the calculated annualized failure rate (AFR). As we will see later in this review, this 16TB WDC model has a lifetime AFR of 0.11% with the confidence interval range of just 0.30 at a 95% confidence level.

The new disks in town: There are two new models in this quarter’s data: the 8TB Seagate (model: ST8000NM000A) and the 16TB Seagate (model: ST16000NM002J). Neither has enough data to be interesting yet, but as noted above, the 8TB Seagate had zero failures in its first quarter in operation. These additions give us 29 different models we are tracking, up from 27 in the previous quarter.

The 29 models break down by manufacturer as:

  • HGST: 7 models
  • Seagate: 13 models
  • Toshiba: 6 models
  • WDC: 3 models

The chart below shows, by manufacturer, how our drive fleet has changed over the past six years.

The old guard is feeling old: The three oldest drive models we currently use are all showing signs of their age: each experienced an increase in AFR from Q2 to Q3 2022, as shown below.

MFG      Model        Size  Q3 2022 Avg Age (months)  Q2 2022 AFR  Q3 2022 AFR
Seagate  ST4000DM000  4TB   83.1                      3.42%        4.38%
Seagate  ST6000DX000  6TB   89.6                      0.91%        1.34%
Toshiba  MD04ABA400V  4TB   88.3                      0.00%        8.25%

Note that the 4TB Toshiba only had two failures in Q3 2022. The high AFR (8.25%) is due to the limited number of drive days in the quarter (8,849) from only 95 drives. For all three, it seems their spindles, actuators, and media are starting to wear out after seven years or so of constant spinning.
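
To make the arithmetic behind that 8.25% concrete, here is a minimal Python sketch. It assumes AFR is computed as failures per drive-year, expressed as a percentage, which reproduces the figure quoted above from the two failures and 8,849 drive days. The function name is our own and is not part of any Backblaze tooling.

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """Return the AFR as a percentage: failures per drive-year of operation."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365  # convert drive days to drive-years
    return failures / drive_years * 100


# 4TB Toshiba in Q3 2022: two failures over 8,849 drive days
toshiba_q3 = annualized_failure_rate(failures=2, drive_days=8_849)
print(f"{toshiba_q3:.2f}%")  # 8.25%
```

With so few drive days in the denominator, a single additional failure would move the result by more than four percentage points, which is why small populations produce such volatile quarterly AFRs.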

The Quarterly AFR continues to rise: The AFR for Q3 2022 was 1.64%, increasing from 1.46% in Q2 2022 and from 1.10% a year ago. As noted previously, this is related to the aging of the entire drive fleet and we would expect this number to go down as older drives are retired and replaced over the next year. A possible harbinger of what is to come can be seen in the 16TB models, which as a group had a 0.80% AFR in Q3 2022. As these drives are used to replace the aging 4TB drives, the quarterly AFR should decrease.

Hard Drive Failure Versus Hard Drive Cost

One question that comes up is why we would continue to buy a drive model that has a higher annualized failure rate than a comparably sized, but more expensive, model. There are two primary reasons. First, we are able to do so because our Backblaze Vault cloud storage architecture is designed for drive failure. Second, by studying data such as Drive Stats, we work hard to understand our environment from the inside out, and the relationship between cost and drive failure is one of the things we have learned. Here’s a simple example using three fictitious models of 14TB drives: Model 1, Model 2, and Model 3.

Let’s take a look at the different sections (i.e. blue rows) of this table.

Drive Cost: Each model has a different price: low ($225), medium ($250), and high ($275). We would buy the same number of drives (5,000) of each model, which gives us the total purchase cost of each model.

Annual Drive Failures: This section starts with the AFR of each drive model. For this example, we assigned the highest failure rate to the lowest price model, the lowest failure rate to the highest price model, and so on. In practice, we would use our own AFR numbers for a given model that we are considering purchasing. Either way, applying the AFR to the number of drives gives us the annual number of failed drives for each model.

Annual Replacement Cost: Labor cost covers the human cost involved from identifying the failure to returning and replacing the drive. Drive cost is zero here as the assumption is that all drives are returned for credit or replacement to the manufacturer or their agent. A zero value here may not always be the case; hence the line item. In either case, the annual cost to replace the failed drives for each model is computed.

Lifetime Replacement Cost: Multiply the number of years you expect the drive model to be in service by the annual cost to replace the failed drives. Adding that to the drive cost gives us the total cost of each drive model (the peach section). In our example, the most expensive model (Model 3) is still the most expensive over the five-year life expectancy, and the lowest cost model (Model 1) is the least expensive over the same period, even with a higher annualized failure rate.

But we’re not done. The next question is: What would the annualized failure rate for the least expensive choice, Model 1, need to be such that the total cost after five years would be the same as Model 2 and then Model 3? In other words, how much failure can we tolerate before our original purchase decision is wrong? When we crunch the numbers we come out with the following:

  • Model 1 and Model 2 have the same total drive cost ($1,325,000) when the annualized failure rate for Model 1 is 2.67%.
  • Model 1 and Model 3 have the same total drive cost ($1,412,500) when the annualized failure rate for Model 1 is 3.83%.
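
The break-even figures above can be reproduced with a short Python sketch of the cost model. The prices, the 5,000 drives, the five-year life, and the two target totals come from the example. The $300 labor cost per failed drive is an assumption that is not stated in the text; we chose it because it reproduces both break-even rates. The function and constant names are hypothetical.

```python
def total_cost(price, drives, afr, labor_per_failure, years):
    """Purchase cost plus lifetime replacement cost for one drive model."""
    purchase = price * drives
    annual_failures = afr * drives  # afr is a fraction, e.g. 0.01 for 1%
    annual_replacement = annual_failures * labor_per_failure
    return purchase + years * annual_replacement


def break_even_afr(price, drives, target_total, labor_per_failure, years):
    """AFR (as a fraction) at which this model's total cost equals target_total."""
    purchase = price * drives
    return (target_total - purchase) / (years * drives * labor_per_failure)


DRIVES = 5_000  # drives purchased per model (from the example)
YEARS = 5       # expected service life (from the example)
LABOR = 300     # ASSUMPTION: labor cost per failed drive, not given in the text

# Model 1 costs $225 per drive; the targets are the stated Model 2 and Model 3 totals.
vs_model_2 = break_even_afr(225, DRIVES, 1_325_000, LABOR, YEARS)
vs_model_3 = break_even_afr(225, DRIVES, 1_412_500, LABOR, YEARS)
print(f"{vs_model_2:.2%}, {vs_model_3:.2%}")  # 2.67%, 3.83%
```

Because the drive cost on replacement is treated as zero in this example, only labor enters the replacement term; a non-zero replacement drive cost would simply be added to LABOR.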

The model presented is a simplified version of how we think about drive purchase decisions using annualized drive failure rates as part of the equation. You can make this model more accurate, and complicated, by adding in the drive failure rate changes over time (the bathtub curve) and prorating the cost of returning failed drives over the years. Whether that is needed is up to you.

A model like this matters in our business if you are interested in optimizing the efficiency of your cloud storage platform. Without one, robotically buying the most expensive, or least expensive, drives turns a blind eye to the expense side of the ledger.

On an individual or small office/home office level, your drive purchasing decision requires a lot less math, and often comes down to what drive you can afford. Even so, you should still try to do some research. Our drive stats can help, but in all cases you should have a solid backup plan in place, as no drive you can buy is failure-proof.

Lifetime Hard Drive Failure Rates

As of September 30, 2022, Backblaze was monitoring 226,697 hard drives used to store data. For our evaluation, we removed 388 drives from consideration because they were either used for testing purposes or belonged to drive models which did not have at least 60 drives. This leaves us with 226,309 hard drives grouped into 29 different models to analyze for the lifetime report.

Notes and Observations About the Lifetime Stats

The lifetime annualized failure rate for all the drives listed above is 1.41%. That is a slight increase from the previous quarter of 1.39%, but lower than one year ago (Q3 2021) which was 1.45%.

The usual caution should be applied to those drive models that have wide confidence intervals, one percent or greater. Such a gap indicates there is not enough data or that the data we do have is not readily predictable.

That said, we do have plenty of drive models for which we have solid data. Below we’ve extracted the 12TB, 14TB, and 16TB models from the lifetime table above that have a Lifetime AFR of less than 1% and have a confidence interval of 0.5% or less. These are hard drives which, up to this point, have shown solid reliability in our environment.

The Hard Drive Stats Data

The complete data set used to create the information in this review is available on our Hard Drive Test Data page. You can download and use this data for free for your own purpose. All we ask are three things: 1) you cite Backblaze as the source if you use the data, 2) you accept that you are solely responsible for how you use the data, and 3) you do not sell this data to anyone; it is free.

If you want the tables and charts used in this report, you can download the .zip file from Backblaze B2 Cloud Storage which contains the .jpg and/or .xlsx files as applicable.

Good luck, and let us know if you find anything interesting.

The post Backblaze Drive Stats for Q3 2022 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.