
Exabyte Unlocked

Post Syndicated from Ahin Thomas original https://www.backblaze.com/blog/exabyte-unlocked/

Backblaze Reaches an Exabyte of Customer Data Stored

If you’re interested in Backblaze’s COVID-19 response, Gleb Budman, our CEO, shared a message with our community here.

With the impact of coronavirus on all of our lives, it’s been a struggle to find pieces of good news to share. But we wanted to take a break from the usual programming and share a milestone we’re excited about, one that’s more than 12 years in the making.

Since the beginning of Backblaze—back in 2007, when our five co-founders were working out of Brian Wilson’s apartment in Palo Alto—watching the business grow has always been profoundly exciting.

Our team has grown. From five, way back in the Palo Alto days, to 145 today. Our customer base has grown. Today, we have customers in over 160 countries… it’s not so long ago that we were excited about having our 160th customer.

More than anything else, the data we manage for our customers has grown.

In 2008, not long after our launch, we had 750 customers and thought ten terabytes was a lot of data. But things progressed quickly, and just two years later we reached 10 petabytes of customer data stored (1,000x more). (Good thing we designed for zettabyte-scale cloud architecture!)

By 2014, we were storing 100 petabytes—the equivalent of 11,415 years of HD video.

Years passed, our team grew, the number of customers grew, and—especially after we launched B2 Cloud Storage in 2015—the data grew. At some scale it got harder to contextualize what hundreds and hundreds of petabytes really meant. We like to remember that each byte is part of some individual’s beloved family photos or some organization’s critical data that they’ve entrusted us to protect.

That belief is part of every single Backblaze job description. Here’s how we put it in that context:

“Our customers use our services so they can pursue dreams like curing cancer (genome mapping is data-intensive), archive the work of some of the greatest artists on the planet (learn more about how Austin City Limits uses B2), or simply sleep well at night (anyone who’s spilled a cup of coffee on a laptop knows the relief that comes with complete, secure backups).”

It’s critically important for us that we achieved this growth by staying the same in the most important ways: being open & transparent, building a sustainable business, and caring about being good to our customers, partners, community, and team. That’s why I’m excited to announce a huge milestone today—our biggest growth number yet.

We’ve reached 1.

Or, by another measurement, we’ve reached 1,000,000,000,000,000,000.

Yes, today, we’re announcing that we are storing 1 exabyte of customer data.

What does it all mean? Well. If you ask our engineers, not much. They’ve already rocketed past this number mentally and are considering how long it will take to get to a zettabyte (1,000,000,000,000,000,000,000 bytes).

But, while it’s great to keep our eyes on the future, it’s also important to celebrate what milestones mean. Yes, crossing an exabyte of data is another validation of our technology and our sustainably independent business model. But I think it really means that we’re providing value and earning the trust of our customers.

Thank you for putting your trust in us by keeping some of your bytes with us. Particularly in times like these, we know that being able to count on your infrastructure is essential. We’re proud to serve you.

As the world grapples with a pandemic, celebrations seem inappropriate. But we did want to take a moment and share this milestone with you, both for those of you who have been with us over the long haul and in the hopes that it provides a welcome distraction. To that end, we’ve been working on a few things that we’d planned to launch in the coming weeks. We’ve made the decision to push forward with those launches in hopes that the tools may be of some use for you (and, if nothing else, to try to do our part to provide a little entertainment). For today, here’s to our biggest 1 yet. And many more to come.

Interested in learning more about how we got here? Check out the recent profile of Backblaze in Inc. magazine, free to our blog readers.


Defining an Exabyte

Post Syndicated from Patrick Thomas original https://www.backblaze.com/blog/what-is-an-exabyte/

Defining an Exabyte

What is an Exabyte?

An exabyte is made up of bytes, which themselves are units of digital storage. A byte is made up of 8 bits. A bit—short for “binary digit”—is a single unit of data. Namely a 1, or a 0.

The International System of Units (SI) defines “exa” as multiplication by the sixth power of 1,000, or 10^18.

In other words, 1 exabyte (EB) = 10^18 bytes = 1,000^6 bytes = 1,000,000,000,000,000,000 bytes = 1,000 petabytes = 1 million terabytes = 1 billion gigabytes. Overwhelmed by numbers yet?
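
If it helps to see that arithmetic spelled out, here is a tiny Python sketch of the same conversions; nothing in it is Backblaze-specific, it is just the SI definitions above:

```python
# SI ("decimal") units: each step up is a factor of 1,000
KB, MB, GB = 1_000, 1_000**2, 1_000**3
TB, PB, EB = 1_000**4, 1_000**5, 1_000**6

assert EB == 10**18      # 1 exabyte = 10^18 bytes
print(EB // PB)          # 1,000 petabytes
print(EB // TB)          # 1,000,000 terabytes (1 million)
print(EB // GB)          # 1,000,000,000 gigabytes (1 billion)
```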

Why don’t we give you some examples of what these numbers actually look like? We created this infographic to help put it in perspective.

How Big is an Exabyte?



The Road to an Exabyte of Cloud Storage

So now that you know what an exabyte looks like, let’s look at how Backblaze got there.

Way back in 2010, we had 10 petabytes of customer data under management. It was a big deal for us: it took two years to accomplish and, more importantly, it was a sign that thousands of customers trusted us with their data.

It meant a lot! But when we decided to tell the world about it, we had a hard time quantifying just how big 10 petabytes were, so naturally we made an infographic.

10 Petabytes Visualized

That’s a lot of hard drives. A Burj Khalifa of drives, in fact.

In what felt like the blink of an eye, it was two years later, and we had 75 petabytes of data. The Burj was out. And, because it was 2013, we quantified that amount of data like this…

At 3MB per song, Backblaze would store 25 billion songs.

Pop songs now average around 3:30 in length, which means if you tried to listen to this imaginary musical archive, it would take you 167,000 years. And sadly, the total number of recorded songs is only in the tens to hundreds of millions, so you’d have some repeats.

That’s a lot of songs! More importantly, our data under management had grown 7.5 times over. But we could barely take time to enjoy it, because five months later we hit 100 petabytes, and we had to call it out. Stacking up to the Burj Khalifa was in the past! Now, we rivaled Mt. Shasta…

Stacked on end they would be 9,941 feet, about the same height as Mt. Shasta from the base.

But stacking drives was rapidly becoming less effective as a measurement. Simply put, the comparison was no longer apples to apples: the 3,000 drives we stacked up in 2010 only held one terabyte of data each. If you were to take those same 3,000 drives and use the average drive size we had in 2013, about 4 terabytes per drive, the height of the stack would stay the same, because hard drives hadn’t physically grown, but the storage density inside them had quadrupled.

Regardless, the years went by, we launched an award-winning cloud storage service (Backblaze B2), and the incoming petabytes kept accelerating—150 petabytes in early 2015, 200 before we reached 2016. Around then, we decided to wait for the next big milestone, and in February 2018, we hit 500 petabytes.

It took us two years to store 10 petabytes of data.

Over the next 7 years, by 2018, we stored another 500 petabytes.

And today, we reset the clock, because in the last two years, we’ve added another 500 petabytes. Which means we’re turning the clock back to 1…

1 exabyte.

Today, across 125,000 hard drives, Backblaze is managing an exabyte of customer data.

And what does that mean? Well, you should ask Ahin.


Backblaze Hard Drive Stats for 2019

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/hard-drive-stats-for-2019/

Hard Drive Stats Report 2019

Hard Drive Stats for 2019

As of December 31, 2019, Backblaze had 124,956 spinning hard drives. Of that number, there were 2,229 boot drives and 122,658 data drives. This review looks at the hard drive failure rates for the data drive models in operation in our data centers. In addition, we’ll check in on how our 12 and 14 TB drives are doing and take a first look at the new 16 TB drives we started using in Q4. Along the way we’ll share observations and insights on the data presented, and we look forward to you doing the same in the comments.

2019 Hard Drive Failure Rates

At the end of 2019 Backblaze was monitoring 122,658 hard drives used to store data. For our evaluation we remove from consideration those drives that were used for testing purposes and those drive models for which we did not have at least 5,000 drive days during Q4 (see notes and observations for why). This leaves us with 122,507 hard drives. The table below covers what happened in 2019.

2019 Annualized Hard Drive Failure Rates by make and manufacturer

Notes and Observations

There were 151 drives (122,658 minus 122,507) that were not included in the list above. These drives were either used for testing or did not have at least 5,000 drive days during Q4 of 2019. The 5,000 drive-day limit removes those drive models where we only have a limited number of drives working a limited number of days during the period of observation. NOTE: The data for all drives, data drives, boot drives, etc., is available for download on the Hard Drive Test Data webpage.

The only drive model not to have a failure during 2019 was the 4 TB Toshiba, model: MD04ABA400V. That’s very good, but the data sample is still somewhat small. For example, if there had been just 1 (one) drive failure during the year, the Annualized Failure Rate (AFR) for that Toshiba model would be 0.92%—still excellent, not 0%.
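
For reference, the AFR figures in these reports come from drive days and drive failures. Here is a minimal sketch of that calculation; the ~39,700 drive days in the example is inferred from the 0.92% figure above rather than taken from the data tables:

```python
def annualized_failure_rate(failures: int, drive_days: int) -> float:
    """AFR (%) = failures / (drive days / 365) * 100."""
    return failures / (drive_days / 365) * 100

# Hypothetical: one failure over ~39,700 drive days (a figure inferred from the
# 0.92% quoted above, not taken from the raw data) works out to roughly 0.92%.
print(round(annualized_failure_rate(1, 39_700), 2))  # 0.92
```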

The Toshiba 14 TB drive, model MG07ACA14TA, is performing very well at a 0.65% AFR, similar to the rates put up by the HGST drives. For their part, the Seagate 6 TB and 10 TB drives continue to be solid performers with annualized failure rates of 0.96% and 1.00% respectively.

The AFR for 2019 across all drive models was 1.89%, which is much higher than in 2018. We’ll discuss that later in this review.

Beyond the 2019 Chart—“Hidden” Drive Models

There are a handful of drive models that didn’t make it to the 2019 chart because they hadn’t recorded enough drive-days in operation. We wanted to take a few minutes to shed some light on these drive models and where they are going in our environment.

Seagate 16 TB Drives

In Q4 2019 we started qualifying Seagate 16 TB drives, model: ST16000NM001G. As of the end of Q4 we had 40 (forty) drives in operation, with a total of 1,440 drive days—well below our 5,000 drive day threshold for Q4, so they didn’t make the 2019 chart. There have been 0 (zero) failures through Q4, making the AFR 0%, a good start for any drive. Assuming they continue to pass our drive qualification process, they will be used in the 12 TB migration project and to add capacity as needed in 2020.

Toshiba 8 TB Drives

In Q4 2019 there were 20 (twenty) Toshiba 8 TB drives, model: HDWF180. These drives have been installed for nearly two years. In Q4, they only had 1,840 drive days, below the reporting threshold, but lifetime they do have 13,994 drive days with only 1 drive failure, giving us an AFR of 2.6%. We like these drives, but by the time they were available to us in quantity, we could buy 12 TB drives at the same cost per TB. More density, same price. Given we are moving to 16 TB drives and beyond, we most likely will not be buying any of these drives in the future.

HGST 10 TB Drives

There are 20 (twenty) HGST 10 TB drives, model: HUH721010ALE600, in operation. These drives have been in service a little over one year. They reside in the same Backblaze Vault as the Seagate 10 TB drives. The HGST drives recorded only 1,840 drive days in Q4 and a total of 8,042 since being installed. There have been 0 (zero) failures. As with the Toshiba 8 TB, purchasing more of these 10 TB drives is unlikely.

Toshiba 16 TB Drives

You won’t find these in the Q4 stats, but in Q1 2020 we added 20 (twenty) Toshiba 16 TB drives, model: MG08ACA16TA. They have logged a total of 100 drive days, so it is way too early to say anything other than more to come in the Q1 2020 report.

Comparing Hard Drive Stats for 2017, 2018, and 2019

The chart below compares the Annualized Failure Rates (AFR) for each of the last three years. The data for each year is inclusive of that year only and for the drive models present at the end of each year.

Annualized Hard Drive Failure Rates by Year - 2017-2019

The Rising AFR in 2019

The overall AFR rose significantly in 2019. About 75% of the drive models experienced a rise in AFR from 2018 to 2019. There are two primary drivers behind this rise. First, the 8 TB drives as a group seem to be having a mid-life crisis as they get older, with each model exhibiting its highest failure rate recorded to date. While none of these rates is cause for worry, the 8 TB drives contribute roughly one fourth (1/4) of the total drive days, so any rise in their failure rate affects the total. The second factor is the Seagate 12 TB drives; this issue is being aggressively addressed by the 12 TB migration project we reported on previously.

The Migration Slows, but Growth Doesn’t

In 2019 we added 17,729 net new drives. In 2018, a majority of the 14,255 drives added were due to migration; in 2019, less than half of the new drives were for migration, with the rest being used for new systems. In 2019 we decommissioned 8,800 drives totaling 37 petabytes of storage and replaced them with 8,800 drives, all 12 TB, totaling about 105 petabytes of storage. We then added an additional 181 petabytes of storage in 2019 using 12 TB and 14 TB drives.

Drive Diversity

Manufacturer diversity increased slightly in 2019. In 2018, Seagate drives were 78.15% of the drives in operation; by the end of 2019 that percentage had decreased to 73.28%. HGST went from 20.77% in 2018 to 23.69% in 2019, and Toshiba increased from 1.34% in 2018 to 3.03% in 2019. There were no Western Digital branded drives in the data center in 2019, but as WDC rebrands the newer large-capacity HGST drives, we’ll adjust our numbers accordingly.

Lifetime Hard Drive Stats

While comparing the annual failure rates of hard drives over multiple years is a great way to spot trends, we also look at the lifetime annualized failure rates of our hard drives. The chart below shows the annualized failure rates of all of the drive models in production as of 12/31/2019.

Annualized Hard Drive Failure Rates for Active Drives - 4/20/2013 - 12/31/2019

The Hard Drive Stats Data

The complete data set used to create the information used in this review is available on our Hard Drive Test Data page. You can download and use this data for free for your own purposes. All we ask are three things: 1) you cite Backblaze as the source if you use the data, 2) you accept that you are solely responsible for how you use the data, and 3) you do not sell this data to anyone; it is free.

If you just want the summarized data used to create the tables and charts in this blog post, you can download the ZIP file containing the CSV files for each chart.

Good luck and let us know if you find anything interesting.


Backblaze Hard Drive Stats Q3 2019

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/backblaze-hard-drive-stats-q3-2019/

Backblaze Drive Stats Q3 2019

As of September 30, 2019, Backblaze had 115,151 spinning hard drives spread across four data centers on two continents. Of that number, there were 2,098 boot drives and 113,053 data drives. We’ll look at the lifetime hard drive failure rates of the data drive models currently in operation in our data centers, but first we’ll cover the events that occurred in Q3 that potentially affected the drive stats for that period. As always, we’ll publish the data we use in these reports on our Hard Drive Test Data web page and we look forward to your comments.

Hard Drive Stats for Q3 2019

At this point in prior hard drive stats reports we would reveal the quarterly hard drive stats table. This time we are only going to present the Lifetime Hard Drive Failure table, which you can see if you jump to the end of this report. For the Q3 table, the data which we typically use to create that report may have been indirectly affected by one of our utility programs which performs data integrity checks. While we don’t believe the long-term data is impacted, we felt you should know. Below, we will dig into the particulars in an attempt to explain what happened in Q3 and what we think it all means.

What is a Drive Failure?

Over the years we have stated that a drive failure occurs when a drive stops spinning, won’t stay as a member of a RAID array, or demonstrates continuous degradation over time as informed by SMART stats and other system checks. For example, a drive that reports a rapidly increasing or egregious number of media read errors is a candidate for being replaced as a failed drive. These types of errors are usually seen in the SMART stats we record as non-zero values for SMART 197 and 198 which log the discovery and correctability of bad disk sectors, typically due to media errors. We monitor other SMART stats as well, but these two are the most relevant to this discussion.

What might not be obvious is that changes in some SMART attributes only occur when specific actions occur. Using SMART 197 and 198 as examples again, these values are only affected when a read or write operation occurs on a disk sector whose media is damaged or otherwise won’t allow the operation. In short, SMART stats 197 and 198 that have a value of zero today will not change unless a bad sector is encountered during normal disk operations. These two SMART stats don’t cause read and writes to occur, they only log aberrant behavior from those operations.
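
To make the role of these attributes concrete, here is a small, purely hypothetical sketch of how a monitoring pass might flag a failure candidate from non-zero or fast-growing SMART 197/198 raw values. The threshold and field names are illustrative assumptions, not our actual rules:

```python
# Hypothetical failure-candidate check based on SMART 197 (Current Pending Sector
# Count) and SMART 198 (Uncorrectable Sector Count). The threshold, field names,
# and record layout are illustrative stand-ins, not Backblaze's actual rules.
GROWTH_THRESHOLD = 10  # what counts as "rapidly increasing" is a judgment call

def failure_candidate(today: dict, yesterday: dict) -> bool:
    for attr in ("smart_197_raw", "smart_198_raw"):
        current, previous = today.get(attr, 0), yesterday.get(attr, 0)
        if current > 0 and (current - previous) >= GROWTH_THRESHOLD:
            return True  # egregious or fast-growing media errors
    return False

print(failure_candidate({"smart_197_raw": 24, "smart_198_raw": 0},
                        {"smart_197_raw": 2, "smart_198_raw": 0}))  # True
```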

Protecting Stored Data

When a file, or group of files, arrives at a Backblaze data center, the file is divided into pieces we call shards. For more information on how shards are created and used in the Backblaze architecture, please refer to Backblaze Vault and Backblaze Erasure Coding blog posts. For simplicity’s sake, let’s say a shard is a blob of data that resides on a disk in our system.

As each shard is stored on a hard drive, we create and store a one-way hash of the contents. For reasons ranging from media damage to bit rot to gamma rays, we check the integrity of these shards regularly by recomputing the hash and comparing it to the stored value. To recompute the shard hash value, a utility known as a shard integrity check reads the data in the shard. If there is an inconsistency between the newly computed and the stored hash values, we rebuild the shard using the other shards as described in the Backblaze Vault blog post.
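
In spirit, the check is simply “recompute the hash of the bytes on disk and compare it to the hash stored at write time.” Here is a minimal sketch of that idea; the hash algorithm and the one-file-per-shard layout are illustrative assumptions rather than a description of our production code:

```python
import hashlib
from pathlib import Path

def shard_integrity_ok(shard_path: Path, stored_hash: str) -> bool:
    """Recompute a shard's hash and compare it to the value stored at write time.

    SHA-1 and a one-file-per-shard layout are illustrative assumptions here."""
    hasher = hashlib.sha1()
    with shard_path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):  # stream in 1 MB chunks
            hasher.update(chunk)
    return hasher.hexdigest() == stored_hash

# A False result is the trigger to rebuild the shard from the others in its tome.
```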

Shard Integrity Checks

The shard integrity check utility runs as a utility task on each Storage Pod. In late June, we decided to increase the rate of the shard integrity checks across the data farm to cause the checks to run as often as possible on a given drive while still maintaining the drive’s performance. We increased the frequency of the shard integrity checks to account for the growing number of larger-capacity drives that had been deployed recently.

The Consequences for Drive Stats

Once we write data to a disk, that section of disk remains untouched until the data is read by the user, the data is read by the shard integrity check process to recompute the hash, or the data is deleted and written over. As a consequence, there are no updates regarding that section of disk sent to SMART stats until one of those three actions occur. By speeding up the frequency of the shard integrity checks on a disk, the disk is read more often. Errors discovered during the read operation of the shard integrity check utility are captured by the appropriate SMART attributes. Putting together the pieces, a problem that would have been discovered in the future—under our previous shard integrity check cadence—would now be captured by the SMART stats when the process reads that section of disk today.

By increasing the shard integrity check rate, we potentially moved failures that were going to be found in the future into Q3. While discovering potential problems earlier is a good thing, it is possible that the hard drive failures recorded in Q3 could then be artificially high as future failures were dragged forward into the quarter. Given that our Annualized Failure Rate calculation is based on Drive Days and Drive Failures, potentially moving up some number of failures into Q3 could cause an artificial spike in the Q3 Annualized Failure Rates. This is what we will be monitoring over the coming quarters.

There are a couple of things to note as we consider the effect of the accelerated shard integrity checks on the Q3 data for Drive Stats:

  • The number of drive failures over the lifetime of a given drive model should not increase. At best we just moved the failures around a bit.
  • It is possible that the shard integrity checks did nothing to increase the number of drive failures that occurred in Q3. The quarterly failure rates didn’t vary wildly from previous quarters, but we didn’t feel comfortable publishing them at this time given the discussion above.

Lifetime Hard Drive Stats through Q3 2019

Below are the lifetime failure rates for all of our drive models in service as of September 30, 2019.
Backblaze Lifetime Hard Drive Annualized Failure Rates
The lifetime failure rate for the drive models in production rose slightly, from 1.70% at the end of Q2 to 1.73% at the end of Q3. This trivial increase would seem to indicate that the effect of the potential Q3 data issue noted above is minimal and well within normal variation. However, we’re not yet satisfied that this is true, and we have a plan for making sure, as we’ll see in the next section.

What’s Next for Drive Stats?

We will continue to publish our Hard Drive Stats each quarter, and next quarter we expect to include the quarterly (Q4) chart as well. For the foreseeable future, we will have a little extra work to do internally as we will be tracking two different groups of drives. One group will be the drives that “went through the wormhole,” so to speak, as they were present during the accelerated shard integrity checks. The other group will be those drives that were placed into production after the shard integrity check setting was reduced. We’ll compare these two datasets to see if there was indeed any effect of the increased shard integrity checks on the Q3 hard drive failure rates. We’ll let you know what we find in subsequent drive stats reports.

The Hard Drive Stats Data

The complete data set used to create the information in this review is available on our Hard Drive Test Data web page. You can download and use this data for free for your own purposes. All we ask are three things: 1) you cite Backblaze as the source if you use the data, 2) you accept that you are solely responsible for how you use the data, and 3) you do not sell this data to anyone; it is free. Good luck and let us know what you find.

As always, we look forward to your thoughts and questions in the comments.


SMART Stats Exposed — a Drive Stats Remix

Post Syndicated from Patrick Thomas original https://www.backblaze.com/blog/smart-stats-exposed-a-drive-stats-remix/

SMART Stats On Trial

Editor’s Note:  Since 2013, Backblaze has published statistics and insights based on the hard drives in our data centers. Why? Well, we like to be helpful, and we thought sharing would help others who rely on hard drives but don’t have reliable data on performance to make informed purchasing decisions. We also hoped the data might aid manufacturers in improving their products. Given the millions of people who’ve read our Hard Drive Stats posts and the increasingly collaborative relationships we have with manufacturers, it seems we might have been right.

But we don’t only share our take on the numbers, we also provide the raw data underlying our reports so that anyone who wants to can reproduce them or draw their own conclusions, and many have. We love it when people reframe our reports, question our logic (maybe even our sanity?), and provide their own take on what we should do next. That’s why we’re featuring Ryan Smith today.

Ryan has held a lot of different roles in tech, but lately he’s been dwelling in the world of storage as a product strategist for Hitachi. On a personal level, he explains that he has a “passion for data, finding insights from data, and helping others see how easy and rewarding it can be to look under the covers.” It shows.

A few months ago we happened on a post by Ryan with an appealing header featuring our logo with an EXPOSED stamp superimposed in red over our humble name. It looked like we had been caught in a sting operation. As a company that loves transparency, we were delighted. Reading on we found a lot to love and plenty to argue over, but more than anything, we appreciated how Ryan took data we use to analyze hard drive failure rates and extrapolated out all sorts of other gleanings about our business. As he puts it, “it’s not the value at the surface but the story that can be told by tying data together.” So, we thought we’d share his original post with you to (hopefully) incite some more arguments and some more tying together of data.

While we think his conclusions are reasonable based on the data available to him, the views and analysis below are entirely Ryan’s. We appreciate how he flagged some areas of uncertainty, but thought it most interesting to share his thoughts without rebuttal. If you’re curious about how he reached them, you can find his notes on process here. He doesn’t have the full story, but we think he did amazing work with the public data.

Our 2019 Q3 Hard Drive Stats post will be out in a few weeks, and we hope some of you will take Ryan’s lead and do your own deep dive into the reporting when it’s public. For those of you who can’t wait, we’re hoping this will tide you over for a little while.

If you’re interested in taking a look at the data yourselves, here’s our Hard Drive Data and Stats webpage that has links to all our past Hard Drive Stats posts and zip files of the raw data.


Ryan Smith Uses Backblaze’s SMART Stats to Illustrate the Power of Data

(Originally published July 8, 2019 on Soothsawyer.com.)

It is now common practice for end-customers to share telemetry (call home) data with their vendors. My analysis below shares some insights about your business that vendors might gain from seemingly innocent data that you are sending them every day.

On a daily basis, Backblaze (a cloud backup and storage provider) logs all its drive health data (aka SMART data) for over 100,000 of its hard drives. With 100K+ records a day, each year can produce over 30 million records. They share this raw data on their website, but most people probably don’t really dig into it much. I decided to see what this data could tell me and what I found was fascinating.

Rather than looking at nearly 100 million records, I decided to look at just over one million, consisting of the last day of every quarter from Q1’16 to Q1’19. This would give me enough granularity to see what is happening inside Backblaze’s cloud backup and storage business. For those interested, I used MySQL to import and transform the data into something easy to work with (see more details on my SQL query); I then imported the data into Excel where I could easily pivot the data and look for insights. Below are the results of this effort.
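
If you would rather do the quarter-end sampling in Python than SQL, a minimal sketch against the daily CSVs Backblaze publishes might look like this (the file path pattern is a placeholder for wherever you unpack the downloads):

```python
import pandas as pd

# Quarter-end snapshot dates from Q1'16 through Q1'19 (13 snapshots).
snapshot_dates = pd.date_range("2016-03-31", "2019-03-31", freq="Q")

frames = []
for day in snapshot_dates:
    # Backblaze publishes one CSV per day inside quarterly zip files; the path
    # pattern below is a placeholder for wherever you unpack them.
    df = pd.read_csv(f"data/{day:%Y-%m-%d}.csv")
    df["snapshot"] = day
    frames.append(df)

snapshots = pd.concat(frames, ignore_index=True)
print(len(snapshots))  # just over a million rows: 13 snapshots of ~70-110K drives
```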

Capacity Growth

User Data vs Physical Capacity

User Data Stored vs Physical Capacity

I grabbed the publicly posted “Petabytes stored” figure that Backblaze claims on their website (“User Petabytes”) and compared it to the total capacity from the SMART data they log (“Physical Petabytes”) to see how much overhead or unused capacity they have. The Theoretical Max (green line) is based on the ECC protection scheme (13+2 and/or 17+3) that they use to protect user data. If the “% User Petabytes” is below that max, then Backblaze either has unused capacity or they didn’t update their website with the actual data stored.
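
The Theoretical Max line is just the data-shard fraction of the erasure coding scheme; a quick sketch of that arithmetic:

```python
def usable_fraction(data_shards: int, parity_shards: int) -> float:
    """Fraction of raw capacity available for user data under an N+M scheme."""
    return data_shards / (data_shards + parity_shards)

print(round(usable_fraction(13, 2), 3))  # 13+2 -> ~0.867 of raw capacity
print(round(usable_fraction(17, 3), 3))  # 17+3 -> 0.85 of raw capacity
```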

Data Read/Written vs Capacity Growth

Reads/Writes versus Capacity Growth (Year-over-Year)

Looking at the last two years, by quarter, you can see a healthy amount of year-over-year growth in their write workload; roughly 80% over the last four quarters! This is good since writes likely correlate with new user data, which means broader adoption of their offering. For some reason their read workloads spiked in Q2’17 and have maintained a higher read workload since then (as indicated by the YoY spikes from Q2’17 to Q1’18, and then settling back to less than 50% YoY since); my guess is this was likely driven by a change to their internal workload rather than a migration because I didn’t see subsequent negative YoY reads.

Performance

Now let’s look at some performance insights. A quick note: Only Seagate hard drives track the needed information in their SMART data to get insights about performance. Fortunately, roughly 80% of Backblaze’s drive population (both capacity and units) is Seagate, so it’s a large enough sample to represent the overall drive population. Going forward, it does look like the new 12 TB WD HGST drive is starting to track bytes read/written.
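
The throughput figures that follow come from those lifetime counters (on Seagate drives, SMART 241 and 242 report total LBAs written and read). Here is a sketch of turning the growth of such a counter between snapshots into an average throughput; the 512-byte logical block size is an assumption worth verifying per model:

```python
SECONDS_PER_QUARTER = 91 * 24 * 3600  # roughly one quarter between snapshots
LBA_BYTES = 512                        # assumed logical block size; verify per model

def avg_throughput_mbps(lbas_start: int, lbas_end: int,
                        seconds: int = SECONDS_PER_QUARTER) -> float:
    """Average MB/s implied by the growth of a lifetime LBA counter
    (SMART 241 = total LBAs written, SMART 242 = total LBAs read on Seagate)."""
    return (lbas_end - lbas_start) * LBA_BYTES / seconds / 1_000_000

# A drive whose read counter grew by ~15.7 billion LBAs over a quarter averaged
# roughly 1 MB/s of reads.
print(round(avg_throughput_mbps(0, 15_700_000_000), 2))
```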

Pod (Storage Enclosure) Performance

Pod (Hard Drive Enclosure) Performance

Looking at the power-on hours of each drive, I was able to calculate the vintage of each drive and the number of drives in each “pod” (the term Backblaze uses for its storage enclosures). This lets me calculate the number of pods that Backblaze has in its data centers. Their original pods stored 45 drives, and this improved to 60 drives in ~Q2’16 (according to past blog posts by Backblaze). The power-on date allowed me to place each drive into the appropriate enclosure type and provide pod statistics like the Mbps per pod. This is definitely an educated guess, as some newer vintage drives are replacements in older enclosures, but the overall percentage of drives that fail is low enough that these figures should be pretty accurate.
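
The vintage calculation itself is simple: subtract a drive’s power-on hours (SMART 9) from the snapshot date. A minimal sketch, which assumes drives have been powered on essentially continuously since deployment:

```python
from datetime import datetime, timedelta

def deployment_date(snapshot: datetime, power_on_hours: int) -> datetime:
    """Estimate when a drive entered service from SMART 9 (power-on hours).

    Assumes the drive has been powered on essentially continuously, which is
    reasonable for always-on storage servers but still an approximation."""
    return snapshot - timedelta(hours=power_on_hours)

print(deployment_date(datetime(2019, 3, 31), 26_280).date())  # ~3 years back: 2016-03-31
```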

Backblaze has stated that they can achieve up to 1 Gbps per pod, but as you can see they are only reaching an average throughput of 521 Mbps. I have to admit I was surprised to see such a low performance figure since I believe their storage servers are equipped with 10 Gbps ethernet.

Overall, Backblaze’s data centers are handling over 100 GB/s of throughput across all their pods which is quite an impressive figure. This number keeps climbing and is a result of new pods as well as overall higher performance per pod. From quick research, this is across three different data centers (Sacramento x 2, Phoenix x 1) and maybe a fourth on its way in Europe.

Hard Drive Performance

Hard Drive Read/Write Performance

Since each pod holds between 45 and 60 drives, with an overall max pod performance of 1 Gbps, I wasn’t surprised to see such low average per-drive performance. You can see that Backblaze’s workload is read heavy, with reads under 1 MB/s per drive and writes only a third of that. Just to put that in perspective, these drives can deliver over 100 MB/s, so Backblaze is not pushing the limits of these hard drives.

As discussed earlier, you can also see how the read workload changed significantly in Q2’17 and has not reverted back since.

Seagate Hard Drive Read/Write Performance, by Density

As I expected, the read and write performance is highly correlated to the drive capacity point. So, it appears that most of the growth in read/write performance per drive is really driven by the adoption of higher density drives. This is very typical of public storage-as-a-service (STaaS) offerings where it’s really about $/GB, IOPS/GB, MBs/GB, etc.

As a side note, the black dashed lines (average between all densities) should correlate with the previous chart showing overall read/write performance per drive.

Purchasing

Switching gears, let’s look at Backblaze’s purchasing history. This will help suppliers look at trends within Backblaze to predict future purchasing activities. I used power-on-hours to calculate when a drive entered the drive population.

Hard Drives Purchased by Density, by Year

Hard Drives Purchased by Capacity

This chart helps you see how Backblaze normalized on 4 TB, 8 TB, and now 12 TB densities. The number of drives that Backblaze purchases every year had been climbing until 2018, when it saw its first decline in units. However, this is mainly due to the efficiencies of the capacity per drive.

A question to ponder: Did 2018 reach a point where capacity growth per HDD surpassed the actual demand required to maintain unit growth of HDDs? Or is this trend limited to Backblaze?

Petabytes Purchased by Quarter

Drives/Petabytes Purchased, by Quarter

This looks at the number of drives purchased over the last five years, along with the amount of capacity added. It’s not quite regular enough to spot a trend, but you can quickly spot that the amount of capacity purchased over the last two years has grown dramatically compared to previous years.

HDD Vendor Market Share

Hard Drive Supplier Market Share

Western Digital/WDC, Toshiba/TOSYY, Seagate/STX

Seagate is definitely the preferred vendor, capturing almost 100% of the market share save for a few quarters where WD HGST wins 50% of the business. This information could be used by Seagate or its competitors to understand where it stands within the account for future bids. However, the industry is an oligopoly, so it’s not hard to guess who won the business if a given HDD vendor didn’t.

Drives

Drive Population by Quarter

Total Drive Population, by Quarter

This shows the total drive population over the past three years. Even though the number of drives being purchased has been falling lately, the overall drive population is still growing.

You can quickly see that 4 TB drives saw their peak population in Q1’17 and have rapidly declined since. In fact, let’s look at the same data but with a different type of chart.

Total Drive Population, by Quarter

That’s better. We can see that the 12 TB drives really had a dramatic effect on both 4 TB and 8 TB adoption. In fact, Backblaze has been proactively retiring 4 TB drives. This is likely due to the desire to slow the growth of their data center footprint, which comes with costs (more on this later).

As a drive vendor, I could use the 4 TB trend to calculate how much drive replacement will occur next quarter, along with natural PB growth. I will look more into Backblaze’s drive/pod retirement later.

Current Drive Population, by Deployed Date

Q1'2019 Drive Population, by Deployed Date

Be careful when interpreting this graph. What we are looking at here is the Q1’19 drive population, where the date on the x-axis is the date the drive entered the population. This shows that, of all the drives in Backblaze’s population today, the oldest are from 2015 (with the exception of a few stragglers).

This indicates that the useful life of drives within Backblaze’s data centers is ~4 years. In fact, a later chart will look at how drives/pods are phased out, by year.

Along the top of the chart, I noted when the 60-drive pods started entering the mix. The rack density is much more efficient with this design (compared to the 45-drive pod). Combine this with the 4 TB to 12 TB density gain, and it’s clear why Backblaze has been aggressively retiring its 4 TB/45-drive enclosures. There is still a large population of these remaining, so expect some further migration to occur.

Boot Drive Population

Total Boot Drive Population, by Quarter

This is the overall boot drive population over time. You can see that it is currently dominated by 500 GB drives, with only a few smaller densities remaining in the population today. For some reason, Toshiba has been the preferred vendor, with Seagate only recently gaining some new business.

The boot drive population is also an interesting data point for verifying the number of pods in the population. For example, there were 1,909 boot drives in Q1’19, and my calculation of pods based on the 45/60-drive pod mix was 1,905. I was able to use the total boot drives each quarter to double-check my mix of pods.

Pods (Drive Enclosures)

As discussed earlier, pods are the drive enclosures that house all of Backblaze’s hard drives. Let’s take a look at a few more trends that show what’s going on within the walls of their data center.

Pods Population by Deployment Date

Pods (HDD Enclosure) Population by Deployment Date

This one is interesting. Each line in the graph indicates a particular snapshot in time of the total population. And the x-axis represents the vintage of the pods for that snapshot. By comparing snapshots, this allows you to see changes over time to the population. Namely, new pods being deployed and old pods being retired. To capture this, I looked at the last day of Q1 data for the last four years and calculated the date the drives entered the population. Using the “Power On Date” I was able to deduce the type of pod (45 or 60 drive) it was deployed in.

Some insights from this chart:

  • From Q2’16 to Q1’17, they retired some pods from 2010-11
  • From Q2’17 to Q1’18, they retired a significant number of pods from 2011-14
  • From Q2’18 to Q1’19, they retired pods from 2013-2015
  • Pods that were deployed since late 2015 have been untouched (you can tell this by seeing the lines overlap with each other)
  • The most pods deployed in a quarter was 185 in Q2’16
  • Since Q2’16, the number of pods deployed has been declining, on average; this is due to the increase in # of drives per pod and density of each drive
  • There are still a significant number of 45-drive pods to retire

Pods Deployed/Retired

Total Pods (HDD Enclosure) Population

Totaling up all the new pods being deployed and retired, it is easier to see the yearly changes happening within Backblaze’s operation. Keep in mind that these are all calculations and may erroneously count drive replacements as new pods, but I don’t expect the results to vary significantly from what is shown here.

The data shows that any new pods that have been deployed in the past few years have mainly been driven by replacing older, less dense pods. In fact, the pod population has plateaued at around 1,900 pods.

Total Racks

Total Racks

Based on blog posts, Backblaze’s pods are all designed at 4U (four rack units), and pictures on their site indicate 10 pods fit in a rack; this equates to 40U racks. Using this information, along with the drive population and the power-on date, I was able to calculate the number of pods on any given date as well as the total number of racks. I did not include their networking racks, of which I believe they have two per row in their data center.
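
The rack arithmetic is then straightforward; a sketch using the 10-pods-per-rack figure and ignoring the networking racks, as described above:

```python
import math

PODS_PER_RACK = 10  # 4U pods in what the photos suggest are 40U racks

def rack_count(total_pods: int) -> int:
    """Racks of Storage Pods required; networking racks are excluded, as noted."""
    return math.ceil(total_pods / PODS_PER_RACK)

print(rack_count(1_905))  # the ~1,905 pods estimated for Q1'19 -> 191 pod racks
```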

You can quickly see that Backblaze has done a great job at slowing the growth of the racks in their data center. This all results in lower costs for their customers.

Retiring Pods

What interested me when looking at Backblaze’s SMART data was the fact that drives were being retired more often than they were failing. This means the cost of failures is fairly insignificant in the scheme of things. It is actually efficiencies driven by technology improvements, such as drive and enclosure densities, that drive most of the costs. However, the benefits must outweigh the costs. Given that Backblaze uses Sungard AS for its data centers, let’s try to visualize the benefit of retiring drives/pods.

Colocation Costs, Assuming a Given Density

Yearly Colocation Costs, Assuming One Drive Density

This shows the total capacity over time in Backblaze’s data centers, along with the colocation costs assuming all the drives were a given density. As you can see, in Q1’19 it would take $7.7M a year to pay the colocation costs for 861 PB if all the drives were 4 TB in size. By moving the entire population to 12 TB, this can be reduced to $2.6M. So, just changing the drive density can have a significant impact on Backblaze’s operational costs. I assumed $45/RU costs in the analysis, though their costs may be as low as $15/RU given the scale of their operation.
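
For anyone who wants to play with the assumptions, here is a rough sketch of that cost model. Reading the $45/RU figure as a monthly rate and assuming 15 drives per rack unit (a 60-drive, 4U pod) is what makes the $7.7M and $2.6M figures line up:

```python
DRIVES_PER_RU = 60 / 4    # a 60-drive, 4U pod -> 15 drives per rack unit (assumed)
COST_PER_RU_MONTH = 45.0  # $/RU, read here as a monthly rate (assumed)

def yearly_colo_cost(total_pb: float, tb_per_drive: float) -> float:
    """Yearly colocation cost if every drive in the fleet were a single density."""
    drives = total_pb * 1_000 / tb_per_drive
    rack_units = drives / DRIVES_PER_RU
    return rack_units * COST_PER_RU_MONTH * 12

print(f"${yearly_colo_cost(861, 4) / 1e6:.1f}M per year")   # ~$7.7M at 4 TB
print(f"${yearly_colo_cost(861, 12) / 1e6:.1f}M per year")  # ~$2.6M at 12 TB
```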

I threw in 32 TB densities to illustrate a hypothetical SSD-type density so you can see the colocation cost savings by moving to SSDs. Although lower, the acquisition costs are far too high at the moment to justify a move to SSDs.

Break-Even Analysis of Retiring Pods

Break-Even Analysis of Retiring Older Pods/Drives

This chart helps illustrate the math behind deciding to retire older drives/pods based on the break-even point.

Let’s break down how to read this chart:

  • This chart is looking at whether Backblaze should replace older drives with the newer 12 TB drives
  • Assuming a cost of $0.02/GB for a 12 TB drive, that is a $20/TB acquisition cost you see on the far left
  • Each line represents the cumulative cost over time (acquisition + operational costs)
  • The grey lines (4 TB and 8 TB) all assume they were already acquired so they only represent operational costs ($0 acquisition cost) since we are deciding on replacement costs
  • The operational costs (the incremental yearly increase shown) are calculated from the $45 per RU colocation cost and how many TBs of a given drive/enclosure density fit per rack unit. The more TBs you can cram into a rack unit, the lower your colocation costs are

Assuming you are still with me, this shows that the break-even point for retiring 4 TB 4U45 pods is just over two years! And 4 TB 4U60 pods, at three years! It’s a no-brainer to kill the 4 TB enclosures and replace them with 12 TB drives. Remember that this assumes a $45/RU colocation cost, so the break-even point will shift to the right if the colocation costs are lower (which they surely are).

You can see that the math to replace 8 TB drives with 12 TB doesn’t make as much sense, so we may see Backblaze’s retirement strategy slow down dramatically after it retires the 4 TB capacity points.
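
Here is a rough sketch of the math behind those crossover points, using the same assumptions: $45/RU per month for colocation, $20/TB acquisition for 12 TB drives, and the per-rack-unit capacity of each pod type:

```python
COLO_PER_RU_YEAR = 45.0 * 12  # the $45/RU/month assumption, annualized

def op_cost_per_tb_year(tb_per_drive: float, drives: int, rack_units: int) -> float:
    """Yearly colocation cost per TB stored for a given pod type."""
    return COLO_PER_RU_YEAR / (tb_per_drive * drives / rack_units)

def breakeven_years(old_tb: float, old_drives: int, old_ru: int,
                    new_cost_per_tb: float = 20.0) -> float:
    """Years until buying 12 TB drives in 4U60 pods beats keeping the old pods.

    The old pods are sunk cost (operational cost only); the new drives carry a
    $20/TB acquisition cost plus their own, lower, operational cost."""
    old_op = op_cost_per_tb_year(old_tb, old_drives, old_ru)
    new_op = op_cost_per_tb_year(12, 60, 4)
    return new_cost_per_tb / (old_op - new_op)

print(round(breakeven_years(4, 45, 4), 1))  # 4 TB in a 4U45 pod -> ~2.2 years
print(round(breakeven_years(4, 60, 4), 1))  # 4 TB in a 4U60 pod -> ~3.3 years
```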

As hard drive densities get larger and $/GB decreases, I expect the cumulative cost curves to start lower (less acquisition cost) and rise more slowly (lower per-RU operational costs), making future drive retirements more attractive. Eyeballing it, that would be once $/GB approaches $0.01/GB to $0.015/GB.

Things Backblaze Should Look Into

Top of mind, Backblaze should look into these areas:

  • The architecture around performance is not balanced; investigate having a caching tier to handle bursts and put more drives behind each storage node to reduce “enclosure/slot tax” costs.
  • Look into designs like 5U84 from Seagate/Xyratex providing 16.8 drives per RU versus the 15 being achieved on Backblaze’s own 4U60 design; Another 12% efficiency!
    • 5U allows for 8 pods to fit per rack versus the 10.
  • Look at when SSDs will be attractive to replace HDDs at a given $/GB, density, idle costs, # of drives that fit per RU (using 2.5” drives instead of 3.5”) so that they can stay on top of this trend [there is no rush on this one].
    • Performance and endurance of SSDs are irrelevant since the performance requirements are so low and the WPD is almost non-existent, making QLC and beyond great candidates.
  • Look at allowing pods to be more flexible in handling different capacity drives so that drive failures can be handled more cost-efficiently without having to retire pods. A concept of “virtual pods” without hard physical limits would better accommodate a future in which Backblaze isn’t retiring pods as aggressively, yet still lets it grow its pod densities seamlessly.

In Closing

It is kind of ironic that the reason Backblaze posted all their SMART data was to share insights around failures, and I didn’t even analyze failures once! There is much more analysis that could be done on this data set, which I may revisit as time permits.

As you can see, even simple health data from drives, along with a little help from other data sources, can help expose a lot more than you would initially think. I have long felt that people have yet to understand the full power of giving data freely to businesses (e.g. Facebook, Google Maps, LinkedIn, Mint, Personal Capital, News Feeds, Amazon). I often hear things like, “I have nothing to hide,” which indicates the lack of value they assign to their data. It’s not the value at its surface but the story that can be told by tying data together.

Until next time, Ryan Smith.

•   •   •

Ryan Smith is currently a product strategist at Hitachi Vantara. Previously, he served as the director of NAND product marketing at Samsung Semiconductor, Inc. He is extremely passionate about uncovering insights from just about any data set. He just likes to have fun by making a notable difference, influencing others, and working with smart people.
 

Tell us what you think about Ryan’s take on data, or better yet, give us your own! You can find all the data you would ever need on Backblaze’s Hard Drive Data and Stats webpage. Share your thoughts in the comments below or email us at mailbag@backblaze.com.


The Life and Times of a Backblaze Hard Drive

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/life-and-times-of-a-backblaze-hard-drive/

Seagate 12 TB hard drive

Backblaze likes to talk about hard drive failures — a lot. What we haven’t talked much about is how we deal with those failures: the daily dance of temp drives, replacement drives, and all the clones that it takes to keep over 100,000 drives healthy. Let’s go behind the scenes and take a look at that dance from the eyes of one Backblaze hard drive.

After sitting still for what seemed like forever, ZCH007BZ was on the move. ZCH007BZ, let’s call him Zach, is a Seagate 12 TB hard drive. For the last few weeks, Zach and over 6,000 friends were securely sealed inside their protective cases in the ready storage area of a Backblaze data center. Being a hard disk drive, Zach’s modest dream was to be installed in a system, spin merrily, and store data for many years to come. And now the wait was nearly over, or was it?

Hard drives in wrappers

The Life of Zach

Zach was born in a factory in Singapore and shipped to the US, eventually finding his way to Backblaze, but he didn’t know that. He had sat sealed in the dark for weeks. Now Zach and boxes of other drives were removed from their protective cases and gently stacked on a cart. Zach was near the bottom of the pile, but even he could see endless columns of beautiful red boxes stacked seemingly to the sky. “Backblaze!” one of the drives on the cart whispered. All the other drives gasped with recognition. Thank goodness the noise-cancelling headphones worn by all Backblaze Data Center Techs covered the drives’ collective excitement.

While sitting in the dark, the drives had gossiped about where they were: a data center, a distribution warehouse, a Costco, or Best Buy. Backblaze came up a few times, but that was squashed — they couldn’t be that lucky. After all, Backblaze was the only place where a drive could be famous. Before Backblaze, hard drives labored in anonymity. Occasionally, one or two would be seen in a hard drive tear down article, but even that sort of exposure had died out a couple of years ago. But Backblaze publishes everything about their drives, their model numbers, their serial numbers, heck even their S.M.A.R.T. statistics. There was a rumor that hard drives worked extra hard at Backblaze because they knew they would be in the public eye. With red Backblaze Storage Pods as far as the eye could see, Zach and friends were about to find out.

Drive with guide

The cart Zach and his friends were on glided to a stop at the production build facility. This is where Storage Pods are filled with drives and tested before being deployed. The cart stopped by the first of twenty V6.0 Backblaze Storage Pods that together would form a Backblaze Vault. At each Storage Pod station, 60 drives were unloaded from the cart. The serial number of each drive was recorded along with the Storage Pod ID and drive location in the pod. Finally, each drive was fitted with a pair of drive guides and slid into its new home as a production drive in a Backblaze Storage Pod. “Spin long and prosper,” Zach said quietly each time the lid of a Storage Pod snapped in place covering the 60 giddy hard drives inside. The process was repeated for the remaining 19 Storage Pods, and when it was done Zach remained on the cart. He would not be installed in a production system today.

The Clone Room

Zach and the remaining drives on the cart were slowly wheeled down the hall. Bewildered, they were rolled into the clone room. “What’s a clone room?” Zach asked himself. The drives on the cart were divided into two groups, one group being placed on the clone table and the other on the test table. Zach was on the test table.

Almost as soon as Zach was placed on the test table, the DC Tech picked him up again and placed him and several other drives into a machine. He was about to get formatted. The entire formatting process only took a few minutes for Zach, as it did for all of the other drives on the test table. Zach counted 25 drives, including himself.

Still confused and a little sore from the formatting, Zach and two other drives were picked up from the bench by a different DC Tech. She recorded their vitals — serial number, manufacturer, and model — and left the clone room with all three drives on a different cart.

Dreams of a Test Drive

Luigi, Storage Pod lift

The three drives were back on the data center floor with red Storage Pods all around. The DC Tech had maneuvered Luigi, the local Storage Pod lift unit, to hold a Storage Pod she was sliding from a data center rack. The lid was opened, the tech attached a grounding clip, and then removed one of the drives in the Storage Pod. She recorded the vitals of the removed drive. While she was doing so, Zach could hear the removed drive breathlessly mumble something about media errors, but before Zach could respond, the tech picked him up, attached drive guides to his frame, and gently slid him into the Storage Pod. The tech updated her records, closed the lid, and slid the pod back into place. A few seconds later, Zach felt a jolt of electricity pass through his circuits and he and 59 other drives spun to life. Zach was now part of a production Backblaze Storage Pod.

First, Zach was introduced to the other 19 members of his tome. There are 20 drives in a tome, with each living in a separate Storage Pod. Files are divided (sharded) across these 20 drives using Backblaze’s open-sourced erasure code algorithm.

Zach’s first task was to rebuild all of the files that were stored on the drive he replaced. He’d do this by asking for pieces (shards) of all the files from the 19 other drives in his tome. He only needed 17 of the pieces to rebuild a file, but he asked everyone in case there was a problem. Rebuilding was hard work, and the other drives were often busy with reading files, performing shard integrity checks, and so on. Depending on how busy the system was, and how full the drives were, it might take Zach a couple of weeks to rebuild the files and get him up to speed with his contemporaries.

Nightmares of a Test Drive

Little did he know, but at this point, Zach was still considered a temp replacement drive. The dysfunctional drive that he replaced was making its way back to the clone room where a pair of cloning units, named Harold and Maude in this case, waited. The tech would attempt to clone the contents of the failed drive to a new drive assigned to the clone table. The primary reason for trying to clone a failed drive was recovery speed. A drive can be cloned in a couple of days, but as noted above, it can take up to a couple of weeks to rebuild a drive, especially large drives on busy systems. In short, a successful clone would speed up the recovery process.

For nearly two days straight, Zach was rebuilding. He barely had time to meet his pod neighbors, Cheryl and Carlos. Since they were not rebuilding, they had plenty of time to marvel at how hard Zach was working. He was 25% done and going strong when the Storage Pod powered down. Moments later, the pod was slid out of the rack and the lid popped open. Zach assumed that another drive in the pod had failed. Then he felt the spindly, cold fingers of the tech grab him and yank firmly. He was being replaced.

Storage Pod in Backblaze data center

Zach had done nothing wrong. It was just that the clone was successful, with nearly all the files being copied from the previous drive to the smiling clone drive that was putting on Zach’s drive guides and gently being inserted into Zach’s old slot. “Goodbye,” he managed to eke out as he was placed on the cart and watched the tech bring the Storage Pod back to life. Confused, angry, and mostly exhausted, Zach quickly fell asleep.

Zach woke up just in time to see he was in the formatting machine again. The data he had worked so hard to rebuild was being ripped from his platters and replaced randomly with ones and zeroes. This happened multiple times and just as Zach was ready to scream, it stopped, and he was removed from his torture and stacked neatly with a few other drives.

After a while he looked around, and once the lights went out the stories started. Zach wasn’t alone. Several of the other temp drives had pretty much the same story; they thought they had found a home, only to be replaced by some uppity clone drive. One of the temp drives, Lin, said she had been in three different systems only to be replaced each time by a clone drive. No one wanted to believe her, but no one knew what was next either.

The Day the Clone Died

Zach found out the truth a few days later when he was selected, inspected, and injected as a temp drive into another Storage Pod. Then three days later he was removed, wiped, reformatted, and placed back in the temp pool. He began to resign himself to life as a temp drive. Not exactly glamorous, but he did get his serial number in the Backblaze Drive Stats data tables while he was a temp. That was more than the millions of other drives in the world that would forever be unknown.

On his third temp drive stint, he was barely in the pod a day when the lid opened and he was unceremoniously removed. This was the life of temp drive, and when the lid opened on the fourth day of his fourth temp drive shift, he just closed his eyes and waited for his dream to end again. Except, this time, the tech’s hand reached past him and grabbed a drive a few slots away. That unfortunate drive had passed the night before, a full-fledged crash. Zach, like all the other drives nearby, had heard the screams.

Another temp drive Zach knew from the temp table replaced the dead drive, then the lid was closed, the pod slid back into place, and power was restored. With that, Zach doubled down on getting rebuilt — maybe if he could finish before the clone did, he could stay. What Zach didn’t know was that the clone process for the drive he had replaced had failed. This happens about half the time. Zach was home free; he just didn’t know it.

In a couple of days, Zach was finished rebuilding and became a real member of a production Backblaze Storage Pod. He now spends his days storing and retrieving data, getting his bits tested by shard integrity checks, and having his S.M.A.R.T. stats logged for the Backblaze Drive Stats. His hard drive life is better than he ever dreamed.


Petabytes on a Budget: 10 Years and Counting

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/petabytes-on-a-budget-10-years-and-counting/

A Decade of the Pod

This post is for all of the storage geeks out there who have followed the adventures of Backblaze and our Storage Pods over the years. The rest of you are welcome to come along for the ride.

It has been 10 years since Backblaze introduced our Storage Pod to the world. In September 2009, we announced our hulking, eye-catching, red 4U storage server equipped with 45 hard drives delivering 67 terabytes of storage for just $7,867 — that was about $0.11 a gigabyte. As part of that announcement, we open-sourced the design for what we dubbed Storage Pods, telling you and everyone like you how to build one, and many of you did.

Backblaze Storage Pod version 1 was announced on our blog with little fanfare. We thought it would be interesting to a handful of folks — readers like you. In fact, it wasn’t even called version 1, as no one had ever considered there would be a version 2, much less a version 3, 4, 4.5, 5, or 6. We were wrong. The Backblaze Storage Pod struck a chord with many IT and storage folks who were offended by having to pay a king’s ransom for a high density storage system. “I can build that for a tenth of the price,” you could almost hear them muttering to themselves. Mutter or not, we thought the same thing, and version 1 was born.

The Podfather

Tim, the “Podfather” as we know him, was the Backblaze lead in creating the first Storage Pod. He had design help from our friends at Protocase, who built the first three generations of Storage Pods for Backblaze and also spun out a company named 45 Drives to sell their own versions of the Storage Pod — that’s open source at its best. Before we decided on the version 1 design, there were a few experiments along the way:

Wooden pod
Octopod

The original Storage Pod was prototyped by building a wooden pod or two. We needed to test the software while the first metal pods were being constructed.

The Octopod was a quick and dirty response to receiving the wrong SATA cables — ones that were too long and glowed. Yes, there are holes drilled in the bottom of the pod.

Pre-1 Storage Pod
Early not-red Storage Pod

The original faceplate shown above was used on about 10 pre-1.0 Storage Pods. It was updated to the three circle design just prior to Storage Pod 1.0.

Why are Storage Pods red? When we had the first ones built, the manufacturer had a batch of red paint left over that could be used on our pods, and it was free.

Back in 2007, when we started Backblaze, there weren’t many affordable choices for storing large quantities of data. Our goal was to charge $5/month for unlimited data storage for one computer. We decided to build our own storage servers when it became apparent that, if we were to use the other solutions available, we’d have to charge a whole lot more money. Storage Pod 1.0 allowed us to store one petabyte of data for about $81,000. Today we’ve lowered that to about $35,000 with Storage Pod 6.0. When you take into account that the average amount of data per user has nearly tripled in that same time period and our price is now $6/month for unlimited storage, the math works out about the same today as it did in 2009.

We Must Have Done Something Right

The Backblaze Storage Pod was more than just affordable data storage. Version 1.0 introduced or popularized three fundamental changes to storage design: 1) You could build a system out of commodity parts and it would work, 2) You could mount hard drives vertically and they would still spin, and 3) You could use consumer hard drives in the system. It’s hard to determine which of these three features offended and/or excited more people. It is fair to say that ten years out, things worked out in our favor, as we currently have about 900 petabytes of storage in production on the platform.

Over the last 10 years, people have warmed up to our design, or at least elements of the design. Starting with 45 Drives, multitudes of companies have worked on and introduced various designs for high density storage systems ranging from 45 to 102 drives in a 4U chassis, so today the list of high-density storage systems that use vertically mounted drives is pretty impressive:

Company | Server | Drive Count
45 Drives | Storinator S45 | 45
45 Drives | Storinator XL60 | 60
Chenbro | RM43160 | 60
Chenbro | RM43699 | 100
Dell | DSS 7000 | 90
HPE | Cloudline CL5200 | 80
HPE | Cloudline CL5800 | 100
NetGear | ReadyNAS 4360X | 60
Newisys | NDS 4450 | 60
Quanta | QuantaGrid D51PL-4U | 102
Quanta | QuantaPlex T21P-4U | 70
Seagate | Exos AP 4U100 | 96
Supermicro | SuperStorage 6049P-E1CR60L | 60
Supermicro | SuperStorage 6049P-E1CR45L | 45
Tyan | Thunder SX FA100-B7118 | 100
Viking Enterprise Solutions | NSS-4602 | 60
Viking Enterprise Solutions | NDS-4900 | 90
Viking Enterprise Solutions | NSS-41000 | 100
Western Digital | Ultrastar Serv60+8 | 60
Wiwynn | SV7000G2 | 72

Another driver in the development of some of these systems is the Open Compute Project (OCP). Formed in 2011, the OCP gathers and shares ideas and designs for data storage, rack configurations, and related technologies. The group is managed by The Open Compute Project Foundation as a 501(c)(6) and counts many industry luminaries in the storage business as members.

What Have We Done Lately?

In technology land, 10 years of anything is a long time. What was exciting then is expected now. And the same thing has happened to our beloved Storage Pod. We have introduced updates and upgrades over the years, twisting the usual dials: cost down, speed up, capacity up, vibration down, and so on. All good things. But, we can’t fool you, especially if you’ve read this far. You know that Storage Pod 6.0 was introduced in April 2016 and, quite frankly, it’s been crickets ever since as it relates to Storage Pods. Three-plus years of non-innovation. Why?

  1. If it ain’t broke, don’t fix it. Storage Pod 6.0 is built in the US by Equus Compute Solutions, our contract manufacturer, and it works great. Production costs are well understood, performance is fine, and the new higher density drives perform quite well in the 6.0 chassis.
  2. Disk migrations kept us busy. From Q2 2016 through Q2 2019 we migrated over 53,000 drives. We replaced 2, 3, and 4 terabyte drives with 8, 10, and 12 terabyte drives, doubling, tripling and sometimes quadrupling the storage density of a storage pod.
  3. Pod upgrades kept us busy. From Q2 2016 through Q1 2019, we upgraded our older V2, V3, and V4.5 storage pods to V6.0. Then we crushed a few of the older ones with a MegaBot and gave a bunch more away. Today there are no longer any stand-alone storage pods; they are all members of a Backblaze Vault.
  4. Lots of data kept us busy. In Q2 2016, we had 250 petabytes of data storage in production. Today, we have 900 petabytes. That’s a lot of data you folks gave us (thank you by the way) and a lot of new systems to deploy. The chart below shows the challenge our data center techs faced.

Petabytes Stored vs Headcount vs Millions Raised

In other words, our data center folks were really, really busy, and not interested in shiny new things. Now that we’ve hired a bunch more DC techs, let’s talk about what’s next.

Storage Pod Version 7.0 — Almost

Yes, there is a Backblaze Storage Pod 7.0 on the drawing board. Here is a short list of some of the features we are looking at:

  • Updating the motherboard
  • Upgrading the CPU and considering an AMD CPU
  • Updating the power supply units, perhaps moving to one unit
  • Upgrading from 10Gbase-T to 10GbE SFP+ optical networking
  • Upgrading the SATA cards
  • Modifying the tool-less lid design

The timeframe is still being decided, but early 2020 is a good time to ask us about it.

“That’s nice,” you say out loud, but what you are really thinking is, “Is that it? Where’s the Backblaze in all this?” And that’s where you come in.

The Next Generation Backblaze Storage Pod

We are not out of ideas, but one of the things that we realized over the years is that many of you are really clever. From the moment we open sourced the Storage Pod design back in 2009, we’ve received countless interesting, well thought out, and occasionally odd ideas to improve the design. As we look to the future, we’d be stupid not to ask for your thoughts. Besides, you’ll tell us anyway on Reddit or HackerNews or wherever you’re reading this post, so let’s just cut to the chase.

Build or Buy

The two basic choices are: We design and build our own storage servers or we buy them from someone else. Here are some of the criteria as we think about this:

  1. Cost: We’d like the cost of a storage server to be about $0.030 – $0.035 per gigabyte of storage (or less of course). That includes the server and the drives inside. For example, using off-the-shelf Seagate 12 TB drives (model: ST12000NM0007) in a 6.0 Storage Pod costs about $0.032-$0.034/gigabyte depending on the price of the drives on a given day (see the quick sanity check of this math after the list).
  2. International: Now that we have a data center in Amsterdam, we need to be able to ship these servers anywhere.
  3. Maintenance: Things should be easy to fix or replace — especially the drives.
  4. Commodity Parts: Wherever possible, the parts should be easy to purchase, ideally from multiple vendors.
  5. Racks: We’d prefer to keep using 42” deep cabinets, but make a good case for something deeper and we’ll consider it.
  6. Possible Today: No DNA drives or other wishful technologies. We need to store data today, not in the year 2061.
  7. Scale: Nothing in the solution should limit the ability to scale the systems. For example, we should be able to upgrade drives to higher densities over the next 5-7 years.
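
To make the cost target in item 1 concrete, here is a quick sanity check of the per-gigabyte math. It is only a sketch: the chassis and drive prices below are illustrative assumptions chosen to land in the quoted range, not Backblaze’s actual bill of materials.

```python
# Back-of-the-envelope cost-per-gigabyte check for a 60-drive storage server.
# The chassis and drive prices are illustrative assumptions, not actual costs.

DRIVES_PER_SERVER = 60
DRIVE_CAPACITY_TB = 12          # e.g. a 12 TB data drive
DRIVE_PRICE_USD = 330.0         # assumed street price per drive
CHASSIS_PRICE_USD = 3500.0      # assumed cost of the server minus drives

total_cost = CHASSIS_PRICE_USD + DRIVES_PER_SERVER * DRIVE_PRICE_USD
total_gb = DRIVES_PER_SERVER * DRIVE_CAPACITY_TB * 1000   # TB -> GB (decimal)

print(f"${total_cost:,.0f} for {total_gb:,} GB "
      f"= ${total_cost / total_gb:.4f}/GB")
# With these assumptions: $23,300 for 720,000 GB, or about $0.032/GB,
# which sits inside the $0.030-$0.035/GB target above.
```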

Other than that there are no limitations. Any of the following acronyms, words, and phrases could be part of your proposed solution and we won’t be offended: SAS, JBOD, IOPS, SSD, redundancy, compute node, 2U chassis, 3U chassis, horizontal mounted drives, direct wire, caching layers, appliance, edge storage units, PCIe, fibre channel, SDS, etc.

The solution does not have to be a Backblaze one. As the list from earlier in this post shows, Dell, HP, and many others make high density storage platforms we could leverage. Make a good case for any of those units, or any others you like, and we’ll take a look.

What Will We Do With All Your Input?

We’ve already started by cranking up Backblaze Labs again and have tried a few experiments. Over the coming months we’ll share with you what’s happening as we move this project forward. Maybe we’ll introduce Storage Pod X or perhaps take some of those Storage Pod knockoffs for a spin. Regardless, we’ll keep you posted. Thanks in advance for your ideas and thanks for all your support over the past ten years.

The post Petabytes on a Budget: 10 Years and Counting appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

A Toast to Our Partners in Europe at IBC

Post Syndicated from Janet Lafleur original https://www.backblaze.com/blog/a-toast-to-our-partners-in-europe-at-ibc/

Join us at IBC

Prost! Skål! Cheers! Celebrate with us as we travel to Amsterdam for IBC, the premier conference and expo for media and entertainment technology in Europe. The show gives us a chance to raise a glass with our partners, customers, and future customers across the pond. And we’re especially pleased that IBC coincides with the opening of our new European data center.

How will we celebrate? With the Backblaze Partner Crawl, a rolling series of parties on the show floor from 13-16 September. Four of our Europe-based integration partners have graciously invited us to co-host drinks and bites in their stands throughout the show.

If you can make the trip to IBC, you’re invited to toast us with a skål! with our Swedish friends at Cantemo on Friday, a prost! with our German friends at Archiware on Saturday, or a cheers! with UK-based friends at Ortana and GB Labs on Sunday or Monday, respectively. Or drop in every day and keep the Backblaze Partner Crawl rolling. And if you can’t make it to IBC this time, we encourage you to raise a glass and toast anyway.

Skål! on Friday With Cantemo

Cantemo’s iconik media management makes sharing and collaborating on media effortless, wherever you want to do business. Cantemo announced the integration of iconik with Backblaze’s B2 Cloud Storage last fall, and since then we’ve been amazed by customers like Everwell, who replaced all their on-premises storage with a fully cloud-based production workflow. For existing Backblaze customers, iconik can speed up your deployment by ingesting content already uploaded to B2 without having to download files and upload them again.
You can also stop by the Cantemo booth anytime during IBC to see a live demo of iconik and Backblaze in action. Or schedule an appointment and we’ll have a special gift waiting for you.

Join us at Cantemo on Friday 13 September from 16:30-18:00 at Hall 7 — 7.D67

Prost! on Saturday With Archiware

With the latest release of their P5 Archive featuring B2 support, Archiware makes archiving to the cloud even easier. Archiware customers with large existing archives can use the Backblaze Fireball to rapidly import archived content directly to their B2 account. At IBC, we’re also unveiling our latest joint customer, Baron & Baron, a creative agency that turned to P5 and B2 to back up and archive their dazzling array of fashion and luxury brand content.

Join us at Archiware on Saturday 14 September from 16:30-18:00 at Hall 7 — 7.D35

Cheers! on Sunday With Ortana

Ortana integrated their Cubix media asset management and orchestration platform with B2 way back in 2016 during B2’s beta period, making them among our first media workflow partners. More recently, Ortana joined our Migrate or Die webinar and blog series, detailing strategies for how you can migrate archived content from legacy platforms before they go extinct.

Join us at Ortana on Sunday 15 September from 16:30-18:00 at Hall 7 — 7.C63

Cheers! on Monday With GB Labs

If you were at the NAB Show last April, you may have heard GB Labs was integrating their automation tools with B2. It’s official now, as detailed in their announcement in June. GB Labs’ automation allows you to streamline tasks that would otherwise require tedious and repetitive manual processes, and now supports moving files to and from your B2 account.

Join us at GB Labs Monday 16 September from 17:00-18:00 at Hall 7 — 7.B26

Say Hello Anytime to Our Friends at CatDV

CatDV media asset management helps teams organize, communicate, and collaborate effectively, including archiving content to B2. CatDV has been integrated with B2 for over two years, allowing us to serve customers like UCSC Silicon Valley, who built an end-to-end collaborative workflow for a 22-member team creating online learning videos.

Stop by CatDV anytime at Hall 7 — 7.A51

But we’re not the only ones making a long trek to Amsterdam for IBC. While you’re roaming around Hall 7, be sure to stop by our other partners traveling from near and far to learn what our joint solutions can do for you:

  • EditShare (shared storage with MAM) Hall 7 — 7.A35
  • ProMax (shared storage with MAM) Hall 7 — 7.D55
  • StorageDNA (smart migration and storage) Hall 7 — 7.A32
  • FileCatalyst (large file transfer) Hall 7 — 7.D18
  • eMAM (web-based DAM) Hall 7 — 7.D27
  • Facilis Technology (shared storage) Hall 7 — 7.B48
  • GrayMeta (metadata extraction and insight) Hall 7 — 7.D25
  • Hedge (backup software) Hall 7 — 7.A56
  • axle ai (asset management) Hall 7 — 7.D33
  • Tiger Technology (tiered data management) Hall 7 — 7.B58

We’re hoping you’ll join us for one or more of our Partner Crawl parties. If you want a quieter place and time to discuss how B2 can streamline your workflow, please schedule an appointment with us so we can give you the attention you need.

Finally, if you can’t join us in Amsterdam, open a beer, pour a glass of wine or other drink, and toast to our new European data center, wherever you are, in whatever language you speak. As we say here in the States, Bottoms up!

The post A Toast to Our Partners in Europe at IBC appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Announcing Our First European Data Center

Post Syndicated from Ahin Thomas original https://www.backblaze.com/blog/announcing-our-first-european-data-center/

city view of Amsterdam, Netherlands

Big news: Our first European data center, in Amsterdam, is open and accepting customer data!

This is our fourth data center (DC) location and the first outside of the western United States. As longtime readers know, we have two DCs in the Sacramento, California area and one in the Phoenix, Arizona area. As part of this launch, we are also introducing the concept of regions.

When creating a Backblaze account, customers can choose whether that account’s data will be stored in the EU Central or US West region. The choice made at account creation time will dictate where all of that account’s data is stored, regardless of product choice (Computer Backup or B2 Cloud Storage). For customers wanting to store data in multiple regions, please read this knowledge base article on how to control multiple Backblaze accounts using our (free) Groups feature.

Whether you choose EU Central or US West, your pricing for our products will be unchanged:

  • For B2 Cloud Storage — it’s $0.005/GB/Month. For comparison, storing your data in Amazon S3’s Ireland region will cost ~4.5x more (a quick worked comparison follows this list)
  • For Computer Backup — $60/Year/Computer is the yearly cost of our industry-leading, unlimited data backup for desktops/laptops
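
To put those prices side by side, here is a small worked comparison. The S3 Ireland rate is not quoted in this post, so it is inferred from the “~4.5x more” statement above rather than taken from Amazon’s price list; treat it as an approximation.

```python
# Rough monthly-cost comparison using the figures in the list above.
# The S3 Ireland rate is inferred from the stated "~4.5x more" multiple.

B2_RATE_PER_GB_MONTH = 0.005
S3_RATE_PER_GB_MONTH = B2_RATE_PER_GB_MONTH * 4.5   # inferred, approximate

data_gb = 10_000   # example: 10 TB of stored data

print(f"B2 Cloud Storage:      ${data_gb * B2_RATE_PER_GB_MONTH:,.2f}/month")
print(f"S3 Ireland (inferred): ${data_gb * S3_RATE_PER_GB_MONTH:,.2f}/month")
print(f"Computer Backup:       $60/year/computer, i.e. ${60 / 12:.2f}/month")
```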

Later this week we will be publishing more details on the process we undertook to get to this launch. Here’s a sneak preview:

  • Wednesday, August 28: Getting Ready to Go (to Europe). How do you even begin to think about opening a DC that isn’t within any definition of driving distance? For the vast majority of companies on the planet, simply figuring out how to get started is a massive undertaking. We’ll be sharing a little more on how we thought about our requirements, gathered information, and the importance of NATO in the whole equation.
  • Thursday, August 29: The Great European (Non) Vacation. With all the requirements done, research gathered, and preliminary negotiations held, there comes a time when you need to jump on a plane and go meet your potential partners. For John & Chris, that meant 10 data center tours in 72 hours across three countries — not exactly a relaxing summer holiday, but vitally important!
  • Friday, August 30: Making a Decision. After an extensive search, we are very pleased to have found our partner in Interxion! We’ll share a little more about the process of narrowing down the final group of candidates and selecting our newest partner.
If you’re interested in learning more about the physical process of opening up a data center, check out our post on the seven days prior to opening our Phoenix DC.

New Data Center FAQs:

Q: Does the new DC mean Backblaze has multi-region storage?
A: Yes, by leveraging our Groups functionality. When creating an account, users choose where their data will be stored. The default option will store data in US West, but to choose EU Central, simply select that option in the pull-down menu.

Region selector
Choose EU Central for data storage

If you create a new account with EU Central selected and have an existing account that’s in US West, you can put both of them in a Group, and manage them from there! Learn more about that in our Knowledge Base article.

Q: I’m an existing customer and want to move my data to Europe. How do I do that?
A: At this time, we do not support moving existing data within Backblaze regions. While it is something on our roadmap to support, we do not have an estimated release date for that functionality. However, any customer can create a new account and upload data to Europe. Customers with multiple accounts can administer those accounts via our Groups feature. For more details on how to do that, please see this Knowledge Base article. Existing customers can create a new account in the EU Central region and then upload data to it; they can then either keep or delete the previous Backblaze account in US West.

Q: Finally! I’ve been waiting for this and am ready to get started. Can I use your rapid ingest device, the B2 Fireball?
A: Yes! However, as of the publication of this post, all Fireballs will ship back to one of our U.S. facilities for secure upload (regardless of account location). By the end of the year, we hope to offer Fireball support natively in Europe (so a Fireball with a European customer’s data will never leave the EU).

Q: Does this mean that my data will never leave the EU?
A: Any data uploaded by the customer does not leave the region it was uploaded to unless at the explicit direction of the customer. For example, restores and snapshots of data stored in Europe can be downloaded directly from Europe. However, customers requesting an encrypted hard drive with their data on it will have that drive prepared from a secure U.S. location. In addition, certain metadata about customer accounts (e.g. email address for your account) reside in the U.S. For more information on our privacy practices, please read our Privacy Policy.

Q: What are my payment options?
A: All payments to Backblaze are made in U.S. dollars. To get started, you can enter your credit card within your account.

Q: What’s next?
A: We’re actively working on region selection for individual B2 Buckets (instead of Backblaze region selection on an account basis), which should open up a lot more interesting workflows! For example, customers who want to can create geographic redundancy for data within one B2 account (and those who don’t can sleep well knowing they have 11 nines of durability).

We like to develop the features and functionality that our customers want. The decision to open up a data center in Europe is directly related to customer interest. If you have requests or questions, please feel free to put them in the comment section below.

The post Announcing Our First European Data Center appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze Hard Drive Stats Q2 2019

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/hard-drive-stats-q2-2019/

Backblaze Drive Stats Q2 2019
In this report, we review the failure rates for the drive models that have been around for several years, take a look at how our 14 TB Toshiba drives are doing (spoiler alert: great), and along the way we’ll provide a handful of insights and observations from inside our storage cloud. As always, we’ll publish the data we use in these reports on our Hard Drive Test Data web page and we look forward to your comments.

Hard Drive Failure Stats for Q2 2019

At the end of Q2 2019, Backblaze was using 108,660 hard drives to store data. For our evaluation we remove from consideration those drives that were used for testing purposes and those drive models for which we did not have at least 60 drives (see why below). This leaves us with 108,461 hard drives. The table below covers what happened in Q2 2019.

Backblaze Q2 2019 Hard Drive Failure Rates

Notes and Observations

If a drive model has a failure rate of 0 percent, it means there were no drive failures of that model during Q2 2019 — lifetime failure rates are later in this report. The two drives listed with zero failures in Q2 were the 4 TB and 14 TB Toshiba models. The Toshiba 4 TB drive doesn’t have a large enough number of drives or drive days to be statistically reliable, but only one drive of that model has failed in the last three years. We’ll dig into the 14 TB Toshiba drive stats a little later in the report.

There were 199 drives (108,660 minus 108,461) that were not included in the list above because they were used as testing drives or we did not have at least 60 of a given drive model. We now use 60 drives of the same model as the minimum number when we report quarterly, yearly, and lifetime drive statistics as there are 60 drives in all newly deployed Storage Pods — older Storage Pod models had a minimum of 45.
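
If you want to reproduce the failure-rate figures from the raw data set, the calculation is straightforward: failures divided by drive days, scaled to a 365-day year. The numbers in the snippet below are made-up sample inputs, not values from the Q2 2019 table.

```python
# Annualized failure rate (AFR) as used in the Drive Stats reports:
# failures divided by drive days, scaled to a 365-day year.
# The sample inputs are hypothetical, not actual Q2 2019 figures.

def annualized_failure_rate(failures: int, drive_days: int) -> float:
    return failures / (drive_days / 365) * 100

print(f"{annualized_failure_rate(failures=25, drive_days=1_500_000):.2f}%")
# 25 failures over 1.5 million drive days works out to roughly 0.61% AFR.
```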

2,000 Backblaze Storage Pods? Almost…

We currently have 1,980 Storage Pods in operation. All are version 5 or version 6 as we recently gave away nearly all of the older Storage Pods to folks who stopped by our Sacramento storage facility. Nearly all, as we have a couple in our Storage Pod museum. There are currently 544 version 5 pods each containing 45 data drives, and there are 1436 version 6 pods each containing 60 data drives. The next time we add a Backblaze Vault, which consists of 20 Storage Pods, we will have 2,000 Backblaze Storage Pods in operation.

Goodbye Western Digital

In Q2 2019, the last of the Western Digital 6 TB drives were retired from service. The average age of the drives was 50 months. These were the last of our Western Digital branded data drives. When Backblaze was first starting out, the first data drives we deployed en masse were Western Digital Green 1 TB drives. So, it is with a bit of sadness that we see our Western Digital data drive count go to zero. We hope to see them again in the future.

WD Ultrastar 14 TB DC HC530

Hello “Western Digital”

While the Western Digital brand is gone, the HGST brand (owned by Western Digital) is going strong as we still have plenty of the HGST branded drives, about 20 percent of our farm, ranging in size from 4 to 12 TB. In fact, we added over 4,700 HGST 12 TB drives in this quarter.

This just in: rumor has it there are twenty 14 TB Western Digital Ultrastar drives getting readied for deployment and testing in one of our data centers. It appears Western Digital has returned: stay tuned.

Goodbye 5 TB Drives

Back in Q1 2015, we deployed 45 Toshiba 5 TB drives. They were the only 5 TB drives we deployed as the manufacturers quickly moved on to larger capacity drives, and so did we. Yet, during their four plus years of deployment only two failed, with no failures since Q2 of 2016 — three years ago. This made it hard to say goodbye, but buying, stocking, and keeping track of a couple of 5 TB spare drives was not optimal, especially since these spares could not be used anywhere else. So yes, the Toshiba 5 TB drives were the odd ducks on our farm, but they were so good they got to stay for over four years.

Hello Again, Toshiba 14 TB Drives

We’ve mentioned the Toshiba 14 TB drives in previous reports, now we can dig in a little deeper given that they have been deployed almost nine months and we have some experience working with them. These drives got off to a bit of a rocky start, with six failures in the first three months of being deployed. Since then, there has been only one additional failure, with no failures reported in Q2 2019. The result is that the lifetime annualized failure rate for the Toshiba 14 TB drives has decreased to a very respectable 0.78% as shown in the lifetime table in the following section.

Lifetime Hard Drive Stats

The table below shows the lifetime failure rates for the hard drive models we had in service as of June 30, 2019. This is over the period beginning in April 2013 and ending June 30, 2019.

Backblaze Lifetime Hard Drive Annualized Failure Rates

The Hard Drive Stats Data

The complete data set used to create the information used in this review is available on our Hard Drive Test Data web page. You can download and use this data for free for your own purpose. All we ask are three things: 1) You cite Backblaze as the source if you use the data, 2) You accept that you are solely responsible for how you use the data, and, 3) You do not sell this data to anyone; it is free. Good luck and let us know if you find anything interesting.

If you just want the tables we used to create the charts in this blog post you can download the ZIP file containing the MS Excel spreadsheet.

The post Backblaze Hard Drive Stats Q2 2019 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

The Shocking Truth — Managing for Hard Drive Failure and Data Corruption

Post Syndicated from Skip Levens original https://www.backblaze.com/blog/managing-for-hard-drive-failures-data-corruption/

hard disk drive covered in 0s, 1s, ?s

Ah, the iconic 3.5″ hard drive, now approaching a massive 16TB of storage capacity. Backblaze Storage Pods fit 60 of these drives in a single pod, and with well over 750 petabytes of customer data under management in our data centers, that means we have a lot of hard drives to look after.

Yet most of us have just one, or only a few of these massive drives at a time storing our most valuable data. Just how safe are those hard drives in your office or studio? Have you ever thought about all the awful, terrible things that can happen to a hard drive? And what are they, exactly?

It turns out there are a host of obvious physical dangers, but also other, less obvious, errors that can affect the data stored on your hard drives, as well.

Dividing by One

It’s tempting to store all of your content on a single hard drive. After all, the capacity of these drives gets larger and larger, and they offer great performance of up to 150 MB/s. It’s true that flash-based drives are far faster, but their dollars-per-gigabyte price is also higher, so for now the traditional 3.5″ hard drive holds most of the world’s data.

However, having all of your precious content on a single, spinning hard drive is a true tightrope without a net experience. Here’s why.

Drivesaver Failure Analysis by the Numbers

Drive failures by possible external force

I asked our friends at Drivesavers, specialists in recovering data from drives and other storage devices, for some analysis of the hard drives brought into their labs for recovery. What were the primary causes of failure?

Reason One: Media Damage

The number one reason, accounting for 70 percent of failures, is media damage, including full head crashes.

Modern hard drives stuff multiple, ultra thin platters inside that 3.5 inch metal package. These platters spin furiously at 5400 or 7200 revolutions per minute — that’s 90 or 120 revolutions per second! The heads that read and write magnetic data on them sweep back and forth only 6.3 micrometers above the surface of those platters. That gap is about 1/12th the width of a human hair and a miracle of modern technology to be sure. As you can imagine, a system with such close tolerances is vulnerable to sudden shock, as evidenced by Drivesavers’ results.

This damage occurs when the platters receive shock, i.e. physical damage from impact to the drive itself. Platters have been known to shatter, or have damage to their surfaces, including a phenomenon called head crash, where the flying heads slam into the surface of the platters. Whatever the cause, the thin platters holding 1s and 0s can’t be read.

It takes a surprisingly small amount of force to generate a lot of shock energy to a hard drive. I’ve seen drives fail after simply tipping over when stood on end. More typically, drives are accidentally pushed off of a desktop, or dropped while being carried around.

A drive might look fine after a drop, but the damage may have been done. Due to their rigid construction, heavy weight, and how often they’re dropped on hard, unforgiving surfaces, these drops can easily generate the equivalent of hundreds of g-forces to the delicate internals of a hard drive.

To paraphrase an old (and morbid) parachutist joke, it’s not the fall that gets you, it’s the sudden stop!

Reason Two: PCB Failure

The next largest cause is circuit board failure, accounting for 18 percent of failed drives. Printed circuit boards (PCBs), those tiny green boards seen on the underside of hard drives, can fail in the presence of moisture or static electric discharge like any other circuit board.

Reason Three: Stiction

Next up is stiction (a portmanteau of friction and sticking), which occurs when the armatures that drive those flying heads actually get stuck in place and refuse to operate, usually after a long period of disuse. Drivesavers found that stuck armatures accounted for 11 percent of hard drive failures.

It seems counterintuitive that hard drives sitting quietly in a dark drawer might actually contribute to their failure, but I’ve seen many older hard drives pulled from a drawer and popped into a drive carrier or connected to power just go thunk. It does appear that hard drives like to be connected to power and constantly spinning, and the numbers seem to bear this out.

Reason Four: Motor Failure

The last, and least common cause of hard drive failure, is hard drive motor failure, accounting for only 1 percent of failures, testament again to modern manufacturing precision and reliability.

Mitigating Hard Drive Failure Risk

So now that you’ve seen the gory numbers, here are a few recommendations to guard against the physical causes of hard drive failure.

1. Have a physical drive handling plan and follow it rigorously

If you must keep content on single hard drives in your location, make sure your team follows a few guidelines to protect against moisture, static electricity, and drops during drive handling. Keeping the drives in a dry location, storing the drives in static bags, using static discharge mats and wristbands, and putting rubber mats under areas where you’re likely to accidentally drop drives can all help.

It’s worth reviewing how you physically store drives, as well. Drivesavers tells us that the sudden impact of a heavy drawer of hard drives slamming home or being yanked open quickly can damage hard drives!

2. Spread failure risk across more drives and systems

Improving physical hard drive handling procedures is only a small part of a good risk-reducing strategy. You can immediately reduce the exposure of a single hard drive failure by simply keeping a copy of that valuable content on another drive. This is a common approach for videographers moving content from cameras shooting in the field back to their editing environment. By simply copying content over from one fast drive to another, you make it much less likely that both copies will fail at once. This is certainly better than keeping content on only a single drive, but definitely not a great long-term solution.

Multiple drive NAS and RAID systems reduce the impact of failing drives even further. A RAID 6 system composed of eight drives not only has much faster read and write performance than a single drive, but two of its drives can fail and still serve your files, giving you time to replace those failed drives.

Mitigating Data Corruption Risk

The Risk of Bit Flips

Beyond physical damage, there’s another threat to the files stored on hard disks: small, silent bit flip errors often called data corruption or bit rot.

Bit rot errors occur when individual bits in a stream of data in files change from one state to another (positive or negative, 0 to 1, and vice versa). These errors can happen to hard drive and flash storage systems at rest, or be introduced as a file is copied from one hard drive to another.

While hard drives automatically correct single-bit flips on the fly, larger bit flips can introduce a number of errors. This can either cause the program accessing them to halt or throw an error, or perhaps worse, lead you to think that the file with the errors is fine!

Bit Flip Errors by the Book

In a landmark study of data failures in large systems, Disk failures in the real world: What does an MTTF of 1,000,000 hours mean to you?, Bianca Schroeder and Garth A. Gibson reported that “a large number of the problems attributed to CPU and memory failures were triggered by parity errors, i.e. the number of errors is too large for the embedded error correcting code to correct them.”

Flash drives are not immune either. Bianca Schroeder recently published a similar study of flash drives, Flash Reliability in Production: The Expected and the Unexpected, and found that “…between 20-63% of drives experienced at least one of the (unrecoverable read errors) during the time it was in production. In addition, between 2-6 out of 1,000 drive days were affected.”

“These UREs are almost exclusively due to bit corruptions that ECC cannot correct. If a drive encounters a URE, the stored data cannot be read. This either results in a failed read in the user’s code, or if the drives are in a RAID group that has replication, then the data is read from a different drive.”

Exactly how prevalent bit flips are is a controversial subject, but if you’ve ever retrieved a file from an old hard drive or RAID system and see sparkles in video, corrupt document files, or lines or distortions in pictures, you’ve seen the results of these errors.

Protecting Against Bit Flip Errors

There are many approaches to catching and correcting bit flip errors. From a system designer’s standpoint, they usually involve some combination of multiple disk storage systems, multiple copies of content, data integrity checks and corrections (including error-correcting code memory), physical component redundancy, and a file system that can tie it all together.
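
As a small, concrete illustration of the integrity-check idea (not Backblaze’s internal implementation), you can record a checksum when you store a file and verify it later; any silent bit flip will show up as a mismatch, at which point you restore from a known-good copy:

```python
# Minimal bit-rot detection sketch: store a SHA-256 checksum alongside a file,
# then verify it later. A silent bit flip changes the digest and is detected
# (though this alone can't repair the file -- you still need a good copy).

import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def write_manifest(path: Path) -> None:
    # Store the expected digest next to the file.
    Path(str(path) + ".sha256").write_text(sha256_of(path))

def verify(path: Path) -> bool:
    # Recompute the digest and compare with what was recorded earlier.
    expected = Path(str(path) + ".sha256").read_text().strip()
    return sha256_of(path) == expected

# Usage (the file name is hypothetical):
# write_manifest(Path("project_final.mov"))
# ...months or years later...
# print("intact" if verify(Path("project_final.mov")) else "corrupted!")
```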

Backblaze has built such a system, and uses a number of techniques to detect and correct file degradation due to bit flips and deliver extremely high data durability and integrity, often in conjunction with Reed-Solomon erasure codes.

Thanks to the way object storage and Backblaze B2 works, files written to B2 are always retrieved exactly as you originally wrote them. If a file ever changes from the time you’ve written it, say, due to bit flip errors, it will either be reproduced from a redundant copy of your file, or even mathematically reconstructed with erasure codes.

So the simplest, and certainly least expensive way to get bit flip protection for the content sitting on your hard drives is to simply have another copy on cloud storage.

The Ideal Solution — Performance and Protection

With some thought, you can apply these protection steps to your environment and get the best of both worlds: the performance of your content on fast, local hard drives, and the protection of having a copy on object storage offsite with the ultimate data integrity.

The post The Shocking Truth — Managing for Hard Drive Failure and Data Corruption appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze Vaults: Zettabyte-Scale Cloud Storage Architecture

Post Syndicated from Brian Beach original https://www.backblaze.com/blog/vault-cloud-storage-architecture/

A lot has changed in the four years since Brian Beach wrote a post announcing Backblaze Vaults, our software architecture for cloud data storage. Just looking at how the major statistics have changed, we now have over 100,000 hard drives in our data centers instead of the 41,000 mentioned in the post video. We have three data centers (soon four) instead of one data center. We’re approaching one exabyte of data stored for our customers (almost seven times the 150 petabytes back then), and we’ve recovered over 41 billion files for our customers, up from the 10 billion in the 2015 post.

In the original post, we discussed having durability of seven nines. Shortly thereafter, it was upped to eight nines. In July of 2018, we took a deep dive into the calculation and found our durability closer to eleven nines (and went into detail on the calculations used to arrive at that number). And, as followers of our Hard Drive Stats reports will be interested in knowing, we’ve just started using our first 16 TB drives, which are twice the size of the biggest drives we used back at the time of this post — then a whopping eight TB.

We’ve updated the details here and there in the text from the original post that was published on our blog on March 11, 2015. We’ve left the original 135 comments intact, although some of them might be non sequiturs after the changes to the post. We trust that you will be able to sort out the old from the new and make sense of what’s changed. If not, please add a comment and we’ll be happy to address your questions.

— Editor

Storage Vaults form the core of Backblaze’s cloud services. Backblaze Vaults are not only incredibly durable, scalable, and performant, but they dramatically improve availability and operability, while still being incredibly cost-efficient at storing data. Back in 2009, we shared the design of the original Storage Pod hardware we developed; here we’ll share the architecture and approach of the cloud storage software that makes up a Backblaze Vault.

Backblaze Vault Architecture for Cloud Storage

The Vault design follows the overriding design principle that Backblaze has always followed: keep it simple. As with the Storage Pods themselves, the new Vault storage software relies on tried and true technologies used in a straightforward way to build a simple, reliable, and inexpensive system.

A Backblaze Vault is the combination of the Backblaze Vault cloud storage software and the Backblaze Storage Pod hardware.

Putting The Intelligence in the Software

Another design principle for Backblaze is to anticipate that all hardware will fail and build intelligence into our cloud storage management software so that customer data is protected from hardware failure. The original Storage Pod systems provided good protection for data and Vaults continue that tradition while adding another layer of protection. In addition to leveraging our low-cost Storage Pods, Vaults take advantage of the cost advantage of consumer-grade hard drives and cleanly handle their common failure modes.

Distributing Data Across 20 Storage Pods

A Backblaze Vault is comprised of 20 Storage Pods, with the data evenly spread across all 20 pods. Each Storage Pod in a given vault has the same number of drives, and the drives are all the same size.

Drives in the same drive position in each of the 20 Storage Pods are grouped together into a storage unit we call a tome. Each file is stored in one tome and is spread out across the tome for reliability and availability.

20 hard drives create one tome that shares parts of a file.

Every file uploaded to a Vault is divided into pieces before being stored. Each of those pieces is called a shard. Parity shards are computed to add redundancy, so that a file can be fetched from a vault even if some of the pieces are not available.

Each file is stored as 20 shards: 17 data shards and three parity shards. Because those shards are distributed across 20 Storage Pods, the Vault is resilient to the failure of a Storage Pod.

Files can be written to the Vault when one pod is down and still have two parity shards to protect the data. Even in the extreme and unlikely case where three Storage Pods in a Vault lose power, the files in the vault are still available because they can be reconstructed from the 17 pods that remain available.
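
To make that arithmetic concrete, here is a tiny sketch of the availability rules just described. It is my own illustration of the 17-plus-3 layout and the two-parity write policy, not actual Vault code:

```python
# Toy model of the 17 data + 3 parity layout described above.
# Not Backblaze's Vault software -- just the availability arithmetic.

DATA_SHARDS = 17
PARITY_SHARDS = 3
TOTAL_SHARDS = DATA_SHARDS + PARITY_SHARDS   # 20, one per Storage Pod
MIN_PARITY_FOR_WRITES = 2                    # the write policy described above

def can_read(available_pods: int) -> bool:
    # Any 17 of the 20 shards are enough to reconstruct a file.
    return available_pods >= DATA_SHARDS

def can_write(available_pods: int) -> bool:
    # Writes proceed only while at least two parity shards can still be stored.
    return available_pods >= DATA_SHARDS + MIN_PARITY_FOR_WRITES

for pods_down in range(5):
    pods_up = TOTAL_SHARDS - pods_down
    print(f"{pods_down} pod(s) down: "
          f"read={can_read(pods_up)}, write={can_write(pods_up)}")
# Reads survive up to 3 pods being down; writes are accepted with up to 1 down.
```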

Storing Shards

Each of the drives in a Vault has a standard Linux file system, ext4, on it. This is where the shards are stored. There are fancier file systems out there, but we don’t need them for Vaults. All that is needed is a way to write files to disk and read them back. Ext4 is good at handling power failure on a single drive cleanly without losing any files. It’s also good at storing lots of files on a single drive and providing efficient access to them.

Compared to a conventional RAID, we have swapped the layers here by putting the file systems under the replication. Usually, RAID puts the file system on top of the replication, which means that a file system corruption can lose data. With the file system below the replication, a Vault can recover from a file system corruption because a single corrupt file system can lose at most one shard of each file.

Creating Flexible and Optimized Reed-Solomon Erasure Coding

Just like RAID implementations, the Vault software uses Reed-Solomon erasure coding to create the parity shards. But, unlike Linux software RAID, which offers just one or two parity blocks, our Vault software allows for an arbitrary mix of data and parity. We are currently using 17 data shards plus three parity shards, but this could be changed on new vaults in the future with a simple configuration update.

Vault Row of Storage Pods

For Backblaze Vaults, we threw out the Linux RAID software we had been using and wrote a Reed-Solomon implementation from scratch, which we wrote about in Backblaze Open Sources Reed-Solomon Erasure Coding Source Code. It was exciting to be able to use our group theory and matrix algebra from college.

The beauty of Reed-Solomon is that we can then re-create the original file from any 17 of the shards. If one of the original data shards is unavailable, it can be re-computed from the other 16 original shards, plus one of the parity shards. Even if three of the original data shards are not available, they can be re-created from the other 17 data and parity shards. Matrix algebra is awesome!

Handling Drive Failures

The reason for distributing the data across multiple Storage Pods and using erasure coding to compute parity is to keep the data safe and available. How are different failures handled?

If a disk drive just up and dies, refusing to read or write any data, the Vault will continue to work. Data can be written to the other 19 drives in the tome, because the policy setting allows files to be written as long as there are two parity shards. All of the files that were on the dead drive are still available and can be read from the other 19 drives in the tome.

Building a Backblaze Vault Storage Pod

When a dead drive is replaced, the Vault software will automatically populate the new drive with the shards that should be there; they can be recomputed from the contents of the other 19 drives.

A Vault can lose up to three drives in the same tome at the same moment without losing any data, and the contents of the drives will be re-created when the drives are replaced.

Handling Data Corruption

Disk drives try hard to correctly return the data stored on them, but once in a while they return the wrong data, or are just unable to read a given sector.

Every shard stored in a Vault has a checksum, so that the software can tell if it has been corrupted. When that happens, the bad shard is recomputed from the other shards and then re-written to disk. Similarly, if a shard just can’t be read from a drive, it is recomputed and re-written.

Conventional RAID can reconstruct a drive that dies, but does not deal well with corrupted data because it doesn’t checksum the data.

Scaling Horizontally

Each vault is assigned a number. We carefully designed the numbering scheme to allow for a lot of vaults to be deployed, and designed the management software to handle scaling up to that level in the Backblaze data centers.

The overall design scales very well because file uploads (and downloads) go straight to a vault, without having to go through a central point that could become a bottleneck.

There is an authority server that assigns incoming files to specific Vaults. Once that assignment has been made, the client then uploads data directly to the Vault. As the data center scales out and adds more Vaults, the capacity to handle incoming traffic keeps going up. This is horizontal scaling at its best.

We could deploy a new data center with 10,000 Vaults holding 16TB drives and it could accept uploads fast enough to reach its full capacity of 160 exabytes in about two months!
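
That estimate checks out on the back of an envelope using only figures from this post (20 Gbps of ingest per Vault, described below, and 10,000 Vaults of 16TB drives); decimal units are assumed:

```python
# Rough check of the "fill 160 EB in about two months" claim, using the
# per-Vault ingest figure quoted later in this post (20 Gbps per Vault).

VAULTS = 10_000
GBPS_PER_VAULT = 20
CAPACITY_EXABYTES = 160

ingest_bytes_per_sec = VAULTS * GBPS_PER_VAULT * 1e9 / 8   # bits -> bytes
capacity_bytes = CAPACITY_EXABYTES * 1e18

seconds = capacity_bytes / ingest_bytes_per_sec
print(f"{seconds / 86_400:.0f} days (~{seconds / (30 * 86_400):.1f} months)")
# Roughly 74 days, i.e. in the ballpark of the "about two months" above.
```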

Backblaze Vault Benefits

The Backblaze Vault architecture has six benefits:

1. Extremely Durable

The Vault architecture is designed for 99.999999% (eight nines) annual durability (now 11 nines — Editor). At cloud-scale, you have to assume hard drives die on a regular basis, and we replace about 10 drives every day. We have published a variety of articles sharing our hard drive failure rates.

The beauty with Vaults is that not only does the software protect against hard drive failures, it also protects against the loss of entire Storage Pods or even entire racks. A single Vault can have three Storage Pods — a full 180 hard drives — die at the exact same moment without a single byte of data being lost or even becoming unavailable.
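
For the curious, here is a deliberately simplified model of how a durability figure like that can be estimated. It is a toy sketch, not the methodology behind Backblaze’s published number; the failure rate and rebuild window are assumptions chosen only for illustration.

```python
# Toy durability estimate for a 17+3 tome: data is lost only if 4 or more of
# the 20 drives holding a file's shards fail within one repair window.
# The AFR and rebuild-window values are illustrative assumptions; this is NOT
# the methodology behind Backblaze's published durability figure.

from math import comb

TOTAL = 20            # shards (and drives) per tome
TOLERATED = 3         # parity shards: up to three losses are survivable
AFR = 0.01            # assumed 1% annualized drive failure rate
REBUILD_DAYS = 6.5    # assumed time to detect, replace, and rebuild a drive

p = AFR * REBUILD_DAYS / 365   # chance a given drive fails within one window
loss_per_window = sum(
    comb(TOTAL, k) * p**k * (1 - p) ** (TOTAL - k)
    for k in range(TOLERATED + 1, TOTAL + 1)
)
windows_per_year = 365 / REBUILD_DAYS
annual_loss = 1 - (1 - loss_per_window) ** windows_per_year

print(f"P(data loss) per rebuild window: {loss_per_window:.2e}")
print(f"Modeled annual durability:       {1 - annual_loss:.12f}")
# With these assumptions the annual loss probability is on the order of 1e-10,
# i.e. nine to ten nines; the real calculation, with measured failure rates
# and operational detail, lands at the eleven nines cited above.
```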

2. Infinitely Scalable

A Backblaze Vault is comprised of 20 Storage Pods, each with 60 disk drives, for a total of 1200 drives. Depending on the size of the hard drive, each vault will hold:

12TB hard drives => 12.1 petabytes/vault (Deploying today.)
14TB hard drives => 14.2 petabytes/vault (Deploying today.)
16TB hard drives => 16.2 petabytes/vault (Small-scale testing.)
18TB hard drives => 18.2 petabytes/vault (Announced by WD & Toshiba)
20TB hard drives => 20.2 petabytes/vault (Announced by Seagate)

Backblaze Data Center

At our current growth rate, Backblaze deploys one to three Vaults each month. As the growth rate increases, the deployment rate will also increase. We can incrementally add more storage by adding more and more Vaults. Without changing a line of code, the current implementation supports deploying 10,000 Vaults per location. That’s 160 exabytes of data in each location. The implementation also supports up to 1,000 locations, which enables storing a total of 160 zettabytes! (Also known as 160,000,000,000,000 GB.)

3. Always Available

Data backups have always been highly available: if a Storage Pod was in maintenance, the Backblaze online backup application would contact another Storage Pod to store data. Previously, however, if a Storage Pod was unavailable, some restores would pause. For large restores this was not an issue since the software would simply skip the Storage Pod that was unavailable, prepare the rest of the restore, and come back later. However, for individual file restores and remote access via the Backblaze iPhone and Android apps, it became increasingly important to have all data be highly available at all times.

The Backblaze Vault architecture enables both data backups and restores to be highly available.

With the Vault arrangement of 17 data shards plus three parity shards for each file, all of the data is available as long as 17 of the 20 Storage Pods in the Vault are available. This keeps the data available while allowing for normal maintenance and rare expected failures.

4. Highly Performant

The original Backblaze Storage Pods could individually accept 950 Mbps (megabits per second) of data for storage.

The new Vault pods have more overhead, because they must break each file into pieces, distribute the pieces across the local network to the other Storage Pods in the vault, and then write them to disk. In spite of this extra overhead, the Vault is able to achieve 1,000 Mbps of data arriving at each of the 20 pods.

Backblaze Vault Networking

This capacity required a new type of Storage Pod that could handle this volume. The net of this: a single Vault can accept a whopping 20 Gbps of data.

Because there is no central bottleneck, adding more Vaults linearly adds more bandwidth.

5. Operationally Easier

When Backblaze launched in 2008 with a single Storage Pod, many of the operational analyses (e.g. how to balance load) could be done on a simple spreadsheet and manual tasks (e.g. swapping a hard drive) could be done by a single person. As Backblaze grew to nearly 1,000 Storage Pods and over 40,000 hard drives, the systems we developed to streamline and operationalize the cloud storage became more and more advanced. However, because our system relied on Linux RAID, there were certain things we simply could not control.

With the new Vault software, we have direct access to all of the drives and can monitor their individual performance and any indications of upcoming failure. And, when those indications say that maintenance is needed, we can shut down one of the pods in the Vault without interrupting any service.

6. Astoundingly Cost Efficient

Even with all of these wonderful benefits that Backblaze Vaults provide, if they raised costs significantly, it would be nearly impossible for us to deploy them since we are committed to keeping our online backup service affordable for completely unlimited data. However, the Vault architecture is nearly cost neutral while providing all these benefits.

Backblaze Vault Cloud Storage

When we were running on Linux RAID, we used RAID6 over 15 drives: 13 data drives plus two parity. That’s 15.4% storage overhead for parity.

With Backblaze Vaults, we wanted to be able to do maintenance on one pod in a vault and still have it be fully available, both for reading and writing. And, for safety, we weren’t willing to have fewer than two parity shards for every file uploaded. Using 17 data plus three parity drives raises the storage overhead just a little bit, to 17.6%, but still gives us two parity drives even in the infrequent times when one of the pods is in maintenance. In the normal case when all 20 pods in the Vault are running, we have three parity drives, which adds even more reliability.
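
Those overhead percentages fall straight out of the shard ratios; here is a two-line check, assuming overhead is counted as parity drives relative to data drives:

```python
# Parity overhead expressed as parity drives divided by data drives.
print(f"RAID6 (13 data + 2 parity): {2 / 13:.1%} overhead")   # 15.4%
print(f"Vault (17 data + 3 parity): {3 / 17:.1%} overhead")   # 17.6%
```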

Summary

Backblaze’s cloud storage Vaults deliver 99.999999% (eight nines) annual durability (now 11 nines — Editor), horizontal scalability, and 20 Gbps of per-Vault performance, while being operationally efficient and extremely cost effective. Driven from the same mindset that we brought to the storage market with Backblaze Storage Pods, Backblaze Vaults continue our singular focus of building the most cost-efficient cloud storage available anywhere.

•  •  •

Note: This post was updated from the original version posted on March 11, 2015.

The post Backblaze Vaults: Zettabyte-Scale Cloud Storage Architecture appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Connect Veeam to the B2 Cloud: Episode 4 — Using Morro Data CloudNAS

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/connect-veeam-to-the-b2-cloud-episode-4-using-morro-data-cloudnas/

Veeam backup to Backblaze B2 Episode 4 of Series

In the fourth post in our series on connecting Veeam with B2, we provide a guide on how to back up your VMs to Backblaze B2 using Veeam and Morro Data’s CloudNAS. In our previous posts, we covered how to connect Veeam to the B2 cloud using OpenDedupe, connect Veeam to the B2 cloud using Synology, and connect Veeam with B2 using StarWind VTL.

VM Backup to B2 Using Veeam Backup & Replication and Morro Data CloudNAS

We are glad to show how Veeam Backup & Replication can work with Morro Data CloudNAS to keep the more recent backups on premises for fast recovery while archiving all backups in B2 Cloud Storage. CloudNAS not only caches the more recent backup files, but also simplifies the management of B2 Cloud Storage with a network share or drive letter interface.

–Paul Tien, Founder & CEO, Morro Data

VM backup and recovery is a critical part of IT operations that supports business continuity. Traditionally, IT has deployed an array of purpose-built backup appliances and applications to protect against server, infrastructure, and security failures. As VMs continue to spread in production, development, and verification environments, the expanding VM backup repository has become a major challenge for system administrators.

Because the VM backup footprint is usually quite large, cloud storage is increasingly being deployed for VM backup. However, cloud storage does not achieve the same performance level as on-premises storage for recovery operations. For this reason, cloud storage has been used as a tiered repository behind on-premises storage.

diagram of Veeam backing up to B2 using Cloudflare and Morro Data CloudNAS

In this best practice guide, VM Backup to B2 Using Veeam Backup & Replication and Morro Data CloudNAS, we will show how Veeam Backup & Replication can work with Morro Data CloudNAS to keep the most recent backups on premises for fast recovery while archiving all backups in the retention window in Backblaze B2 Cloud Storage. CloudNAS caching not only provides a buffer for the most recent backup files, but also simplifies the management of on-premises storage and cloud storage as an integrated backup repository.

Tell Us How You’re Backing Up Your VMs

If you’re backing up VMs to B2 using one of the solutions we’ve written about in this series, we’d like to hear from you in the comments about how it’s going.

View all posts in the Veeam series.

The post Connect Veeam to the B2 Cloud: Episode 4 — Using Morro Data CloudNAS appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Survey Says: Cloud Storage Makes Strong Gains for Media & Entertainment

Post Syndicated from Janet Lafleur original https://www.backblaze.com/blog/cloud-storage-makes-strong-gains-in-media-entertainment/

Survey Reveals Growing Adoption of Cloud Storage by Media & Entertainment

Where Does the Media Industry Really Use Cloud Storage?

Our new cloud survey results might surprise you.

Predicting which promising new technologies will be adopted quickly, which ones will take longer, and which ones will fade away is not always easy. When the iPhone was introduced in 2007, only 6% of the US population had smartphones. In less than 10 years, over 80% of Americans owned smartphones. In contrast, video telephone calls demonstrated at the 1964 New York World’s Fair only became commonplace 45 years later with the advent of FaceTime. And those flying cars people have dreamed of since the 1950s? Don’t hold your breath.

What about cloud storage? Who is adopting it today and for what purposes?

“While M&E professionals are not abandoning existing storage alternatives, they increasingly see the public cloud in storage applications as simply another professional tool to achieve their production, distribution, and archiving goals. For the future, that trend looks to continue as the public cloud takes on an even greater share of their overall storage requirements.”

— Phil Kurz, contributing editor, TV Technology

At Backblaze, we have a front-line view of how customers use the cloud for storage. And based on the media-oriented customers we’ve worked with directly to integrate cloud storage, we know they’re using it throughout the workflow: backing up files during content creation (UCSC Silicon Valley), managing production storage more efficiently (WunderVu), archiving historical content libraries (Austin City Limits), hosting media files for download (American Public Television), and even editing cloud-based video (Everwell).

We wanted to understand more about how the broader industry uses cloud storage and their beliefs and concerns about it, so we could better serve the needs of our current customers and anticipate what their needs will be in the future.

We decided to sponsor an in-depth survey with TV Technology, a media company that has been an authority on news, analysis, and trend reports for the media and entertainment industries for over 30 years. TV Technology had conducted a similar survey in 2015, so we thought it would be interesting to see how the industry outlook has evolved. Based on our 2019 results, it certainly has. As a quick example, security was a concern for 71% of respondents in 2015; this year, only 38% selected security as an issue at all.

Survey Methodology — 246 Respondents and 15 Detailed Questions

For the survey, TV Technology queried 246 respondents, primarily from production and post-production studios and broadcasters, but also other market segments including corporate video, government, and education. See chart below for the breakdown. Respondents were asked 15 questions about their cloud storage usage today and in the future, and for what purpose. The survey queried what motivated their move to the cloud, their expectations for access times and cost, and any obstacles that are preventing further cloud adoption.

Types of businesses responding to survey

Survey Insights — Half Use Public Cloud Today — Cloud the Top Choice for Archive

Overall, the survey reveals growing cloud adoption among media organizations that want to improve production efficiency and reduce costs. Key findings from the report include:

  • On the whole, about half of the respondents from all organization types are using public cloud services. Sixty-four percent of production/post studio respondents say they currently use the cloud. Broadcasters report lower adoption, with only 26 percent using the public cloud.
  • Achieving greater efficiency in production was cited by all respondents as the top reason for adopting the cloud. However, while this is also important to broadcasters, their top motivator for cloud use is cost containment or internal savings programs.
  • Cloud storage is clearly the top choice for archiving media assets, with 70 percent choosing the public cloud for active, deep, or very deep archive needs.
  • Concerns over the security of assets stored in a public cloud remain; however, they have eased considerably since the 2015 report, so much so that security is no longer the top obstacle to cloud adoption. For 40 percent of respondents, pricing has replaced security as the top concern.

These insights only scratch the surface of the survey’s findings, so we’re making the full 12-page report available to everyone. To get a deeper look and compare your experiences with those of your peers as a content creator or content owner, download and read Cloud Storage Technologies Establish Their Place Among Alternatives for Media today.

How are you using cloud storage today? How do you think that will change three years from now? Please tell us in the comments.

The post Survey Says: Cloud Storage Makes Strong Gains for Media & Entertainment appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.