Tag Archives: Western Digital

The Helium Factor and Hard Drive Failure Rates

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/helium-filled-hard-drive-failure-rates/

Seagate Enterprise Capacity 3.5 Helium HDD

In November 2013, the first commercially available helium-filled hard drive was introduced by HGST, a Western Digital subsidiary. The 6 TB drive was not only unique in being helium-filled, it was, for the moment, the highest-capacity hard drive available. Fast forward a little over four years and 12 TB helium-filled drives are readily available, 14 TB drives can be found, and 16 TB helium-filled drives are arriving soon.

Backblaze has been purchasing and deploying helium-filled hard drives over the past year, and we thought it was time to start looking at their failure rates compared to traditional air-filled drives. This post provides an overview; we’ll continue the comparison on a regular basis over the coming months.

The Promise and Challenge of Helium Filled Drives

We all know that helium is lighter than air — that’s why helium-filled balloons float. Inside an air-filled hard drive are rapidly spinning disk platters that rotate at a given speed, 7,200 RPM for example. The air inside creates an appreciable amount of drag on the platters, which in turn requires additional energy to keep them spinning. Replacing the air inside a hard drive with helium reduces that drag, thereby reducing the amount of energy needed to spin the platters, typically by 20%.

We also know that after a few days a helium-filled balloon sinks to the ground. This was one of the key challenges in using helium inside a hard drive: helium escapes from most containers, even well-sealed ones. It took years for hard drive manufacturers to create an enclosure that could retain the helium while still functioning as a hard drive. This innovation allows helium-filled drives to function at spec over the course of their lifetime.

Checking for Leaks

Three years ago, we identified SMART 22 as the attribute assigned to recording the status of the helium inside a hard drive. We have both HGST and Seagate helium-filled hard drives, but only the HGST drives currently report the SMART 22 attribute. It appears the normalized and raw values for SMART 22 currently report the same value, which starts at 100 and goes down.

To date only one HGST drive has reported a value of less than 100, with multiple readings between 94 and 99. That drive continues to perform fine, with no other errors or any correlating changes in temperature, so we are not sure whether the change in value is trying to tell us something or if it is just a wonky sensor.
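
If you want to poke at this yourself using the drive stats data we publish, the sketch below shows one way to scan a daily snapshot for drives reporting a helium level below 100. The file name and the smart_22_normalized / smart_22_raw column names are assumptions based on the published data files; adjust them to match your copy of the data.

```python
import csv

# Scan one daily snapshot from the published drive stats data and flag any
# drive whose SMART 22 (helium level) value has dropped below 100.
# The column names and file name here are assumptions for this sketch.
SNAPSHOT = "2018-03-31.csv"

with open(SNAPSHOT, newline="") as f:
    for row in csv.DictReader(f):
        value = row.get("smart_22_normalized") or row.get("smart_22_raw")
        if not value:
            continue  # this drive doesn't report SMART 22 (e.g., our Seagate helium models)
        if int(value) < 100:
            print(f"{row['serial_number']} ({row['model']}): SMART 22 = {value}")
```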

Helium versus Air-Filled Hard Drives

There are several ways to compare these two types of drives. Below, we use just our 8, 10, and 12 TB drives, since those are the sizes in which we have helium-filled models. We left out all of the drives that are 6 TB and smaller, as none of the drive models we use at those sizes are helium-filled. We are open to trying different comparisons; this just seemed to be the best place to start.

Lifetime Hard Drive Failure Rates: Helium vs. Air-Filled Hard Drives table

The most obvious observation is that there seems to be little difference in the Annualized Failure Rate (AFR) based on whether a drive contains helium or air. One conclusion, given this evidence, is that helium makes no difference to the AFR of a hard drive. My prediction, however, is that the helium drives will eventually prove to have a lower AFR. Why? Drive Days.

Let’s go back in time to Q1 2017, when the air-filled drives listed in the table above had accumulated roughly the same number of Drive Days as the helium drives have today. At that point (Q1 2017) the failure rate for the air-filled drives was 1.61%. In other words, when the two groups had been in use for a similar number of hours, the helium drives had a failure rate of 1.06% while the air-filled drives had a failure rate of 1.61%.

Helium or Air?

My hypothesis is that after normalizing the data so that the helium and air-filled drives have the same (or similar) usage (Drive Days), the helium-filled drives we use will continue to have a lower Annualized Failure Rate than the air-filled drives we use. I expect this trend to continue for the next year at least. What side do you come down on? Will the Annualized Failure Rate for helium-filled drives be better than that of air-filled drives, or vice versa? Or do you think the two technologies will eventually produce the same AFR over time? Pick a side, and we’ll document the results over the next year and see where the data takes us.

The post The Helium Factor and Hard Drive Failure Rates appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

A History of Hard Drives

Post Syndicated from Peter Cohen original https://www.backblaze.com/blog/history-hard-drives/

2016 marks the 60th anniversary of the venerable Hard Disk Drive (HDD). While new computers increasingly turn to Solid State Disks (SSDs) for main storage, HDDs remain the champions of low-cost, high-capacity data storage. That’s a big reason why we still use them in our Storage Pods. Let’s take a spin in the Wayback Machine and look at the history of hard drives, and think about what the future might hold.

It Started With RAMAC

The IBM 305 RAMAC

IBM made the first commercial hard disk drive-based computer and called it RAMAC – short for “Random Access Method of Accounting And Control.” Its storage system was called the IBM 350. RAMAC was big – it required an entire room to operate. The hard disk drive storage system alone was about the size of two refrigerators, with a stack of fifty 24-inch platters inside.

For that, RAMAC customers ended up with less than 5 MB – that’s right, megabytes – of storage. IBM’s marketing people didn’t want RAMAC to store any more data than that; they had no idea how to convince customers they’d ever need more.

IBM customers forked over $3,200 for the privilege of accessing and storing that information. A MONTH. (IBM leased its systems.) That’s equivalent to almost $28,000 per month in 2016.

Sixty years ago, data storage cost $640 per megabyte, per month. At IBM’s 1956 rates for storage, a new iPhone 7 would cost you about $20.5 million a month. RAMAC was a lot harder to stick in your pocket, too.

Plug and Play

These days you can fit 2 TB onto an SD card the size of a postage stamp, but half a century ago, it was a very different story. IBM continued to refine early hard disk drive storage, but systems were still big and bulky.

By the early 1960s, IBM’s mainframe customers were hungry for more storage capacity, but they simply didn’t have the room to keep installing refrigerator-sized storage devices. So the smart folks at IBM came up with a solution: Removable storage.

The IBM 1311 Disk Storage Drive, introduced in 1962, gave rise to the use of IBM 1316 “Disk Packs” that let IBM’s mainframe customers expand their storage capacity as much as they needed (or could afford). IBM shrank the size of the disks dramatically, from 24 inches in diameter down to 14 inches. The 9-pound disk packs fit into a device about the size of a modern washing machine. Each pack could hold about 2 MB.

The IBM 1311 Disk Storage Drive

For my part, I remember touring a data center as a kid in the mid-1970s and seeing removable IBM disk packs up close. They were about the size and shape of something you’d use to carry a birthday cake: large, sealed plastic containers with handles on the top.

Computers had pivoted from expensive curiosities in the business world to increasingly essential devices needed to get work done. IBM’s System/360 proved to be an enormously popular and influential mainframe computer. IBM created different models but needed flexible storage across the 360 product line. So IBM created a standard hard disk device interconnect. Other manufacturers adopted the technology, and a cottage industry was born: Third-party hard disk drive storage.

The PC Revolution

Up until the 1970s, computers were huge, expensive, very specialized devices only the biggest businesses, universities and government institutions could afford. The dropping price of electronic components, the increasing density of memory chips and other factors gave rise to a brand new industry: The personal computer.

Initially, personal computers had very limited, almost negligible storage capabilities. Some used perforated paper tape for storage. Others used audio cassettes. Eventually, personal computers would write data to floppy disk drives. And over time, the cost of hard disk drives fell enough that PC users could have one, too.

A Winchester-style hard disk drive

In 1980, a young upstart company named Shugart Technology introduced a 5 MB hard disk drive designed to fit into personal computers of the day. It was a scant 5.25 inches in diameter. The drive cost $1,500. It would prove popular enough to become a de facto standard for PCs throughout the 1980s. Shugart changed its name to Seagate Technology. Yep. That Seagate.

In the space of 25 years, hard drive technology had shrunk from a device the size of a refrigerator to something less than 6 inches in diameter. And that would be nothing compared to what was to come in the next 25 years.

The Advent of RAID

An important chapter in Backblaze’s backstory appears in the late 1980s when three computer scientists from U.C. Berkeley coined the term “RAID” in a research paper presented at the SIGMOD conference, an annual event which still happens today.

RAID is an acronym that stands for “Redundant Array of Inexpensive Disks.” The idea is that you can take several discrete storage devices – hard disk drives, in this case – and combine them into a single logical unit. Dividing the work of writing and reading data between multiple devices can make data move faster. It can also reduce the likelihood that you’ll lose data.
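
As a toy illustration of the idea, and only that (it isn’t how any particular RAID level or our Storage Pods are implemented), here is a sketch of striping data across several “drives” with a single XOR parity block, which is enough to rebuild any one lost chunk from the survivors:

```python
from functools import reduce

def stripe_with_parity(data, data_drives):
    """Split data into equal chunks, one per drive, plus one XOR parity chunk."""
    chunk_len = -(-len(data) // data_drives)                     # ceiling division
    padded = data.ljust(chunk_len * data_drives, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(data_drives)]
    parity = bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))
    return chunks + [parity]

def rebuild_missing(chunks):
    """Recompute the single missing chunk (data or parity) by XOR-ing the rest."""
    missing = chunks.index(None)
    survivors = [c for c in chunks if c is not None]
    chunks[missing] = bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*survivors))
    return chunks

drives = stripe_with_parity(b"hello, storage pod", data_drives=3)
drives[1] = None                                                 # simulate one failed drive
restored = b"".join(rebuild_missing(drives)[:3]).rstrip(b"\0")
assert restored == b"hello, storage pod"
```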

The Berkeley researchers weren’t the first to come up with the idea, which had bounced around since the 1970s. They did coin the acronym that we still use today.

60 hard drives in a Backblaze Storage Pod

RAID is vitally important for Backblaze: it’s how we build our Storage Pods. Our latest Storage Pod design incorporates 60 individual hard drives assembled in 4 RAID arrays. Backblaze then took the concept a step further by implementing our own Reed-Solomon erasure coding mechanism to work across our Backblaze Vaults.

With our latest Storage Pod design we’ve been able to squeeze 480 TB into a single chassis that occupies 4U of rack space, or about 7 inches of vertical height in an equipment rack. That’s a far cry from RAMAC’s 5 MB of refrigerator-sized storage. 96 million times more storage, in fact.

Bigger, Better, Faster, More

Throughout the 1980s and 1990s, hard drive and PC makers innovated and changed the market irrevocably. 5.25-inch drives soon gave way to 3.5-inch drives (we at Backblaze still use 3.5-inch drives designed for modern desktop computers in our Storage Pods). When laptops gained in popularity, drives shrank again, to 2.5 inches. If you’re using a laptop that has a hard drive today, chances are it’s a 2.5-inch model.

The need for better, faster, more reliable and flexible storage also gave rise to different interfaces: IDE, SCSI, ATA, SATA, PCIe. Drive makers improved performance by increasing the spindle speed, the speed of the motor that turns the hard drive’s platters. 5,400 revolutions per minute (RPM) was standard, but 7,200 RPM yielded better performance. Seagate, Western Digital, and others upped the ante by introducing 10,000-RPM and eventually 15,000-RPM drives.

IBM pioneered the commercial hard drive and brought countless hard disk drive innovations to market over the decades. In 2003, IBM sold its hard disk drive business to Hitachi. The many Hitachi drives we use here at Backblaze can trace their lineage back to IBM.

Solid State Drives

Even as hard drives found a place in early computer systems, RAM-based storage systems were also being created. The prohibitively high cost of computer memory, along with its complexity, size, and need to stay powered to retain data, prevented memory-based storage from catching on in any meaningful way, though some very specialized, expensive systems did find use in the supercomputing and mainframe markets.

Eventually non-volatile RAM became fast, reliable, and inexpensive enough that SSDs could be mass-produced, but progress came by degrees, and early SSDs were incredibly expensive. By the early 1990s, you could buy a 20 MB SSD for a PC for $1,000, or about $50 per megabyte. By comparison, the cost of a spinning hard drive had dropped below $1 per megabyte, and would plummet even further.

Close-up of a solid state drive

The real breakthrough happened with the introduction of flash-based SSDs. By the mid-2000s, Samsung, SanDisk and others brought to market flash SSDs that acted as drop-in replacements for hard disk drives. SSDs have since gotten faster, smaller, and more plentiful. Now PCs, Macs, and smartphones all include flash storage of all shapes and sizes, and will continue to move in that direction. It’s little wonder: SSDs provide better performance and better power efficiency, and they enable thinner, lighter computer designs.

The venerable spinning hard drive, now 60 years old, still rules the roost when it comes to cost per gigabyte. SSD makers are getting closer to parity with hard drives, but they’re still years away from hitting that point. An old-fashioned spinning hard drive still gives you the best bang for your buck.

We can dream, though. Over the summer our Andy Klein got to wondering what Seagate’s new 60 TB SSD might look like in one of our Storage Pods. He had to guess at the price but based on current market estimates, an SSD-based 60-drive Storage Pod would cost Backblaze about $1.2 million.

Andy didn’t make any friends in Backblaze’s Accounting department with that news, so it’s probably not going to happen any time soon.

The Future

As computers and mobile devices have pivoted from hard drives to SSDs, it’s easy to dismiss the hard drive as a legacy technology that will soon fall by the wayside. I’d encourage some circumspection, though. It seems every few years someone declares the hard drive dead, and meanwhile hard drive makers keep finding ways to stay relevant.

There’s no question that the hard drive market is in a period of decline and transition. Hard disk drive sales are down year-over-year as consumers switch to SSDs or move away from Macs and PCs altogether, doing more of their work on mobile devices.

Regardless, innovation and development of hard drives continue apace. We’re populating our own Storage Pods with 8 TB hard drives. 10 TB hard drives are already shipping, and even higher-capacity 3.5-inch drives are on the horizon.

Hard drive makers constantly improve areal density – the amount of information you can physically cram onto a disk. They’ve also found ways to get more platters into a single drive mechanism and then fill it with helium. This sadly does not make the drive float, dashing my fantasies of creating a Backblaze data center blimp.

So is SSD the only future for data storage? Not for a while. Seagate still firmly believes in the future of hard drives. Its CFO estimates that hard drives will be around for another 15-20 years. Researchers predict that hard drives coming to market over the next decade will store an order of magnitude more data than they do now – 100 TB or more.

Think it’s out of the question? Imagine handing a 10 TB hard drive to a RAMAC operator in 1956 and telling them that the 3.5-inch device in their hands holds two million times more data than that big box in front of them. They’d think you were nuts.

The post A History of Hard Drives appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Hard Drive Stats for Q2 2016

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/hard-drive-failure-rates-q2-2016/

Hard Drive Reliability

Q2 2016 saw Backblaze introduce 8TB drives into our drive mix, kick off a pod-to-vault migration of over 6.5 Petabytes of data, cross over 250 Petabytes of data stored, and deploy another 7,290 drives into the data center, for a total of 68,813 spinning hard drives under management. With all the ins and outs, let’s take a look at how our hard drives fared in Q2 2016.

Backblaze hard drive reliability for Q2 2016

Below is the hard drive failure data for Q2 2016. This chart covers just that quarter. The hard drive models listed are data drives (not boot drives), and we only list models with 45 or more drives deployed.

Q2 2016 Hard Drive Failure Rates

A couple of observations on the chart:

  1. The models that have an annualized failure rate of 0.00% had zero hard drive failures in Q2 2016.
  2. The annualized failure rate is computed as follows: ((Failures)/(Drive Days/365)) * 100. Therefore, consider the number of “Failures” and “Drive Days” before reaching any conclusions about the failure rate; a small worked example of the computation follows this list.
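
Here is that computation as a small sketch. The failure and drive-day counts are invented for the example, not taken from the chart; the point is how quickly the rate moves when the number of Drive Days is small.

```python
def annualized_failure_rate(failures, drive_days):
    """AFR (%) = (Failures / (Drive Days / 365)) * 100"""
    return failures / (drive_days / 365) * 100

# Hypothetical model: 4,500 drives running for a full quarter (~91 days each)
# with 12 failures over that period.
print(round(annualized_failure_rate(failures=12, drive_days=4500 * 91), 2))   # 1.07

# The same 12 failures over one tenth the Drive Days gives ten times the AFR,
# which is why small populations need to be read with care.
print(round(annualized_failure_rate(failures=12, drive_days=450 * 91), 2))    # 10.7
```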

Later in this post we’ll review the cumulative statistics for all of our drives over time, but first let’s take a look at the new drives on the block.

The 8TB hard drives have arrived

For the last year or so we kept saying we were going to deploy 8TB drives in quantity. We did deploy 45 8TB HGST drives, but deploying those drives en masse did not make economic sense for us. Over the past quarter, 8TB drives from Seagate became available at a reasonable price, so we purchased and deployed over 2,700 in Q2, with more to come in Q3. All of these drives were deployed in Backblaze Vaults, with each vault using 900 drives: 45 drives in each of the 20 Storage Pods that form a Backblaze Vault.

Yes, we said 45 drives in each Storage Pod, so what happened to our 60-drive Storage Pods? In short, we wanted to use up the remaining stock of 45-drive Storage Pods before we started using the 60-drive pods. We have built two Backblaze Vaults using the 60-drive pods, but we filled them with 4TB and 6TB drives. The first 60-drive Storage Pod filled with 8TB drives (480TB in total) will be deployed shortly.

Hard Drive Migration – 85 Pods to 1 Vault

One of the reasons that we made the move to 8TB drives was to optimize storage density. We’ve done data migrations before, for example, from 1TB pods to 3TB and 4TB pods. These migrations were done one or two Storage Pods at a time. It was time to “up our game.” We decided to migrate from individual Storage Pods filled with HGST 2TB drives, average age 64 months, to a Backblaze Vault filled with 900 8TB drives.

Backblaze Data Migration

We identified and tagged 85 individual Storage Pods to migrate from. Yes, 85. The total amount of data to be migrated was about 6.5PB. It was a bit sad to see the 2TB HGST drives go as they have been really good over the years, but getting 4 times as much data into the same space was just too hard to resist.

The first step was to stop all data writes on the donor HGST 2TB Storage Pods. We then kicked off the migration, starting with 10 Storage Pods and adding 10 to 20 donor pods every few hours until we got to 85. The migration process is purposely slow, as we want to ensure that we can still quickly read files from the 85 donor pods so that data restores are not impacted. The process is to copy a given RAID array from a Storage Pod to a specific “Tome” in a Backblaze Vault. Once all the data in a given RAID array has been copied to a Tome, we move on to the next RAID array awaiting migration and continue the process. This happens in parallel across the 45 Tomes in a Backblaze Vault.
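
In rough pseudocode terms, the flow looks something like the sketch below. The function names, queue, and worker model are made up for illustration; this is not our actual migration tooling, just the shape of the process described above.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from queue import Queue, Empty

TOMES = 45  # one copy stream per Tome in the destination Backblaze Vault

def copy_array_to_tome(array, tome_id):
    # Stand-in for the real (deliberately rate-limited) copy routine, so that
    # reads, and therefore restores, from the donor pods are not impacted.
    time.sleep(0.01)
    print(f"copied {array} -> tome {tome_id}")

def migrate_into_tome(tome_id, work):
    """Drain RAID arrays from the shared queue into one Tome, one at a time."""
    while True:
        try:
            array = work.get_nowait()
        except Empty:
            return
        copy_array_to_tome(array, tome_id)

if __name__ == "__main__":
    work = Queue()
    for pod in range(1, 86):            # the 85 donor Storage Pods
        work.put(f"pod{pod:02d}-raid-array")
    with ThreadPoolExecutor(max_workers=TOMES) as pool:
        for tome_id in range(TOMES):
            pool.submit(migrate_into_tome, tome_id, work)
```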

We’re about 50% of the way through the migration with little trouble. We did have a Storage Pod in the Backblaze Vault go down. That didn’t stop the migration, as vaults are designed to continue to operate under such conditions, but more on that in another post.

250 Petabytes of data stored

Recently we took a look at the growth of data and the future of cloud storage. Given the explosive growth in data as a whole, it’s not surprising that Backblaze added another 50PB of customer data over the last two quarters, and that by mid-June we had passed the 250 Petabyte mark in total data stored. You can see our data storage growth below:

Backblaze Data Managed

Back in December 2015, we crossed the 200 Petabyte mark, and at that time predicted we would cross 250PB in early Q3 2016. So we’re a few weeks early. We also predicted we would cross 300PB in late 2016. Given how much data we are adding with B2, it will probably be sooner; we’ll see.

Cumulative hard drive failure rates by model

In the table below we’ve computed the annualized drive failure rate for each drive model. This is based on data from April 2013 through June 2016.

Q2 2016 Cumulative Hard Drive Failure Rates

Some people question the usefulness of the cumulative Annualized Failure Rate, usually on the grounds that drives entering or leaving during the cumulative period skew the results because they are not present for the entire period. This is one of the reasons we compute the Annualized Failure Rate using “Drive Days”. A Drive Day is only recorded if the drive is present in the system. For example, if a drive is installed on July 1st and fails on August 31st, it adds 62 drive days and 1 drive failure to the overall results. A drive can be removed from the system because it fails, or it can be removed from service after a migration, like the 2TB HGST drives we covered earlier. In either case, the drive stops adding Drive Days to the total, allowing us to compute an Annualized Failure Rate over the cumulative period based on what each drive contributed during that period.

As always, we’ve published the Q2 2016 data we used to compute these drive stats. You can find the data files along with the associated documentation on our hard drive test data page.

Which hard drives do we use?

We’ve written previously about our difficulties in getting drives from Toshiba and Western Digital. Whether it’s poor availability or an unexplained desire not to sell us drives, we don’t have many drives from either manufacturer. So we use a lot of Seagate drives and they are doing the job very nicely. The table below shows the distribution of the hard drives we are currently using in our data center.

Q2 2016 Hard Drive Distribution

Recap

The Seagate 8TB drives are here and are looking good. Sadly, we’ll be saying goodbye to the HGST 2TB drives, but we need the space. We’ll miss those drives; they were rock stars for us. The 4TB Seagate drives are our workhorse drives today, and their 2.8% annualized failure rate is more than acceptable for us. That low failure rate roughly translates to an average of one drive failure per Storage Pod per year. Over the next few months, expect more on our migrations, a look at a day in the life of a data center tech, and an update of the “bathtub” curve, i.e., hard drive failure over time.

The post Hard Drive Stats for Q2 2016 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Plan Bee

Post Syndicated from Alex Bate original https://www.raspberrypi.org/blog/plan-bee/

Bees are important. I find myself saying this a lot and, slowly but surely, the media seems to be coming to this realisation too. The plight of the bee is finally being brought to our attention with increasing urgency.

A colony of bees make honey

Welcome to the house of buzz.

In the UK, bee colonies are suffering mass losses. Due to the use of bee-killing fertilisers and pesticides within the farming industry, the decline of pollen-rich plants, the destruction of hives by mites, and Colony Collapse Disorder (CCD), bees are in decline at a worrying pace.

Bee Collision

When you find the perfect GIF…

One hint of a silver lining is that increasing awareness of the crisis has led to a rise in the number of beekeeping hobbyists. As getting your hands on some bees is now as simple as ordering a box from the internet, keeping bees in your garden is a much less daunting venture than it once was. 

Taking this one step further, beekeepers are now using tech to monitor the conditions of their bees, improving conditions for their buzzy workforce while also recording data which can then feed into studies attempting to lessen the decline of the bee.

WDLabs recently donated a PiDrive to the Honey Bee Gardens Project in order to help beekeeper David Ammons and computer programmer Graham Total create The Hive Project, an electric beehive that monitors real-time colony data.

Electric Bee Hive

The setup records colony size, honey production, and bee health to help combat CCD.

Colony Collapse Disorder (CCD) is decidedly mysterious. Colonies hit by the disease seem to simply disappear. The hive itself often remains completely intact, full of honey at the perfect temperature, but… no bees. Dead or alive, the bees are nowhere to be found.

To try to combat this phenomenon, the electric hive offers 24/7 video coverage of the inner hive, while tracking the conditions of the hive population.

Bee bringing pollen into the hive

This is from the first live day of our instrumented beehive. This was the only bee we spotted all day that brought any pollen into the hive.

Ultimately, the team aim for the data to be crowdsourced, enabling researchers and keepers to gain the valuable information needed to fight CCD via a network of electric hives. While many people blame the aforementioned pollen decline and chemical influence for the rise of CCD, without the empirical information gathered from builds such as The Hive Project, the source of the problem, and therefore the solution, can’t be found.

Bee making honey

It has been brought to our attention that the picture here previously was of a wasp doing bee things. We have swapped it out for a bee.

Ammons and Total researched existing projects around the use of digital tech within beekeeping, and they soon understood that a broad analysis of bee conditions didn’t exist. While many projects were tracking hive weight, temperature, or bee population, there was no system for integrating all of this data collection in one place. This realisation spurred them on further.

“We couldn’t find any one project that took a broad overview of the whole area. Even if we don’t end up being the people who implement it, we intend to create a plan for a networked system of low-cost monitors that will assist both research and commercial beekeeping.”

With their mission statement firmly in place, the duo looked toward the Raspberry Pi as the brain of their colony. The device is small enough to fit within the hive without disruption, yet powerful enough to monitor multiple factors while also using the Pi Camera Module to record video to the 314GB storage of the Western Digital PiDrive.
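
As a rough sketch of what the video-recording piece might look like (assuming the standard picamera Python library, with the PiDrive mounted at a made-up path; the mount point, directory, and segment length are illustrative, not taken from The Hive Project):

```python
import os
from datetime import datetime
from picamera import PiCamera

# Record hive footage in hour-long H.264 segments onto the PiDrive.
STORAGE = "/mnt/pidrive/hive-video"        # hypothetical mount point for the PiDrive
os.makedirs(STORAGE, exist_ok=True)

camera = PiCamera(resolution=(1296, 730), framerate=24)
while True:
    segment = datetime.now().strftime("%Y-%m-%d_%H-%M.h264")
    camera.start_recording(os.path.join(STORAGE, segment))
    camera.wait_recording(3600)            # record (and surface any errors) for an hour
    camera.stop_recording()
```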

Data recorded by The Hive Project is vital to the survival of the bee, the growth of colony population, and an understanding of the conditions of the hive in changing climates. These are issues which affect us all. The honey bee is responsible for approximately 80% of pollination in the UK, and is essential to biodiversity. Here, I should hand over to a ‘real’ bee to explain more about the importance of bee-ing…

Bee Movie – Devastating Consequences – HD

Barry doesn’t understand why all the bees aren’t happy. Then, Vanessa shows Barry the devastating consequences of the bees being triumphant in their lawsuit against the human race.

The post Plan Bee appeared first on Raspberry Pi.