Tag Archives: Cloud Storage

Announcing Facebook Photo and Video Transfers Direct to Backblaze B2 Cloud Storage

Post Syndicated from Jeremy Milk original https://www.backblaze.com/blog/facebook-photo-video-transfers-direct-to-cloud-storage/

Facebook pointing to Backblaze

Perhaps I’m dating myself when I say that I’ve been using Facebook for a very long time. So long that the platform is home to many precious photos and videos that I couldn’t imagine losing. And even though they’re mostly shared to Facebook from my phone or other apps, some aren’t. So I’ve periodically downloaded my Facebook albums to my Mac, which I’ve of course set to automatically back up with Backblaze, to ensure they’re safely archived.

And while it’s good to know how to download and back up your social media profile, you might be excited to learn that it’s just become a lot easier: Facebook has integrated Backblaze B2 Cloud Storage directly as a data transfer destination for your photos and videos. This means you can now migrate or copy years of memories in a matter of clicks.

What Data Transfer Means for You

If you use Facebook and want to exercise even greater control over the media you’ve posted there, you’ll find that this seamless integration enables:

  • Personal safeguarding of images and videos in Backblaze.
  • Enhanced file sharing and access control options.
  • Ability to organize, modify, and collaborate on content.

How to Move Your Data to Backblaze B2

Current Backblaze B2 customers can start data transfers within Facebook via Settings & Privacy > Settings > Your Facebook Information > Transfer a Copy of Your Photos or Videos > Choose Destination > Backblaze.

      1. You can find Settings & Privacy listed in the options when you click your profile icon.
      2. Under Settings & Privacy, select Settings.
      3. Go to Your Facebook Information and select “View” next to Transfer a Copy of Your Photos or Videos.

    Transfer a Copy of Your Photos or Videos

      4. Under Choose Destination, simply select Backblaze and your data transfer will begin.

    Transfer a Copy of Your Photos or Videos to Backblaze

If you don’t have a Backblaze B2 account, you can create one here. You’ll need a Key ID and an Application Key when you select Backblaze.
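
If you prefer scripting to clicking, here’s a minimal sketch of creating that key pair with the B2 Native API and Python’s requests library. The key name and capabilities shown are illustrative assumptions, and you’d substitute your own account-level key ID and secret:

    import requests

    ACCOUNT_KEY_ID = "your-account-key-id"          # placeholder
    ACCOUNT_KEY_SECRET = "your-account-key-secret"  # placeholder

    # Log in once to get a session token plus the account's API base URL.
    auth = requests.get(
        "https://api.backblazeb2.com/b2api/v2/b2_authorize_account",
        auth=(ACCOUNT_KEY_ID, ACCOUNT_KEY_SECRET),
    ).json()

    # Create a key scoped to listing and writing files (capabilities are illustrative).
    resp = requests.post(
        f"{auth['apiUrl']}/b2api/v2/b2_create_key",
        headers={"Authorization": auth["authorizationToken"]},
        json={
            "accountId": auth["accountId"],
            "keyName": "facebook-transfer",  # illustrative name
            "capabilities": ["listBuckets", "listFiles", "writeFiles"],
        },
    ).json()

    print("Key ID:", resp["applicationKeyId"])
    print("Application Key:", resp["applicationKey"])  # shown only once; save it

Note that the application key secret is returned only at creation time, so store it somewhere safe before heading back to Facebook.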

The Data Transfer Project and B2 Cloud Storage

The secure, encrypted data transfer service is based on code Facebook developed through the open-source Data Transfer Project (and you all know we love open-source projects, from our original Storage Pod design to Reed-Solomon erasure coding). Data routed to your B2 Cloud Storage account enjoys our standard $5/TB per month pricing with the standard 10GB of free capacity.

Our Co-Founder and CEO, Gleb Budman, noted that this new integration harkens back to our roots: “We’ve been helping people safely store their photos and videos in our cloud for almost as long as Facebook has been providing the means to post content. For people on Facebook who want more choice in hosting their data outside the platform, we’re happy to make our cloud a seamlessly available destination.”

The post Announcing Facebook Photo and Video Transfers Direct to Backblaze B2 Cloud Storage appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze Hard Drive Stats Q3 2020

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/backblaze-hard-drive-stats-q3-2020/

As of September 30, 2020, Backblaze had 153,727 spinning hard drives in our cloud storage ecosystem spread across four data centers. Of that number, there were 2,780 boot drives and 150,947 data drives. This review looks at the Q3 2020 and lifetime hard drive failure rates of the data drive models currently in operation in our data centers and provides a handful of insights and observations along the way. As always, we look forward to your comments.

Quarterly Hard Drive Failure Stats for Q3 2020

At the end of Q3 2020, Backblaze was using 150,947 hard drives to store customer data. For our evaluation we remove from consideration those drive models for which we did not have at least 60 drives (more on that later). This leaves us with 150,757 hard drives in our review. The table below covers what happened in Q3 2020.

Observations on the Q3 Stats

There are several models with zero drive failures in the quarter. That’s great, but when we dig in a little we get different stories for each of the drives.

  • The 18TB Seagate model (ST18000NM000J) has 300 drive days, and the drives have been in service for about 12 days on average. There were no out-of-the-box failures, which is a good start, but that’s all you can say.
  • The 16TB Seagate model (ST16000NM001G) has 5,428 drive days, which is low, but these drives have been around for nearly 10 months on average. Still, I wouldn’t try to draw any conclusions yet, but a quarter or two more like this and we might have something to say.
  • The 4TB Toshiba model (MD04ABA400V) has only 9,108 drive days, but they have been putting up zeros for seven quarters straight. That has to count for something.
  • The 14TB Seagate model (ST14000NM001G) has 21,120 drive days across 2,400 drives, but those drives have been operational for less than one month. Next quarter will give us a better picture.
  • The 4TB HGST (model: HMS5C4040ALE640) has 274,923 drive days with no failures this quarter. Everything about this drive is awesome, but hold on before you run out and buy one. Why? You’re probably not going to get a new one, and if you do, it will likely be at least three years old, as HGST/WDC hasn’t made these drives in at least that long. If someone from HGST/WDC can confirm or deny that for us in the comments, that would be great. There are stories dating back to 2016 where folks tried to order this drive and got a refurbished drive instead. If you want to give a refurbished drive a try, that’s fine, but that’s not what our numbers are based on.

The Q3 2020 annualized failure rate (AFR) of 0.89% is slightly higher than last quarter at 0.81%, but significantly lower than the 2.07% from a year ago. Even with the lower drive failure rates, our data center techs are not bored. In this quarter they added nearly 11,000 new drives totaling over 150PB of storage, all while operating under strict Covid-19 protocols. We’ll cover how they did that in a future post, but let’s just say they were busy.
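
For anyone new to these reports, the AFR figures above and throughout are computed by annualizing drive days. A quick sketch of the formula in Python (the three-failure example at the end is hypothetical):

    def annualized_failure_rate(drive_days: float, failures: int) -> float:
        """AFR as a percentage: failures / (drive_days / 365) * 100."""
        drive_years = drive_days / 365.0
        return (failures / drive_years) * 100.0

    # The 4TB HGST model above: 274,923 drive days and zero failures -> 0.00%.
    print(annualized_failure_rate(274_923, 0))

    # Hypothetically, three failures over those same drive days would be ~0.40%.
    print(round(annualized_failure_rate(274_923, 3), 2))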

The Island of Misfit Drives

There were 190 drives (150,947 minus 150,757) that were not included in the Q3 2020 Quarterly Chart above because we did not have at least 60 drives of a given model. Here’s a breakdown:

Nearly all of these drives were used as replacement drives. This happens when a given drive model is no longer available for purchase, but we have many in operation and we need a replacement. For example, we still have three WDC 6TB drives in use; they are installed in three different Storage Pods, along with 6TB drives from Seagate and HGST. Most of these drives were new when they were installed, but sometimes we reuse a drive that was removed from service, typically via a migration. Such drives are, of course, reformatted, wiped, and then must pass our qualification process to be reinstalled.

There are two “new” drives on our list. These are drives that are qualified for use in our data centers, but we haven’t deployed in quantity yet. In the case of the 10TB HGST drive, the availability and qualification of multiple 12TB models has reduced the likelihood that we would use more of this drive model. The 16TB Toshiba drive model is more likely to be deployed going forward as we get ready to deploy the next wave of big drives.

The Big Drives Are Here

When we first started collecting hard drive data back in 2013, a big drive was 4TB, with 5TB and 6TB drives just coming to market. Today, we’ll define big drives as 14TB, 16TB, and 18TB drives. The table below summarizes our current utilization of these drives.

The total of 19,878 represents 13.2% of our operational data drives. While most of these are the 14TB Toshiba drives, all of the above have been qualified for use in our data centers.

For all of the drive models besides the Toshiba 14TB drive, the number of drive days is still too small to conclude anything, although the Seagate 14TB model, the Toshiba 16TB model, and the Seagate 18TB model have experienced no failures to date.

We will continue to add these large drives over the coming quarters and track them along the way. As of Q3 2020, the lifetime AFR for this group of drives is 1.04%, which as we’ll see, is below the lifetime AFR for all of the drive models in operation.

Lifetime Hard Drive Failure Rates

The table below shows the lifetime AFR for the hard drive models we had in service as of September 30, 2020. All of the drive models listed were in operation during this timeframe.

The lifetime AFR as of Q3 2020 was 1.58%, the lowest since we started keeping track in 2013. That is down from 1.73% one year ago, and down from 1.64% last quarter.

We added back the average age column as “Avg Age.” This is the average age, in months, of the drives used to compute the data in the table, based on the amount of time they have been in operation. One thing to remember is that our environment is very dynamic, with drives being added, migrated, and retired on a regular basis, and this can affect the average age. For example, retiring a Storage Pod full of mostly older drives could lower the average age of the remaining drives of that model, even as those remaining drives themselves get older.

Looking at the average age, the 6TB Seagate drives are the oldest cohort, averaging nearly five and a half years of service each. These drives have actually gotten better over the last couple of years and are aging well, with a current lifetime AFR of 1.0%.

If you’d like to learn more, join us for a webinar Q&A with the author of Hard Drive Stats, Andy Klein, on October 22, 10:00 a.m. PT.

The Hard Drive Stats Data

The complete data set used to create the information used in this review is available on our Hard Drive Test Data webpage. You can download and use this data for free for your own purpose. All we ask are three things: 1) You cite Backblaze as the source if you use the data, 2) You accept that you are solely responsible for how you use the data, and 3) You do not sell this data to anyone—it is free.

If you just want the summarized data used to create the tables and charts in this blog post, you can download the ZIP file containing the MS Excel spreadsheet.

Good luck and let us know if you find anything interesting.

The post Backblaze Hard Drive Stats Q3 2020 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Enhanced Ransomware Protection: Announcing Data Immutability With Backblaze B2 and Veeam

Post Syndicated from Natasha Rabinov original https://www.backblaze.com/blog/object-lock-data-immutability/

Protecting businesses and organizations from ransomware has become one of the most, if not the most, essential responsibilities for IT directors and CIOs. Ransomware attacks are on the rise, occurring every 14 seconds, but you likely already know that. That’s why a top requested feature for Backblaze’s S3 Compatible APIs is Veeam® immutability—to increase your organization’s protection from ransomware and malicious attacks.

We heard you and are happy to announce that Backblaze B2 Cloud Storage now supports data immutability for Veeam backups. It is available immediately.

The solution, which earned a Veeam Ready-Object with Immutability qualification, means a good, clean backup is just clicks away when reliable recovery is needed.

It is the only public cloud storage alternative to Amazon S3 to earn Veeam’s certifications for both compatibility and immutability. And it offers this at a fraction of the cost.

“I am happy to see Backblaze leading the way here as the first cloud storage vendor outside of AWS to give us this feature. It will hit our labs soon, and we’re eager to test this to be able to deploy it in production.”—Didier Van Hoye, Veeam Vanguard and Technology Strategist

Using Veeam Backup & Replication™, you can now simply check a box and make recent backups immutable for a specified period of time. Once that option is selected, nobody can modify, encrypt, tamper with, or delete your protected data. Recovering from ransomware is as simple as restoring from your clean, safe backup.

Freedom From Tape, Wasted Resources, and Concern

Prevention is the most pragmatic ransomware protection to implement. Ensuring that backups are up-to-date, off-site, and protected with a 3-2-1 strategy is the industry standard for this approach. But up to now, this meant that IT directors who wanted to create truly air-gapped backups were often shuttling tapes off-site—adding time, the necessity for on-site infrastructure, and the risk of data loss in transit to the process.

With object lock functionality, there is no longer a need for tapes or a Veeam virtual tape library. You can now create virtual air-gapped backups directly in the capacity tier of a Scale-out Backup Repository (SOBR). In doing so, data is Write Once, Read Many (WORM) protected, meaning that even during the locked period, data can be restored on demand. Once the lock expires, data can safely be modified or deleted as needed.

Some organizations have already been using immutability with Veeam and Amazon S3, a storage option more complex and expensive than needed for their backups. Now, Backblaze B2’s affordable pricing and clean functionality mean that you can easily opt in to our service and save up to 75% on your storage invoice. And with our Cloud to Cloud Migration offers, it’s easier than ever to achieve these savings.

In either scenario, there’s an opportunity to enhance data protection while freeing up financial and personnel resources for other projects.

Backblaze B2 customer Alex Acosta, Senior Security Engineer at Gladstone Institutes—an independent life science research organization now focused on fighting COVID-19—explained that immutability can help his organization maintain healthy operations. “Immutability reduces the chance of data loss,” he noted, “so our researchers can focus on what they do best: transformative scientific research.”

Enabling Immutability

How to Set Object Lock:

Data immutability begins by creating a bucket that has object lock enabled. Then within your SOBR, you can simply check a box to make recent backups immutable and specify a period of time.
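
Veeam handles all of this for you once the box is checked, but if you’re curious what that click does under the hood, here is a rough sketch of the equivalent calls through our S3 Compatible API using boto3. The endpoint, bucket name, file name, and 30-day retention period are placeholders, not recommendations:

    import boto3
    from datetime import datetime, timedelta, timezone

    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.us-west-002.backblazeb2.com",  # your region's endpoint
        aws_access_key_id="your-key-id",
        aws_secret_access_key="your-application-key",
    )

    # Object lock must be switched on when the bucket is created.
    s3.create_bucket(Bucket="veeam-backups-example",
                     ObjectLockEnabledForBucket=True)

    # Store a backup locked for 30 days: until the retain-until date passes,
    # the object cannot be modified, encrypted over, or deleted.
    with open("backup-2020-10-19.vbk", "rb") as backup:
        s3.put_object(
            Bucket="veeam-backups-example",
            Key="backup-2020-10-19.vbk",
            Body=backup,
            ObjectLockMode="COMPLIANCE",
            ObjectLockRetainUntilDate=datetime.now(timezone.utc) + timedelta(days=30),
        )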

What Happens When Object Lock Is Set:

The true nature of immutability is to prevent modification, encryption, or deletion of protected data. As such, selecting object lock will ensure that no one can:

  • Manually remove backups from Capacity Tier.
  • Remove data using an alternate retention policy.
  • Remove data using lifecycle rules.
  • Remove data via tech support.
  • Remove data via the “Remove deleted items data after” option in Veeam.

Once the lock period expires, data can be modified or deleted as needed.

Getting Started Today

With immutability set on critical data, administrators navigating a ransomware attack can quickly restore uninfected data from their immutable Backblaze backups, deploy them, and return to business as usual without painful interruption or expense.

Get started with improved ransomware protection today. If you already have Veeam, you can create a Backblaze B2 account to get started. It’s free, easy, and quick, and you can begin protecting your data right away.

The post Enhanced Ransomware Protection: Announcing Data Immutability With Backblaze B2 and Veeam appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

How iconik Built a Multi-Cloud SaaS Solution

Post Syndicated from Tim Child original https://www.backblaze.com/blog/how-iconik-built-a-multi-cloud-saas-solution/

This spotlight series calls attention to developers who are creating inspiring, innovative, and functional solutions with cloud storage. This month, we asked Tim Child, Co-founder of iconik, to explain the development of their cloud-based content management and collaboration solution.

How iconik Built a Multi-Cloud SaaS

The Challenge:

Back when we started designing iconik, we knew that we wanted to have a media management system that was hugely scalable, beyond anything our experienced team had seen before.

With a combined 50 years in the space, we had worked with many customer systems and not one of them was identical. Each customer had different demands for what systems should offer—whether it was storage, CPU, or database—and these demands changed over the lifecycle of the customer’s needs. Change was the only constant. And we knew that systems that couldn’t evolve and scale couldn’t keep up.

Identifying the Needs:

We quickly realized that we would need to meet constantly changing demands on individual parts of the system and that we needed to be able to scale up and down capabilities at a granular level. We wanted to have thousands of customers with each one potentially having hundreds of thousands, if not millions, of assets on the same instance, leading to the potential for billions of files being managed. We also wanted to have the flexibility to run private instances for customers if they so demanded.

With these needs in mind, we knew our service had to be architected and built to run in the cloud, and that we would run the business as a SaaS solution.

Mapping Our Architecture

Upon identifying this challenge, we settled on using a microservices architecture with each functional unit broken up and then run in Docker containers. This provided the granularity around functions that we knew customers would need. This current map of iconik’s architecture is nearly identical to what we planned from the start.

To manage these units while also providing for the scaling we sought, the architecture required an orchestration layer. We decided upon Kubernetes, as it was:

  • A proven technology with a large, influential community supporting it.
  • A well maintained open-source orchestration platform.
  • A system that functionally supported what we needed to do while also providing the ability to automatically scale, distribute, and handle faults for all of our containers.

During this development process, we also invested time in working with leading cloud IaaS and PaaS providers, in particular both Amazon AWS and Google Cloud, to discover the right solutions for production systems, AI, transcode, CDN, Cloud Functions, and compute.

Choosing a Multi-Cloud Approach

Based upon the learnings from working with a variety of cloud providers, we decided that our strategy would be to avoid being locked into any one cloud vendor, and instead pursue a multi-cloud approach—taking the best from each and using it to our customers’ advantage.

As we got closer to launching iconik.io in 2017, we started looking at where to run our production systems, and Google Cloud was clearly the winner in terms of their support for Kubernetes and their history with the project.

Looking at the larger picture, Google Cloud also had:

  • A world-class network with 93+ points of presence across 64 global regions.
  • BigQuery, with its on-demand pricing, advanced scalability features, and ease of use.
  • Machine learning and AI tools that we had been involved in beta testing before they were built in, and which would provide an important element in our offering to give deep insights around media.
  • APIs that were rock solid.

These important factors became the deciding points on launching with Google Cloud. But, moving forward, we knew that our architecture would not be difficult to shift to another service if necessary as there was very little lock-in for these services. In fact, the flexibility provided allows us to run dedicated deployments for customers on their cloud platform of choice and even within their own virtual private cloud.

Offering Freedom of Choice for Storage

With our multi-cloud approach in mind, we wanted to bring the same flexibility we developed in production systems to our storage offering. Google Cloud Storage was a natural choice because it was native to our production systems platform. From there, we grew options in line with the best fit for our customers, either based on their demands or based on what the vendor could offer.

From the start, we supported Amazon S3 and quickly brought Backblaze B2 Cloud Storage on board. We also allowed our customers to use their own buckets so they could be truly in charge of their files. Throughout, we were guided by the search for maximum scalability and the flexibility to change on the fly.

While a number of iconik customers use B2 Cloud Storage or Amazon S3 as their only storage solution, many also take a multiple vendor approach because it can best meet their needs either in terms of risk management, delivery of files, or cost management.

Credit: iconik. Learn more in their Q2 2020 Media Stats report.

As we have grown, our multi-cloud approach has allowed us to onboard more services from Amazon—including AI, transcode, CDN, Cloud Functions, and compute for our own infrastructure. In the future, we intend to do the same with Azure and with IBM. We encourage the same for our customers as we allow them to mix and match AWS, Backblaze, GCS, IBM, and Microsoft Azure to match their strategy and needs.

Reaping the Benefits of a Multi-Cloud Solution

To date, our cloud-agnostic approach to building iconik has paid off.

  • This year, when iconik’s asset count increased by 293% to over 28M assets, there was no impact on performance.
  • As new technology has become available, we have been able to improve a single section of our architecture without impacting other parts.
  • By not limiting cloud services that can be used in iconik, we have been able to establish many rewarding partnerships and accommodate customers who want to keep the cloud services they already use.

Hopefully our story sheds some light for others who are venturing out to build a SaaS of their own. We wish you luck!

The post How iconik Built a Multi-Cloud SaaS Solution appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Oslo by Streamlabs, Collaboration for Creatives

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/oslo-journey-to-market/

Backblaze Partner Profile - Streamlabs Oslo

With a mission of empowering creatives, the team at Streamlabs was driven to follow up their success in live streaming by looking beyond the stream—and so, Oslo was born.

Oslo, generally available as of today, is the place where solo YouTubers and small video editing teams can upload, review, collaborate, and share videos. But, the road from Streamlabs to Oslo wasn’t a straight line. The intrepid team from Streamlabs had to muddle through painfully truthful market research, culture shock, an affordability dilemma, and a pandemic to get Oslo into the hands of their customers. Let’s take a look at how they did it.

Market Research and the Road to Oslo

In September 2019, Streamlabs was acquired by Logitech. Yes, that Logitech, the one who makes millions of keyboards and mice, and all kinds of equipment for gamers. That Logitech acquired a streaming company. Bold, different, and yet it made sense to nearly everyone, especially anyone in the video gaming industry. Gamers rely on Logitech for a ton of hardware, and many of them rely on Streamlabs to stream their gameplay on Twitch, YouTube, and Facebook.

About the same time, Ashray Urs, Head of Product at Streamlabs, and his team were in the middle of performing market research and initial design work on their next product: video editing software for the masses. And what they were learning from the market research was disconcerting. While their target audience thought it would be awesome if Streamlabs built a video editor, the market was already full of them and nearly everybody already had one, or two, or even three editing tools on hand. In addition, the list of requirements to build a video editor was daunting, especially for Ashray and his small team of developers.

The future of Oslo was looking bleak when a fork in the road appeared. While video editing wasn’t a real pain point, many solo creators and small video editing teams were challenged and often overwhelmed by a key function in any project: collaboration. Many of these creators spent more time sending emails, uploading and downloading files, keeping track of versions and updates, and managing storage instead of being creative. Existing video collaboration tools were expensive, complex, and really meant for larger teams. Taking all this in, Ashray and his team decided on a different road to Oslo. They would build a highly affordable, yet powerful, video collaboration and sharing service.

Oslo collaboration view screenshot

Culture Shock: Hardware Versus Software

As the Oslo project moved forward, a different challenge emerged for Ashray: communicating their plans and processes for the Oslo service to their hardware-oriented parent company, Logitech.

For example, each thought quite differently about the product release process. Oslo, as a SaaS service, could, if desired, update their product daily for all their customers, and they could add new features and new upsells in weeks or maybe months. Logitech’s production process, on the other hand, was oriented toward having everything ready so they could make a million units of a keyboard. With the added challenge of not having an “update now” button on those keyboards.

Logitech was not ignorant of software, having created and shipped device drivers, software tools, and other utilities. But to them, the Oslo release process felt like a product launch on steroids. This is the part in the story where the bigger company tells the little company they have to do things “our” way. And it would have been stereotypically “corporate” for Logitech to say no to Oslo, then bury it in the backyard and move on. Instead, they gave the project the green light and fully supported Ashray and his team as they moved forward.

Oslo - New Channel - Daily Vlogs

Backblaze B2 Powers Affordability

As the feature requirements for Oslo began to coalesce, attention turned to how Oslo would deliver those features at an affordable price. After all, solo YouTubers and small video teams were not known to have piles of money to spend on tools. The question was settled when they chose Backblaze B2 Cloud Storage as their storage vendor.

To start, Backblaze enabled Oslo to meet the pricing targets they had determined were optimal for their market. Choosing any of the other leading cloud storage vendors would have doubled or even tripled the subscription price of Oslo. That would have made Oslo a non-starter for much of its target audience.

On the cost side, many of the other cloud storage providers have complex or hidden terms, like charging for files you delete if you don’t keep them around long enough—a 30-day minimum for some vendors, a 90-day minimum for others. Ashray had no desire to explain to customers that they had to pay extra for deleted files, nor did he want to explain to his boss why 20% of the cloud storage costs for the Oslo service were for deleted files. With Backblaze he didn’t have to do either, as each day Oslo’s data storage charges are based on the files they currently have stored, not on files they deleted 30, 60, or even 89 days ago.

On the features side, the Backblaze B2 Native APIs enabled Oslo to implement their special upload links feature, which allows collaborators to add files directly into a specific project. As the project editor, you can send upload links to collaborators that they can use to upload files, as sketched below. The links can be time-based—e.g. good for 24 hours—and password protected, if desired.
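
Oslo’s exact implementation isn’t public, but the basic mechanics are available to any developer: the b2_get_upload_url call returns an upload URL plus an authorization token that is good for up to 24 hours and only permits uploads into that one bucket. A minimal sketch, with placeholder credentials and bucket ID:

    import requests

    auth = requests.get(
        "https://api.backblazeb2.com/b2api/v2/b2_authorize_account",
        auth=("your-key-id", "your-application-key"),
    ).json()

    upload = requests.post(
        f"{auth['apiUrl']}/b2api/v2/b2_get_upload_url",
        headers={"Authorization": auth["authorizationToken"]},
        json={"bucketId": "your-project-bucket-id"},  # placeholder bucket ID
    ).json()

    # uploadUrl plus authorizationToken together let the holder upload files
    # into this one bucket, and do nothing else, for up to 24 hours.
    print(upload["uploadUrl"])
    print(upload["authorizationToken"])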

Travel Recap video image collage

New Product Development in a Pandemic

About the time the Oslo team was ready to start development, they were sent home as their office closed due to the Covid-19 pandemic. The whiteboards full of flow charts, UI diagrams, potential issues, and more essential information were locked away. Ad hoc discussions and decisions from hallway encounters, lunchroom conversations, and cups of tea with colleagues stopped.

The first few days were eerie and uncertain, but like many other technology companies they began to get used to their new work environment. Yes, they had the advantage of being technologically capable as meeting apps, collaboration services, and messaging systems were well within their grasp, but they were still human. While it took some time to get into the work from home groove, they were able to develop, QA, run a beta program, and deliver Oslo, without a single person stepping back in the office. Impressive.

Oslo 1.0

Every project, software, hardware, whatever, has some twists and turns as you go through the process. Oslo could have been just another video editing service, could have cost three times as much, or could have been one more cancelled project due to Covid-19. Instead, the Oslo team delivered YouTubers and the like an affordable video collaboration and sharing service with lots of cool features aimed at having them spend less time being project managers and more time being creators.

Nice job, we’re glad Backblaze could help. You can get the full scoop about Oslo at oslo.io.

The post Oslo by Streamlabs, Collaboration for Creatives appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Simplifying Complex: A Multi-Cloud Approach to Scaling Production

Post Syndicated from Lora Maslenitsyna original https://www.backblaze.com/blog/simplifying-complex-a-multi-cloud-approach-to-scaling-production/

How do you grow your production process without missing a beat as you evolve over 20 years from a single magazine to a multichannel media powerhouse? Since there are some cool learnings for many of you, here’s a summary of our recent case study deep dive into Verizon’s Complex Networks.

Founders Marc Eckō of Eckō Unlimited and Rich Antoniello started Complex in 2002 as a bi-monthly print magazine. Over almost 20 years, they’ve grown to produce nearly 50 episodic series in addition to monetizing more than 100 websites. They have a huge audience reaching 21 billion lifetime views and 52.2 million YouTube subscribers with premium distributors including Netflix, Hulu, Corus, Facebook, Snap, MSG, Fuse, Pluto TV, Roku, and more. Their team of creatives produce new content constantly—covering everything from music to movies, sports to video games, and fashion to food—which means that production workflows are the pulse of what they do.

Looking for Data Storage During Constant Production

In 2016, the Complex production team was expanding rapidly, with recent acquisitions bringing on multiple new groups that all had their own workflows. They used a TerraBlock by Facilis and a few “homebrewed solutions,” but there was no unified, central storage location, and they were starting to run out of space. As many organizations with tons of data and no space do, they turned to Amazon Glacier.

There were problems:

  • Visibility: They started out with Glacier Vault, but with countless hours of good content, they constantly needed to access their archive—which required accessing the whole thing just to see what was in there.
  • Accessibility: An upgrade to S3 Glacier made their assets more visible, but retrieving those assets still involved multiple steps, various tools, and long retrieval times—sometimes ranging up to 12 hours.
  • Complexity: S3 has multiple storage classes, each with its own associated costs, fees, and wait times.
  • Expense: The worst of the issue was that this glacial process didn’t just slow down production, it also incurred huge expenses through egress charges.

Worse still, staff would sometimes wade through this process only to realize that the content sent back to them wasn’t what they were looking for. The main issue for the team was that they struggled to see all of their storage systems clearly.

Organizing Storage With Transparent Asset Management

They resolved to fix the problem once and for all by investing in three areas:

  • Empower their team to collaborate and share at the speed of their work.
  • Identify tools that would scale with their team instantaneously.
  • Incorporate off-site storage that mimicked their on-site solutions’ scaling and simplicity.

To remedy their first issue, they set up a centralized SAN—a Quantum StorNext—that allowed the entire team to work on projects simultaneously.

Second, they found iconik, which moved them away from the inflexible on-prem integration philosophies of legacy MAM systems. Even better, they could test-run iconik before committing.

Finally, because iconik is integrated with Backblaze B2 Cloud Storage, the team at Complex decided to experiment with a B2 Bucket. Backblaze B2’s pay-as-you-go service with no upload fees, no deletion fees, and no minimum data size requirements fit the philosophy of their approach.

There was one problem: It was easy enough to point new projects toward Backblaze B2, but they still had petabytes of data they’d need to move to fully enable this new workflow.

Setting Up Active Archive Storage

The post and studio operations team and the media infrastructure and technology team estimated that they would have to copy at least 550TB of their 1.5PB of data from cold storage for future distribution purposes in 2020. Backblaze partners were able to help solve the problem.

Flexify.IO uses cloud internet connections to achieve significantly faster migrations for large data transfers. Pairing Flexify with a bare-metal cloud services platform to set up metadata ingest servers in the cloud, Complex was able to migrate to B2 Cloud Storage directly with their files and file structure intact. This allowed them to avoid the need to pull 550TB of assets into local storage just to ingest assets and make proxy files.

More Creative Possibilities With a Flexible Workflow

Now, Complex Networks is free to focus on creating new content with lightning-fast distribution. Their creative team can quickly access 550TB of archived content via proxies that are organized and scannable in iconik. They can retrieve entire projects and begin fresh production without any delays. “Hot Ones,” “Sneaker Shopping,” and “The Burger Show”—the content their customers like to consume, literally and figuratively, is flowing.

Is your business facing a similar challenge?

The post Simplifying Complex: A Multi-Cloud Approach to Scaling Production appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

The Path to S3 Compatible APIs: The Authentication Challenge

Post Syndicated from Malay Shah original https://www.backblaze.com/blog/the-path-to-s3-compatible-apis-the-authentication-challenge/

We launched our Backblaze S3 Compatible APIs in May of 2020 and released them for GA in July. After a launch, it’s easy to forget about the hard work that made it a reality. With that in mind, we’ve asked Malay Shah, our Senior Software Engineering Manager, to explain one of the challenges he found intriguing in the process. If you’re interested in developing your own APIs, or just curious about how ours have come to be, we think you’ll find Malay’s perspective interesting.

When we started building our Backblaze S3 Compatible APIs, we already had Backblaze B2 Cloud Storage, so the hard work to create a durable, scalable, and highly performant object store was already done. And B2 was already conceptually similar to S3, so the task seemed far from impossible. That’s not to say that it was easy or without any challenges. There were enough differences between the B2 Native APIs and the S3 API to make the project interesting, and one of those is authentication. In this post, I’m going to walk you through how we approached the challenge of authentication in our development of Backblaze S3 Compatible APIs.

The Challenge of Authentication: S3 vs. B2 Cloud Storage

B2 Cloud Storage’s approach to authentication is login/session based, where the API key ID and secret are used to log in and obtain a session ID, which is then provided on each subsequent request. S3 requires each individual request to be signed using the key ID and secret.

Our login/session approach does not require storing the API key secret on our end, only a hash of it. As a result, any compromise of our database would not allow hackers to impersonate customers and access their data. However, this approach is susceptible to “man-in-the-middle” attacks. Capturing the login request (API call to b2_authorize_account) would reveal the API key ID and secret to the attacker; capturing subsequent requests would reveal the session ID, which is valid for 24 hours. Either of these would allow a hacker to impersonate a customer, which is clearly not a good thing. That said, our system design and basic data safety practices protect users: our APIs are only available over HTTPS, and HTTPS, in conjunction with a well managed trusted certificate list, mitigates the likelihood of a “man-in-the-middle” attack.

Amazon’s approach with S3 requires their backend to store the secret because authenticating a request requires the backend to replicate the request signing process for each call. As a result, request signing is much less susceptible to a “man-in-the-middle” attack. The most any bad actor could do is replay the request; a hacker would not be able to impersonate the customer and make other requests. However, compromising the systems that store the API key secret would allow impersonation of the customer. This risk is typically mitigated by encrypting the API key secret and storing that key somewhere else, thus requiring multiple systems to be compromised.

Both approaches are common patterns for authentication, each with their own strengths and risks.
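
To make the request signing model concrete, here is a condensed sketch of the Signature Version 4 computation as Amazon documents it (canonical-request construction is elided for brevity). Note that the HMAC chain starts from the API key secret itself, which is exactly why a verifying server must be able to recover the secret:

    import hashlib
    import hmac

    def _hmac(key: bytes, msg: str) -> bytes:
        return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

    def sigv4_signature(secret: str, date: str, region: str, service: str,
                        string_to_sign: str) -> str:
        # Derive the signing key via a chain of HMACs, then sign the string
        # to sign (which embeds a SHA-256 hash of the canonical request).
        k_date = _hmac(("AWS4" + secret).encode("utf-8"), date)  # e.g. "20200701"
        k_region = _hmac(k_date, region)                          # e.g. "us-west-002"
        k_service = _hmac(k_region, service)                      # "s3"
        k_signing = _hmac(k_service, "aws4_request")
        return hmac.new(k_signing, string_to_sign.encode("utf-8"),
                        hashlib.sha256).hexdigest()

Because the computation is symmetric, the same routine can serve a client signing a request or a server verifying one, a property that comes up again below when we talk about testing.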

Storing the API Key Secret

To implement AWS’s request signing in our system, we first needed to figure out how to store the API key secret. A compromise of our database by a hacker who has obtained the hash of the secret for B2 does not allow that hacker to impersonate customers, but if we stored the secret itself, it absolutely would. So we couldn’t store the secret alongside the other application key data. We needed another solution, and it needed to handle the number of application keys we have (millions) and the volume of API requests we service (hundreds of thousands per minute), without slowing down requests or adding additional risks of failure.

Our solution is to encrypt the secret and store that alongside the other application key data in our database. The encryption key is then kept in a secrets management solution. The database already supports the volume of requests we service and decrypting the secret is computationally trivial, so there is no noticeable performance overhead.

With this approach, a compromise of the database alone would only reveal the encrypted version of the secret, which is just as useless as having the hash. Multiple systems must be compromised to obtain the API key secret.
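
In sketch form, the scheme looks something like this. Fernet, from the Python cryptography package, stands in as an illustrative cipher choice, not necessarily what our production system uses:

    from cryptography.fernet import Fernet

    # Lives in the secrets management system, never in the database.
    encryption_key = Fernet.generate_key()

    def encrypt_for_db(api_key_secret: str) -> bytes:
        """Encrypt an API key secret before storing it with the key record."""
        return Fernet(encryption_key).encrypt(api_key_secret.encode("utf-8"))

    def decrypt_for_verification(ciphertext: bytes) -> str:
        """Recover the secret at request time to check a request signature."""
        return Fernet(encryption_key).decrypt(ciphertext).decode("utf-8")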

Implementing the Request Signing Algorithm

We chose to implement only AWS’s Signature Version 4, as Version 2 is deprecated and is not allowed for use on newly created buckets. Within Version 4, there are multiple ways to sign the request: sign only the headers, sign the whole request, sign individual chunks, or use pre-signed URLs. All of these follow a similar pattern but differ enough to warrant individual consideration for testing. We absolutely needed to get this right, so we tested authentication in many ways:

  • Ran through Amazon’s test suite of example requests and expected signatures
  • Tested 20 applications that work with Backblaze S3 Compatible APIs including Veeam and Synology
  • Ran Ceph’s S3-tests suite
  • Manually tested using the AWS command line interface
  • Manually tested using Postman
  • Built automated tests using both the Python and Java SDKs
  • Made HTTP requests directly to test cases not possible through the Python or Java SDKs
  • Hired security researchers to break our implementation

With the B2 Native API authentication model, we can verify authentication by examining the “Authorization” header and only then move on to processing the request, but S3 requests—where the whole request is signed or uses signed chunks—can only be verified after reading the entire request body. For most of the S3 APIs, this is not an issue. The request bodies can be read into memory, verified, and then continue on to processing. However, for file uploads, the request body can be as large as 5GB—far too much to store in memory—so we reworked our uploading logic to handle authentication failures occurring at the end of the upload and to only record API usage after authentication passes.

The different ways to sign requests meant that in some cases we have to verify the request after the headers arrive, and in other cases verify only after the entire request body is read. We wrote the signature verification algorithm to handle each of these request types. Amazon had published a test suite (which is now no longer available, unfortunately) for request signing. This test suite was designed to help people call into the Amazon APIs, but due to the symmetric nature of the request signing process, we were able to use it as well to test our server-side implementation. This was not an authoritative or comprehensive test suite, but it was a very helpful starting point. As was the AWS command line interface, which in debug mode will output the intermediate calculations to generate the signature, namely the canonical request and string to sign.

However, when we built our APIs on top of the signature validation logic, we discovered that our APIs handled reading the request body in different ways, leading to some APIs succeeding without verifying the request, yikes! So there were even more combinations that we needed to test, and not all of these combinations could be tested using the AWS software development kits (SDKs).

For file uploads, the SDKs only signed the headers and not the request body—a reasonable choice on their part. But as implementers, we must support all legal requests, so we made direct HTTP requests to verify whole-request signing and signed-chunk requests. There’s also instrumentation now to ensure that all successful requests are verified.

Looking Back

We expected this to be a big job, and it was. Testing all the corner cases of request authentication was the biggest challenge. There was no single approach that covered everything; all of the above items tested different aspects of authentication. Having a comprehensive and multifaceted testing plan allowed us to find and fix issues we would have never thought of, and ultimately gave us confidence in our implementation.

The post The Path to S3 Compatible APIs: The Authentication Challenge appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Increasing Thread Count: Useful in Sheets and Cloud Storage Speed

Post Syndicated from Troy Liljedahl original https://www.backblaze.com/blog/increasing-thread-count-useful-in-sheets-and-cloud-storage-speed/

As the Solutions Engineering Manager, I have the privilege of getting to work with amazing customers day in and day out to help them find a solution that fits their needs. For Backblaze B2 Cloud Storage, this includes helping them find an application to manage their media—like iconik, or setting up their existing infrastructure with our Backblaze S3 Compatible APIs, or even providing support for developers writing to our B2 Native APIs.

But regardless of the solution, one of the most common questions I get when talking with these customers is, “How do I maximize my performance to Backblaze??” People want to go fast. And the answer almost always comes down to: threads.

What Are Threads?

If you do not know what a thread is used for besides sewing, you’re not alone. First of all, threads go by many different names. Different applications may refer to them as streams, concurrent threads, parallel threads, concurrent upload, multi-threading, concurrency, parallelism, and likely some other names I haven’t come across yet.

But what all these terms refer to when we’re discussing B2 Cloud Storage is the process of uploading files. When you begin to transmit files to Backblaze B2, they are carried by threads. (If you’re dying for an academic description of threads, feel free to take some time with this post we wrote about them.) Multithreading, not surprisingly, is the ability to upload multiple files (or multiple parts of one file) at the same time. It won’t shock you to hear that many threads are faster than one thread. The good news is that B2 Cloud Storage is built from the ground up to take advantage of multithreading—it can take as many threads as you can throw at it for no additional charge, and your performance should scale accordingly. But it does not automatically do so, for reasons we’ll discuss right now.

Fine-tuning

Of course, this begs the question, why not just turn everything up to one million threads for UNLIMITED POWER!!!!????

Well, chances are your device can’t handle or take advantage of that many threads. The more threads you have open, the more taxing it will be on your device and your network, so it often takes some trial and error to find the sweet spot to get optimal performance without severely affecting the usability of your device.

Try adding more threads and see how the performance changes after you’ve uploaded for a while. If you see improvements in the upload rate and don’t see any performance issues with your device, then try adding some more and repeating the process. It might take a couple of tries to figure out the optimal number of threads for your specific environment. You’ll be able to rest assured that your data is moving at optimal power (not quite as intoxicating as unlimited power, but you’ll thank me when your computer doesn’t just give up).
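
To make this concrete, if you’re connecting over our S3 Compatible API from your own code, boto3 exposes the thread count directly through its TransferConfig. In this sketch the endpoint, bucket, part size, and thread count are placeholders to tune for your own environment:

    import boto3
    from boto3.s3.transfer import TransferConfig

    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.us-west-002.backblazeb2.com",  # your region's endpoint
        aws_access_key_id="your-key-id",
        aws_secret_access_key="your-application-key",
    )

    # Split big files into 100MB parts and upload 8 parts at a time. Bump
    # max_concurrency, re-test, and repeat until throughput stops improving.
    config = TransferConfig(
        multipart_threshold=100 * 1024 * 1024,
        multipart_chunksize=100 * 1024 * 1024,
        max_concurrency=8,
    )

    s3.upload_file("big-video.mov", "my-bucket", "big-video.mov", Config=config)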

How To Increase Your Thread Count

Some applications will take the guesswork out of this process and set the number of threads automatically (like our Backblaze Personal Backup and Backblaze Business Backup client does for users) while others will use one thread unless you say otherwise. Each application that works with B2 Cloud Storage treats threads a little differently. So we’ve included a few examples of how to adjust the number of threads in the most popular applications that work with B2 Cloud Storage—including Veeam, rclone, and SyncBackPro.

If you’re struggling with slow uploads in any of the many other integrations we support, check out our knowledge base to see if we offer a guide on how to adjust the threading. You can also reach out to our support team 24/7 via email for assistance in finding out just how to thread your way to the ultimate performance with B2 Cloud Storage.

Veeam

This one is easy—Veeam automatically uses up to 64 threads per VM (not to be confused with “concurrent tasks”) when uploading to Backblaze B2. To increase threading, you’ll need to use per-VM backup files. You’ll find Veeam-recommended settings in the Advanced Settings of the Performance Tier in the Scale-out Repository. (See screenshot below).

Rclone

Rclone allows you to use the --transfers flag to adjust the number of threads up from the default of four. Rclone’s developer team has found that their optimal setting was --transfers 32, but every configuration is going to be different so you may find that another number will work better for you.

rclone sync /Users/Troy/Downloads b2:troydemorclone/downloads/ --transfers 20

Tip: If you like to watch and see how fast each file is uploading, use the --progress (or -P) flag and you’ll see the speeds of each upload thread!

SyncBackPro

SyncBackPro is an awesome sync tool for Windows that supports Backblaze B2 as well as the ability to sync only the deltas of a file (the parts that have changed). SyncBackPro uses threads in quite a few places across its settings, but the part that concerns how many concurrent threads will upload to Backblaze B2 is the “Number of upload/download threads to use” setting. You can find this in the Cloud Settings under the Advanced tab. You’ll notice they even throw in a warning letting you know that too many will degrade performance!

Happy Threading!

I hope this guide makes working with B2 Cloud Storage a little faster and easier for you. If you’re able to make these integrations work for your use case, or you’ve already got your threading perfectly calibrated, we’d love to hear about your experience and learnings in the comments.

The post Increasing Thread Count: Useful in Sheets and Cloud Storage Speed appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze Hard Drive Stats Q2 2020

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/backblaze-hard-drive-stats-q2-2020/

Backblaze Drive Stats Q2 2020

As of June 30, 2020, Backblaze had 142,630 spinning hard drives in our cloud storage ecosystem spread across four data centers. Of that number, there were 2,271 boot drives and 140,359 data drives. This review looks at the Q2 2020 and lifetime hard drive failure rates of the data drive models currently in operation in our data centers and provides a handful of insights and observations along the way. As always, we look forward to your comments.

Quarterly Hard Drive Failure Stats for Q2 2020

At the end of Q2 2020, Backblaze was using 140,059 hard drives to store customer data. For our evaluation we remove from consideration those drive models for which we did not have at least 60 drives (see why below). This leaves us with 139,867 hard drives in our review. The table below covers what happened in Q2 2020.

Backblaze Q2 2020 Annualized Hard Drive Failure Rates Table

Notes and Observations

The Annualized Failure Rate (AFR) for Q2 2020 was 0.81%, versus 1.07% for Q1 2020. The Q2 AFR number is the lowest AFR for any quarter since we started keeping track in 2013. In addition, this is the first time the quarterly AFR has been under 1%. One year ago (Q2 2019), the quarterly AFR was 1.8%.

During this quarter, three drive models had 0 (zero) drive failures: the Toshiba 4TB (model: MD04ABA400V), the Seagate 6TB (model: ST6000DX000), and the HGST 8TB (model: HUH728080ALE600). While the Toshiba 4TB drives recorded less than 10,000 drive days in the quarter, we have not had a drive failure for that model since Q4 2018, or 54,054 drive days. On drive days, the Seagate 6TB and HGST 8TB drives are just as impressive, recording no failures across 80,626 and 91,000 drive days, respectively, in Q2 2020.

There were 192 drives (140,059 minus 139,867) that were not included in the list above because we did not have at least 60 drives of a given model. For example, we have 20 Toshiba 16TB drives (model: MG08ACA16TA) that we are putting through our certification process. On the other end of the spectrum, we still have 25 HGST 4TB drives (model: HDS5C4040ALE630) putting in time in Storage Pods. Observant readers might note the model number of those HGST drives and realize they were the last of the drives produced with Hitachi model numbers.

Reminiscing aside, when we report quarterly, yearly, or lifetime drive statistics, those models with less than 60 drives are not included in the calculations or graphs. We use 60 drives as a minimum as there are 60 drives in all newly deployed Storage Pods. Note: The Seagate 16TB drive (model: ST16000NM001G) does show 59 drives and is listed in the report because the one failed drive had not been replaced at the time the data for this report was collected.

That said, all the data from all of the drive models, including boot drives, is included in the files which can be accessed and downloaded on our Hard Drive Test Data webpage.

What We Deployed in Q2

We deployed 12,063 new drives and removed 1,960 drives via replacements and migration in Q2, giving us a net of 10,103 added drives. Below is a table of the drive models we deployed.

Table: Drives Deployed in Q2 2020

Quarterly Trends by Manufacturer

Quarterly data is just that: data for only that quarter. At the beginning of each quarter we wipe out all the previous data and start compiling new information. At the end of the quarter, we bundle that data up into a unit (collection, bag, file, whatever) and name it: Q2 2020, for example. This is the type of data you were looking at when you reviewed the Quarterly Chart for Q2 2020 shown earlier in this report. We can also compare the results for a given quarter to other quarters, each its own unique bundle of data. This type of comparison can reveal trends that can help us identify something that needs further attention.

The chart below shows the AFR by manufacturer using quarterly data over the last three years. Following the chart are two tables. The first is the data used to create the chart. The second is the count of the number of hard drives corresponding to each quarter for each manufacturer.

Backblaze Quarterly Annualized Hard Drive Failure Rates by Manufacturer Chart
Quarterly Annualized Hard Drive Failure Rates and Drive Count by Manufacturer Tables

Notes

    1. The data for each manufacturer consists of all the drive models in service which were used to store customer data. There were no boot drives or test drives included.
    2. The 0.00% values for the Toshiba drives from Q3 2017 through Q3 2018 are correct. There were no Toshiba drive failures during that period. Note, there were no more than 231 drives in service at any one time during that same period. While zero failures over five quarters is notable, the number of drives is not high enough to reach any conclusions.
    3. The “n/a” values for the WDC drives from Q2 2019 onward indicate there were zero WDC drives being used for customer data in our system during that period. This does not consider the newer HGST drive models branded as WDC as we do not currently have any of those models in operation.

Observations

    1. WDC: The WDC data demonstrate how having too few data points (i.e. hard drives) can lead to a wide variance in quarter to quarter comparisons.
    2. Toshiba: Just like the WDC data, the number of Toshiba hard drives for most of the period is too low to reach any decent conclusions, but beginning in Q4 2019, that changes and the data from then on is more reliable.
    3. Seagate: After a steady rise in AFR, the last two quarters have been kind to Seagate, with the most recent quarter (AFR = 0.90%) being the best we have ever seen from Seagate since we started keeping stats back in 2013. Good news and worthy of a deeper look over the coming months.
    4. HGST: With the AFR fluctuating between 0.36% and 0.61%, HGST drives win the prize for predictability. Boring, yes, but a good kind of boring.

Cumulative Trends by Manufacturer

As opposed to quarterly data, cumulative data starts collecting data at a given point and new data is added until you stop collecting. While quarterly data reflects the events that took place during a given quarter, cumulative data is everything about our collection of hard drives over time. Using cumulative data, we can see longer term trends over the period, as in the chart below, with the data table following.
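
The distinction is easy to see in miniature. With made-up numbers for three quarters, the quarterly view computes an AFR for each bundle on its own, while the cumulative view pools every bundle before computing:

    quarters = {
        "Q4 2019": (1_000_000, 40),  # (drive days, failures); illustrative only
        "Q1 2020": (1_100_000, 32),
        "Q2 2020": (1_200_000, 27),
    }

    def afr(drive_days: float, failures: int) -> float:
        return failures / (drive_days / 365) * 100

    # Quarterly view: each bundle stands alone.
    for name, (days, fails) in quarters.items():
        print(name, round(afr(days, fails), 2))  # 1.46, 1.06, 0.82

    # Cumulative view: pool all bundles collected so far, then compute once.
    total_days = sum(days for days, _ in quarters.values())
    total_fails = sum(fails for _, fails in quarters.values())
    print("Cumulative", round(afr(total_days, total_fails), 2))  # 1.09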

Backblaze Cumulative Annualized Hard Drive Failure Rates by Manufacturer Chart
Cumulative Annualized Hard Drive Failure Rates by Manufacturer Table

Down and to the Right

For all manufacturers, you can see a downward trend in AFR over time. While this is a positive occurrence, we do want to understand why and incorporate those learnings into our overall understanding of our environment—just like drive failure, drive “non-failure” matters too. Maybe hard drives are getting better, maybe we’ve added so many new drives in the last three years that they dominate the statistics, or maybe it’s something else. If you have any thoughts on the subject, let us know in the comments.

Lifetime Hard Drive Failure Rates

The table below shows the lifetime AFR for the hard drive models we had in service as of June 30, 2020. The reporting period is from April 2013 through June 30, 2020. All of the drives listed were installed during this timeframe.

Backblaze Lifetime Annualized Hard Drive Failure Rates

Notes and Observations

The lifetime AFR was 1.64%, the lowest since we started keeping track in 2013. In addition, the lifetime AFR has fallen from 1.86% in Q2 2018 to the current value, even as we’ve passed milestones like an exabyte of storage under management, opening a data center in Amsterdam, and nearly doubling the size of the company. A busy two years.

All of the Seagate 12TB drives (model: ST12000NM001G) were installed in Q2, so while we have a reasonable amount of data, as a group these drives are still early in their lifecycle. While not all models follow the bathtub curve as they age, we should wait another couple of quarters to see how they are performing in our environment.

The Seagate 4TB drives (model: ST4000DM000) keep rambling along. With an average age of nearly five years, they are long past their warranty period (one or two years depending on when they were purchased). Speaking of age, the drive model with the highest average age on the chart is the Seagate 6TB drive at over 64 months. That same model had zero failures in Q2 2020, so they seem to be aging well.

The Hard Drive Stats Data

The complete data set used to create the information used in this review is available on our Hard Drive Test Data webpage. You can download and use this data for free for your own purpose. All we ask are three things: 1) You cite Backblaze as the source if you use the data, 2) You accept that you are solely responsible for how you use the data, and 3) You do not sell this data to anyone—it is free.
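If you’d like to reproduce failure rates yourself, a sketch along these lines is one place to start. It assumes you’ve unpacked the daily snapshot CSVs (which include date, model, and failure columns) into a local folder; the path is a placeholder.

    import glob

    import pandas as pd

    # Each daily CSV has one row per drive in service that day; the
    # `failure` column is 1 on the day a drive fails, otherwise 0.
    frames = [
        pd.read_csv(path, usecols=["date", "model", "failure"])
        for path in glob.glob("drive_stats/*.csv")  # placeholder path
    ]
    drives = pd.concat(frames, ignore_index=True)

    # One row per drive per day, so the row count is the drive day count.
    stats = drives.groupby("model").agg(
        drive_days=("failure", "size"), failures=("failure", "sum")
    )
    stats["afr_pct"] = stats["failures"] / stats["drive_days"] * 365 * 100
    print(stats.sort_values("afr_pct"))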

If you just want the summarized data used to create the tables and charts in this blog post you can download the ZIP file containing the MS Excel spreadsheet.

Good luck and let us know if you find anything interesting.

The post Backblaze Hard Drive Stats Q3 2020 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

What Is an API?

Post Syndicated from Nicole Perry original https://www.backblaze.com/blog/what-is-an-api/

What is an API?

Driving on 101 in San Francisco (the highway that runs between San Francisco and San Jose, connecting most of what people know as “Silicon Valley”), you see a lot of different billboards for tech-related business solutions. One that stood out the first time I saw it was a sign that read, “Learn How to Use Your API Correctly,” and of course my first instinct was to search “What is an API?” to answer the question for myself.

Things have come a long way since then, but in recent months here at Backblaze, it became clear that understanding exactly how APIs work was very important. And yet, on the verge of our biggest launch in years—our Backblaze S3 Compatible APIs—I realized that I was far from alone in not fully getting the functionality of APIs.

If you’re thinking about onboarding some new tools in your home or office that you hope will play well with your existing technology, understanding APIs can be hugely helpful. But when you search “What is an API” on Google, you get a lot of technical jargon that might be over your head at first glance.

To better understand what an API actually is, you need to break the definition into parts. We talked to some of our more patient engineers to make sure we understand exactly why developers and businesses are so excited about our new suite of Backblaze S3 Compatible APIs.

Defining an API (Definition #1)

The abbreviation, API, stands for application programming interface. But that doesn’t really answer the question. To get started more constructively, let’s break apart the abbreviation.

Defining “Application”

Application: An application is the software, website, iOS app, or almost any digital tool that you might want to use.

Defining “Programming”

Programming: This is what the developer—the person who built the digital tool you want to use—does. They program the software that you will be using, but they also program how that software can interact with other software. Without this layer of programming, those of us who aren’t developers would have a difficult time getting different applications to work together.

Defining an “Interface”

Interface: An interface is where applications or clearly defined layers of software talk to each other. In the cases we discuss here, we will focus on APIs that talk between applications. Just as a user interface makes it easier for the average person to interact with a website, APIs make it easier for developers to make different applications interact with one another.

Defining an API (Definition #2)

Okay, so now we know what makes up an API on a basic level. But really, what does it do?

Most simply, APIs are the directions for interaction between different types of software. A common metaphor for how an API works is the process of ordering dinner at a restaurant.

When you’re sitting at a table in a restaurant, you’re given a menu of choices to order from. The kitchen is part of the “system” and they will be the ones who prepare your order. But how will the kitchen know what you are picking from the menu? That’s where your waiter comes into play. In this example, the waiter is the communicator (the API) that takes what you chose from the menu and tells the kitchen what needs to be made. The waiter then takes what the kitchen makes for you and brings it back to your table for you to enjoy!

In this example, the kitchen is one application that makes things, the customer is an application that needs things, the menu is the list of API calls you can make, and the waiter is the programming interface that communicates the order of things back and forth between the two.
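Here’s the same idea in a few lines of Python. The URL, parameters, and response fields are made up purely for illustration, but the shape of the exchange (a request goes out, structured data comes back) is what every web API shares.

    import requests

    # The "menu": a documented endpoint and the parameters it accepts.
    # (Hypothetical URL and parameters, for illustration only.)
    response = requests.get(
        "https://api.example.com/v1/weather",
        params={"city": "San Francisco", "units": "metric"},
    )

    # The "waiter" returns what the "kitchen" prepared: structured data
    # your application can use without knowing how it was made.
    order = response.json()
    print(order["temperature"], order["conditions"])  # hypothetical fields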

An App Waiting in a Cloud Storage Sandbox

Types of APIs

With a working definition, it’s easier to understand how APIs work in practice. First, it’s important to understand the different types of APIs. Programmers choose from a variety of API types depending on the “order” you make off your “menu.” Some of these APIs have been created for public use and are widely and openly available to developers, some are for use between specific partners, and some of these APIs are proprietary, built specifically for one platform.

Public APIs

A public API is available to everyone (typically without having to pay any royalties to the original developer). There are few restrictions on how they can be used, beyond the intention that the API should be easily consumed by the average programmer and accessible to as many different clients as possible. This helps software developers in that they don’t have to start from scratch every time they write a program—they can simply pull from public APIs.

For example, Amazon has released some public APIs that allow website developers to easily link to Amazon product information. These APIs communicate up-to-date prices so that individuals maintaining websites no longer need to update a link every time the price of a product they’ve listed changes.

Partner APIs

A partner API is publicly promoted but only shared with business partners who have signed an agreement with the API publisher. A common use case for partner APIs is a software integration between two businesses that have agreed to work together.

With our Backblaze B2 Cloud Storage product, we have many different integration partners that use partner APIs to better support customers’ unique use cases. A recent integration we have announced is with Flexify.IO. We work together using partner APIs to help customers migrate large amounts of data from one place to another (like from AWS to Backblaze B2 Cloud Storage).

Internal APIs

An internal (or private) API is only for use by developers working within a single organization. The API is exposed only within the business and can be adjusted as needed to meet the needs of the company or its customers. When new applications are created, they can be released publicly for consumer use, but the interface itself remains invisible to anyone outside the organization.

We wish we could share an example, but that would mean the API would be public and no longer “internal.” A company like Facebook provides a hypothetical example: Facebook owns WhatsApp, Instagram, and numerous other applications that it has worked to link together. It’s likely that these different applications speak to one another via internal APIs that an outside developer would not have access to.

Defining Backblaze APIs

Typically, companies use a variety of public, partner, and internal APIs to provide products and services. This holds true with Backblaze, too. We use a balance of our own internal APIs and we also use some public-facing APIs.

In our data centers, our Backblaze Vault architecture combines 20 Storage Pods that work collectively to store customers’ files. When a customer uploads a file using the public B2 Native API call, we use our internal APIs to break the file into 20 pieces (we call them “Shards”) and spread them across all 20 Pods. Why? This is our way of keeping your data safe in the cloud. (You can read more about that process here.)
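To sketch the public half of that exchange, here’s what an upload can look like using b2sdk, our Python library for the B2 Native API. The bucket name, file names, and credentials are placeholders; the sharding across a Vault’s 20 Pods happens behind the scenes after this one call.

    from b2sdk.v2 import B2Api, InMemoryAccountInfo

    # Authorize with an application key (placeholder credentials).
    b2_api = B2Api(InMemoryAccountInfo())
    b2_api.authorize_account("production", "YOUR_KEY_ID", "YOUR_APPLICATION_KEY")

    # One public API call; internally, the file is broken into shards
    # and spread across the 20 Storage Pods of a Backblaze Vault.
    bucket = b2_api.get_bucket_by_name("my-photos")  # placeholder bucket
    bucket.upload_local_file(
        local_file="vacation.jpg", file_name="photos/vacation.jpg"
    )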

A common public API we have implemented is Google Single Sign-on (SSO) authentication. SSO systems allow websites to use trusted sites to verify users. As a user, you would want to use SSO to add an extra layer of security to your account. This public API allows users with Google Accounts to access their Backblaze account with their Google credentials.

At Backblaze, our engineering team has also created some public APIs with our product, B2 Cloud Storage, that are available for anyone interested in using our cloud storage product as part of their workflow. If you have ever uploaded data using Backblaze B2, then you have used our B2 Native APIs. Backblaze supports two different suites of APIs: B2 Native APIs and, more recently, Backblaze S3 Compatible APIs.
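For the curious, the first step of any B2 Native API session is an authorization call. A minimal sketch with Python’s requests library might look like this; the key ID and application key are placeholders.

    import requests

    # Step one of the B2 Native API: exchange an application key for an
    # authorization token and the base URL for subsequent calls.
    resp = requests.get(
        "https://api.backblazeb2.com/b2api/v2/b2_authorize_account",
        auth=("YOUR_KEY_ID", "YOUR_APPLICATION_KEY"),  # placeholders
    )
    resp.raise_for_status()
    auth = resp.json()
    print(auth["apiUrl"])              # base URL for later API calls
    print(auth["authorizationToken"])  # sent in the Authorization header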


What Makes for a Successful API?

According to one of our lead engineers, there is a list of criteria that make an API successful or desirable. These criteria include:

      1. Ease of use for programmers
      2. Functionality
      3. Security
      4. Performance
      5. How widely it’s adopted/in use
      6. Longevity (as in, how long will your application be able to use this API without any changes)
      7. Financial cost to license or use

When designing a new API or choosing from an existing API, there are other elements that can come into play, but these are front of mind for our team.

The challenge is that these criteria often compete with one another. For example, PostScript has excellent functionality and is widely adopted as a way to communicate with printers, but unlike many APIs, it carries a high licensing cost that must be paid to Adobe (which created PostScript).

There are lots of trade-offs to consider when developing APIs, so it’s rare to achieve a clean sweep: a totally free, very fast, totally secure, easy-to-use API that also has a complete set of functionality and longevity. But that’s the ideal.

So What Does That Mean for You?

If you’ve read this far, then you’ve gained a sense for the basics of APIs. This may have even gotten you thinking about creating some APIs of your own. Either way, if you want to understand the different pieces of technology you use every day and how they communicate, APIs are the key.

The next time you’re looking into a new piece of technology that might make your work or home life easier, you’ll know to ask the question: What sort of APIs does this tool support? Will it work with my website/hardware/application? And if it works, will it work well and continue to be supported?

Do you have a unique way you explain how APIs work to your friends? Feel free to share those in the comments section below!

The post What Is an API? appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Floods, Viruses, and Volcanoes: Managing Supply Chain in Uncertain Times

Post Syndicated from Ahin Thomas original https://www.backblaze.com/blog/managing-supply-chain-in-uncertain-times/

There’s almost no way to quantify the impacts COVID-19 has had on the world. Personally, communally, and economically—there isn’t a part of our lives it hasn’t touched in some way. We’ve discussed how it’s affected our operations and our culture, but at the end of the day, the central focus at Backblaze is providing the best cloud storage and backup in the world—a mission that’s especially important in a time when cloud storage and data security has become more vital to day-to-day life than ever.

At the most basic level, our services and products rely on a singular building block: the hard drive. And today, we’re going to discuss how our team has ensured that, as more businesses and individuals turn to cloud storage to solve their rapidly evolving data storage and management needs, we’ve had what we need to care for the petabytes of inbound data.

We’re no strangers to navigating an external threat to business as usual. In 2011, flooding in Thailand impacted nearly 50% of the world’s hard drive manufacturing capability, limiting supply and dramatically raising hard drive prices. At the time, Backblaze was only about four years into providing its computer backup service, and we needed to find a way to keep up with storage demand without going broke. We came up with a hack that became internally known as “drive farming.”

What does it mean to farm hard drives? Well, everyone on our staff, and many of our friends and family members, went out and bought every hard drive we could get our hands on at every retail outlet nearby. It was a bit unconventional, but it worked to keep up with our storage demand. We wrote the whole story about how we weathered that crisis without compromising our services in this blog post.

This year, most of us thought the eruption of the volcano Taal in the Philippines was going to be the biggest threat to the hard drive supply chain. We were wrong. Instead, we’ve been called to apply some of the resourcefulness we learned during the Thailand drive crisis to deal with the disruptions to production, manufacturing, and supply chains that COVID-19 has caused.

No, this isn’t “Drive Farming: Part II, the Drivening!” Rather, faced with an uncertain and rapidly shifting business environment, we turned to someone on our team who knew, even before 2020 began, that a global pandemic was a much more likely challenge to our operations than any volcano: our Senior Director of Supply Chain, Ariel Ellis.

Recently, Ahin (our VP of Marketing) sat down with Ariel to discuss how he has been managing our supply chain efforts within the context of these extraordinary times. The Q&A that follows has been edited for brevity (give a marketer a microphone…). It covers a wide range of topics: how business has changed since the emergence of COVID; how our supply chain strategy adjusted; and what it’s like for Ariel to do all of this while battling COVID himself.

A hand holding hard drives up.

Ahin Thomas: Wow! What a ride. Let’s start by understanding the baseline—what was considered “business as usual” in the supply chain before COVID? Can you give me a sense of our purchasing volumes of hard drives and who makes them?

Ariel Ellis: Pre-COVID we were buying hard drives on a quarterly basis and deploying around 20-30PB of data storage a month. We were doing competitive bidding between Seagate, Toshiba, and Western Digital—the only three hard drive manufacturers in the world.

AT: It doesn’t seem that long ago that 30PB in a year would have been a big deal! But are you saying things were pretty stable pre-COVID?

Ariel: Everything was relatively stable. I joined Backblaze in 2014, and, pre-COVID, 2019 was probably the most consistent year for the hard drive supply chain that I have seen during my tenure.

AT: Well that’s because neither of us was here in 2011 when the floods in Thailand disrupted the global hard drive supply chain! How did the industry learn from 2011 and did it help in 2020?

Ariel: The Thailand flooding caught the manufacturers, and the industry, off guard. Since then the manufacturers have become better at foreseeing disruptions and having contingency plans in place. They’ve also become more aware of how much routine cloud storage demand there is, so they are increasingly thoughtful about businesses like ours and about making sure supply is provided accordingly. It’s worth noting that the industry has also really shifted—manufacturers are no longer trying to provide high-capacity hard drives for personal computers at places like Costco and Best Buy, because consumers now use services like ours instead.

AT: Interesting. How did we learn from 2011?

Ariel: We now have long term planning in place, directly communicate with the manufacturers, and spend more time thinking about durability and buffers.

Editor’s note: Backblaze Vaults’ durability is calculated at 11 nines, and you can read more about how we arrived at that number and what it means here.

I was actually brought in right after the Thailand crisis because Backblaze realized that they needed someone to specialize in building out supply strategies.

The Thailand flooding really changed the way we manage storage buffers. I run close to three to four months of already deployed, forecasted storage as a buffer. We never want to get caught without availability that could jeopardize our durability. In a crisis, this four-month buffer should provide me with enough time to come up with an alternative solution to traditional procurement methods.

Our standard four-month deployed storage buffer is designed to withstand either a sudden rise in demand (increase to our burn rate), or an unexpected shortage of drives—long enough that we can comfortably secure new materials. Lead times for enterprise hard drives are in the 90-day range, while manufacturing for our Pods is in the 120-day range. In the event of a shortage we will immediately accelerate all open orders, but to truly replenish a supply gap it takes about four months to fully catch up. It’s critical that I maintain a safety buffer large enough to ensure we never run out of storage space for both existing and new customers.

As soon as we recognized the potential risks that COVID-19 posed for hard drive manufacturing, we decided to build a cache of hard drives to last an additional six months beyond the deployed buffers. This was a measured risk because we lost some price benefits due to buying stock early, but we decided that having plenty of hard drives “in house” was worth more than any potential cost savings later in the year. This proved to be the correct strategy because manufacturers struggled for several months to meet supply and prices this year have not decreased at the typical 5% per quarter.

AT: So, in a sense, you can sacrifice dollars to help remove risk. But, you probably don’t want to pull that lever too often (or be too late to pull it, either). When did you become aware of COVID and when was it clear to you that it would have a global impact?

Ariel: As a person in charge of supply chains, I had been following COVID since it hit the media in late December. As soon as China started shutting down municipalities I recognized that this was going to have a global impact and there would be scarcities. By January most of us in the industry were starting to ask the question: How is this going to affect us? I wasn’t getting a lot of actionable feedback from any of the manufacturers, so we knew something was coming but it was very hard to figure out what to do.

AT: That’s tough—you can see it coming but can’t tell how far away it is. But seeing it in January—on a relative basis—is early. How did you get to that point?

Ariel: I’m part of the Backblaze COVID preparation team and the Business Continuity Team, which is a standing team of cross-functional leaders that are part of the overall crisis response plan—we don’t want to have to meet, but we know what to do when it happens. I also had COVID. As we were making firm business decisions on how to plan for disruptions I developed a cough, a fever, and had to take naps to make it through the day. It was brutal.

Editor’s note: Ariel was already working from home at this time per our decision to move the majority of our workforce to working from home in early March. He also isolated himself while conducting 100% of his work remotely.

In December of 2019, we realized we had to stay ahead of decision making on sourcing hard drives. We had to be aggressive and we had to be fast. We first discussed doing long term contracts with the manufacturers to cover the next 12 months. Then, as a team, we realized that contracts weren’t an option because if shelter-in-place initiatives were rolled out across the country then we were going to lose access to the legal teams and decision makers needed to make that process work. It was during the second week of March 2020 that we decided to bypass long term contracts and do the most viable thing we could think of, which was to issue firm purchase orders. A purchase order accepted between both companies is the most certain way to stay at the front of the line and ensure hard drive stock.

We immediately committed to purchase orders for the hard drives needed to cover six months out. This was on top of our typical four-month deployment buffer and would ultimately give us about 10 months of capacity. This is a rolling six months, so since then I’ve continued to ensure we have an additional six months of capacity committed.

Issuing these purchase orders required a great deal of effort and coordination across the Business Continuity Team, and in particular with our finance team. I worked side-by-side with our chief financial officer to quickly leverage the resources needed to commit to stock outside of our normal cycles. We ordered around 40,000 hard drives rapidly, which is about 400PB of usable space (meaning after parity), or roughly $10 million worth of capital equipment. Overall, this action has proved to be smart and put us one to two weeks ahead of the curve.

AT: We’re all grateful you made it through. OK, so a couple weeks into a global pandemic, while you’ve contracted COVID-19, we increased our purchasing by an order of magnitude! How are the manufacturers performing? Are we still waiting on drives from the purchase orders we issued?

Ariel: We’ve deployed many of the drives we’ve received, and we have a solid inventory of about 20,000 drives—which equals about a couple hundred petabytes of capacity—but we’ve continued to add to the open orders and are still waiting for around 20,000 drives to finish out the year. The answer to manufacturer performance changes on a constant basis. All three manufacturers have struggled due to mandated factory shutdowns, limited transportation options, and component shortages. We consistently experience small-to-medium delays in shipments, which was somewhat expected and the reason we extended our material buffers.

AT: Is there a sense of “new normal” for the buffer? Will it return to four months?

Ariel: This is going to change my world forever. Quarterly buying and competitive bid-based strategies were a calculated risk, and the current crisis has caused me to rethink risk calculation. Moving forward, we are going to better distribute our demand across the three manufacturers so I stay front and center if there is ever constrained supply. We will also be reassessing quarterly bidding, which, while price effective, gives us limited capacity and is somewhat short-sighted. It might be more advantageous to look at six-month, and maybe even rough 12-month, capacity plans with the manufacturers.

This year has reminded me how tenuous the supply of enterprise hard drives is for a company at our scale. Hard drives are our lifeblood, and the manufacturers rely on a handful of cloud storage companies like us as the primary consumers of high-capacity storage. I will continue to develop long term supply strategies with each of the manufacturers as I plan the next few years of growth.

AT: I know we are still very much in the middle of the pandemic, but have things somewhat stabilized for your team?

Ariel: From a direct manufacturing perspective we’re just now starting to see a return to regular manufacturing. In Malaysia, Thailand, and the Philippines, there were government-imposed factory shutdowns. Those restrictions are slowly being lifted and production is returning to full capacity in steps. Assuming there is no pendulum swing back to reinfection in those areas, the factories expect to return to full capacity any day now. It’s going to take a number of months for them to work through their backlog of orders, so I would expect that by October we will see a return to routine manufacturing.

It’s interesting to point out that one of the most notable impacts of COVID to the supply chain was not loss of manufacturing, but loss of transportation. In fact, that was the first challenge we experienced—factories having 12,000 hard drives ready to ship, but they couldn’t get them on an airplane.

AT: This might be a bit apocalyptic, but what was your worst case scenario? What would have happened if you couldn’t secure drives?

Ariel: We would fully embrace our scrappy, creative spirit, and I would pursue any number of secondary options. For example, deploying Dell servers, which come with hard drives, or looking for recertified hard drives: drives that were made for one of the tier one hardware manufacturers but went unused and were returned to the factory to be retested and recertified. A more readily actionable option would have been slowing down our growth rate. While not ideal, we could certainly pump the brakes on accelerating new customer acquisition and growth, which would extend the existing buffers and give us breathing room.

Seagate 12 TB hard drive

AT: Historically, the manufacturers have a fairly consistent cycle of increased densities and trying to drive down costs. Are you seeing any trends in drive development driven by this moment or is everyone simply playing catch-up?

Ariel: It’s unclear how much this year has slowed down new technology growth as we have to assume the development labs haven’t been functioning as normal. Repercussions will become clear over the next six months, but as of now I have to assume that the push towards higher capacities, and new esoteric technologies to get to those higher capacities, has been delayed. I expect that companies are going to funnel all of their resources into current platforms, meeting pent up demand, and rebuilding their revenue base.

AT: Obviously, this is not a time for predicting the future—anyone who had been asked about 2020 in February was likely very wrong, after all—but what do you see in the next 12 months?

Ariel: I don’t think we’ve seen all of the repercussions from manufacturing bottlenecks. For example, there could be disruption to the production of the subcomponents required to make hard drives that the manufacturers have yet to experience because they have a cache of them. And whether it’s through lost business or lost potential, the hard drive manufacturers’ revenue streams are going to take a hit. We are deeply vested in seeing the hard drive manufacturers thrive so we hope they are able to continue business as usual and are excited to work with us to grow more business.

I think there will also be a further shift towards hard drive manufacturers relying on their relationship with cloud storage providers. During COVID, cloud service providers either saw no decline in business or an increase in business with people working from home and spending more time online. That is just going to accelerate the shift of hard drive production going exclusively towards large scale infrastructure instead of being dispersed amongst retail products and end-users across the planet.

• • •

This year doesn’t suffer from a lack of unexpected phenomena. But for us, it is especially wild to sit here in 2020—just nine years after our team and our customers were scouring the country to “drive farm” our way to storage capacity—and listen to our senior director of supply chain casually discussing his work with hard drive manufacturers to ensure that they can thrive. Backblaze has a lot of growth yet in its future, but this is one of those moments that blows our hair back a bit.

Even more surprising is that our team hasn’t changed that much. Something Ariel mentioned after we finished our conversation was how, when he had to go offline due to complications from COVID, the rest of our team easily stepped in to cover him in his absence. And a lot of those folks that stepped into the breach were the same people wheeling shopping carts of hard drives out of big box stores in 2011. Sure, we manage more data and employ more people, but when it comes down to it, the same scrappy approach that got us where we are today continues to carry us into the future.

Floods, viruses, volcanoes: We’re going to have more global disruptions to our operations. But we’ve got a team that’s proven for 14 years that they can chart any uncertain waters together.

The post Floods, Viruses, and Volcanoes: Managing Supply Chain in Uncertain Times appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

How to Leave AWS: Backblaze S3 Compatible APIs & Free Cloud to Cloud Migration

Post Syndicated from Ahin Thomas original https://www.backblaze.com/blog/aws-to-backblaze-migration/

Spoiler alert: At the end of this post, we announce our Cloud to Cloud Migration program—an offer to pay the transfer costs for customers that want to migrate their data from Amazon S3 to Backblaze B2 Cloud Storage. Yup, we’re so confident in our service that we’ll pay for you to save 75% of your cloud storage bill. If you want to stop reading and start saving: Click here.

On May 4th, we released the beta version of our Backblaze S3 Compatible APIs. It was our most requested feature, so we knew it was something our customers wanted. But what we’ve seen has been simply incredible—thousands of customers uploading petabytes of data to Backblaze B2 Cloud Storage. Today, we’re moving those S3 Compatible APIs out of beta and into general availability (GA), as we continue to remove the barriers that are keeping your data locked within the cloud oligarchy.

Not Just Compatible, VERY Compatible

At Backblaze, we take great pride in our track record of enduring innovation. Whether it’s our Storage Pod, Reed-Solomon erasure coding, or our S3 Compatible APIs, when we release something to the public, our sense of craftsmanship demands that it should just work.

By moving our S3 Compatible APIs into GA, we’re announcing that you can expect a bug-free, highly functional, and stable experience.

Our functionality and associated infrastructure are fully ready for global, exabyte-scale business. Nick Craig-Wood, the founder of rclone, put it well: our Backblaze S3 compatible layer is not just compatible—according to him, it’s “Very compatible.”

“Very Compatible” Means It Works with Your Workflows

Compatibility testing suites are helpful, but what matters most is the actual experience customers are having. During the beta period, more than 1,000 unique S3-compatible tools were used by customers to interact with our new Backblaze S3 compatible layer. We know this because of something called a user agent—an identifier that tells our servers what tool is being used to upload data. Our monitoring looks at the system as a whole as well as at user agents to make sure things are performing as planned. And everything is going smoothly!

Tools like CloudBerry, MinIO, and Synology are among the third parties that had pre-existing integrations with Backblaze but now also have customers uploading through our S3 Compatible APIs. Perhaps more notably, customers brought a wide variety of tools that we had not previously seen upload data to Backblaze. User agents from Commvault, Cohesity, and Veeam all register now in our internal reporting. As do many AWS SDKs and even AWS Lambda.

The recurring theme? Customers can point their existing workflows and tools at Backblaze B2 and not miss a beat.
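As a concrete sketch of what that looks like in code, here’s the AWS SDK for Python (boto3) pointed at Backblaze B2. The endpoint region, bucket name, and credentials are placeholders; you’d use the S3 endpoint shown in your own bucket’s details.

    import boto3

    # The same client code that talked to Amazon S3, with one change:
    # the endpoint URL now points at Backblaze B2.
    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.us-west-002.backblazeb2.com",  # placeholder region
        aws_access_key_id="YOUR_KEY_ID",
        aws_secret_access_key="YOUR_APPLICATION_KEY",
    )

    s3.upload_file("backup.tar.gz", "my-bucket", "backups/backup.tar.gz")
    for obj in s3.list_objects_v2(Bucket="my-bucket").get("Contents", []):
        print(obj["Key"], obj["Size"])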

What It Means for Our Customers

During our beta period, we’ve seen literally thousands of success stories. One of our favorites comes from a company called CloudSpot—a software as a service platform offering photographers the easiest way to share their work online.

CloudSpot’s storage infrastructure is a critical component of both their product and P&L. But, with his company scaling, CEO Gavin Wade realized that his data (and company) were captive to Amazon S3. As CloudSpot grew, his storage related costs threatened to turn his business upside down.

With over 700TB stored and data transfer fees starting at 9 cents/GB, Gavin felt stuck inside of Amazon. He had to start cutting back valued functionality for his customers simply because the AWS pricing was untenable.

With B2 Cloud Storage—which is one-fourth of the cost of Amazon S3—Gavin has slashed his cloud bill, freeing up cash for critical investments in his business and team. After seeing a seamless transition for his active workflows, he migrated over 700TB from Amazon S3 to B2 Cloud Storage in less than six days. Most importantly, there was no service disruption.

“With Backblaze, we have a system for scaling infinitely. It lowers our breakeven customer volume while increasing our margins, so we can reinvest back into the business. My investors are happy. I’m happy. It feels incredible.”

—Gavin Wade, Founder & CEO, CloudSpot

The CloudSpot story is a good one, but it’s just one of many from our beta period. Customers are tired of being taxed to use their data and need a platform that will not punish them for scaling. We’re grateful that these customers are migrating to Backblaze and have accelerated the month-over-month growth rate of B2 Cloud Storage by more than 25%. Today, we’d like to encourage you to join those that have liberated their data while significantly reducing their costs.

Transcend the Cloud Oligarchy: Backblaze Will Pay for You to Move Your Data

People want storage that will empower their business and they want a provider that doesn’t try to hide fine print or surprise fees. With our S3 Compatible APIs now generally available, they can have both, all while knowing that their tools and workflows will seamlessly integrate with Backblaze B2.

But customers still need a solution for the excessive fees the cloud oligarchy charges for migrating to a cloud that is better for the customer’s business.

Today, we’re proud to remove the last obstacle between you and shaving 75% off your cloud storage costs: Cloud to Cloud Migration.

And for customers that don’t want to commit to storing their data with us for 12 months? No problem: you can still use our service to transfer your data directly from Amazon S3 to Backblaze B2 for 4 cents/GB. By month three, your storage savings will have paid for the migration.
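The month-three claim follows from simple arithmetic, sketched below with this post’s numbers: $5/TB-month ($0.005/GB) for B2 and roughly four times that for Amazon S3 storage (an assumption; actual S3 rates vary by tier and region), plus the 4 cents/GB migration fee.

    B2_STORAGE = 0.005   # $/GB-month ($5/TB-month)
    S3_STORAGE = 0.020   # $/GB-month (assumed: roughly 4x the B2 rate)
    MIGRATION = 0.04     # $/GB, one-time direct transfer fee

    monthly_savings = S3_STORAGE - B2_STORAGE          # $0.015 per GB per month
    months_to_breakeven = MIGRATION / monthly_savings  # about 2.7 months
    print(f"Migration pays for itself in {months_to_breakeven:.1f} months")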

And if you’re interested in trying Backblaze first? Creating an account is free, your first 10GB of storage are free, and there’s never been a better time to start.

Thanks to all the people, companies, and partners that helped make the beta period such a success. We are excited about what the future holds and are glad that you are coming with us.

The post How to Leave AWS: Backblaze S3 Compatible APIs & Free Cloud to Cloud Migration appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Cloud Storage for a Perfect “Good Eats” Recipe

Post Syndicated from Lora Maslenitsyna original https://www.backblaze.com/blog/cloud-storage-for-a-perfect-good-eats-recipe/

If you’ve ever wondered about the science behind some of your favorite recipes, then you may have come across Alton Brown and his cooking show, “Good Eats.” Equal parts smart and sardonic, “Good Eats” showed its viewers how to whip up an excellent dish all while teaching about the history and science of the recipes through wacky sketches.

After the popular show ended in 2012, Brown used to tease on social media about a possible comeback. In a moment of serendipity, Eric Bigman, a seasoned video editor and long-time fan of “Good Eats,” found Brown’s business card in a stationery shop in New York City. The next time that Brown posted a hint of reviving the show, Bigman took a chance and emailed Brown directly. He got in touch at just the right time—Brown would go on to hire Bigman to help update some classic episodes as “Good Eats: Reloaded” for the Cooking Channel, and to create fresh episodes as “Good Eats: The Return” for the Food Network.

We’re sharing the story of why Bigman and the team chose to transition from Amazon S3 to Backblaze B2 Cloud Storage as a key ingredient in their infrastructure for “Good Eats: The Return” and “Good Eats: Reloaded,” how this move cut the time spent on backups by a factor of 100, and how it eliminated failures worse than any overcooked egg or burnt cake.

Perfecting Recipes and Workflows

To refresh the classic episodes for “Good Eats: Reloaded,” Bigman had to blend together the old footage which had been degraded throughout the retrieval process with new widescreen, high-definition footage. Since he was adjusting to a new team and process, Bigman waited until the end of that first season to archive his data using Amazon S3 while simultaneously trying to finish post-production. He knew that uploading to S3 would take a long time—time he couldn’t spare while trying to deliver 13 episodes on a deadline.

When Bigman’s team then started production for “Good Eats: The Return,” he wanted to work toward a more fluid, integrated process. He started backing up every other week. But he was still facing the same problem as before: The data backup process to Amazon S3 was too time-consuming. What’s more, Alton Brown was worried about their backups, too. The show meant a lot to him and he didn’t want all of his team’s hard work to disappear in some overnight mishap.

From then on, Bigman tried to back up every night, but he was growing more and more frustrated with Amazon S3. It seemed to him like nine times out of 10, he walked into a pipeline failure in the morning. Babysitting the backup process took up valuable production time during his day, and he felt he couldn’t trust AWS to complete the backup overnight.

Real-Time Solutions Are Essential in Backups, and Cooking

Bigman turned to media solutions integrator CineSys-Oceana for a better backup and archiving solution. They suggested Backblaze B2. Bigman also chose Cyberduck, a libre server and cloud storage browser that integrates with Backblaze B2, to upload data. Now, when they’re in production, Bigman keeps the Cyberduck browser open and continuously uploads to B2 Cloud Storage.

Bigman originally intended to move the show’s archives from Amazon S3 to Glacier to reduce costs. But with Backblaze B2, he doesn’t have to balance access to footage against his budget.

That instant access is critical during filming. Bigman often pulls footage for continuity checks, and the real-time workflow and Backblaze’s simplicity ensure that he can always access the footage he needs.

Backblaze B2 Makes Remote Work Possible

When they started production on the second season of “Good Eats: Reloaded,” Bigman moved to working from home in New Jersey while the rest of the team continued work in Atlanta. The fully cloud-based setup ensured they could keep their post-production process going without any problems. Bigman’s assistant in Atlanta easily accesses data through the Backblaze website. If Bigman needs a file quickly, his assistant logs in to the Backblaze website and uses the web GUI to drag and drop. It’s quick, easy, and helps spread the work out.

“Good Eats”: A Fine Example of Good Storage

With seamless workflows, instant access, cost-effective storage rates, and virtually zero upload failures, Backblaze saves Bigman the most critical resource on a quick-turnaround production—time. He says, “Time saved is by a factor of 100. I just don’t have to think about it anymore. It’s done, and that means I’m done.”

Read the full case study about how Alton Brown’s post-production team unlocked a seamless, remote workflow and quick backups that let them focus on producing a beloved show.

The post Cloud Storage for a Perfect “Good Eats” Recipe appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

We’ve Got The Solutions (Engineers) For You

Post Syndicated from Ramya Ramamoorthy original https://www.backblaze.com/blog/weve-got-the-solutions-engineers-for-you/

At Backblaze, we are fortunate to serve hundreds of thousands of customers in more than 150 countries. To make this possible, we have a Solutions Engineering team whose main goal is to help existing and potential customers succeed with the technical implementation of their cloud-based workflows. Keep reading to learn how this team of four forms the technical backbone of the Sales team, and how they can help you with challenges you might experience while enabling cloud solutions at your business.

What Makes Backblaze Solutions Engineers Different

In a traditional sales environment, there is a pre- and post-sales engineer. (Want to know more about the difference between sales and solutions engineers? More on that later.) The pre-sales engineer addresses technical questions that a customer may have prior to purchasing the product, while a post-sales engineer assists them with setting up the software and its integrations.

The first thing that sets our solutions engineers apart is that our customers at Backblaze get to work with the same team of people on both sides of the transaction. Their journey starts with an introductory call with a business development representative (BDR), who tries to get a better understanding of the client’s needs and concerns. Once the BDR qualifies the customer, they transfer them to an account executive (AE), who manages the customer’s account.

AEs work closely with solutions engineers (SEs) and ask them to step in when clients have unique technical requests. SEs begin by asking the customer questions to understand the full scope of the problem and offer them the best solution. Queries range from writing scripts to partner integrations (more on these subjects later).

But once the problem is solved, and the customer is up and running, they aren’t passed off to another team, as is often the case in other operations. The SEs remain in conversation with customers to ensure they continue to get what they need from our products.

Another thing that sets our SE team apart is that they do not have quotas—their primary goal is to help the customer find the best solution to their needs, not to optimize their potential value for Backblaze. Since solving problems is their sole objective, their titles changed from “sales engineer” to “solutions engineer” in 2018.

Our Solutions Engineering team is built differently because we have a unique approach to business. The SEs work to create long-term relationships with customers because they want to help them succeed well into the future.

The Team

Our Solutions Engineering team is made up of a solutions engineer manager and three solutions engineers. While all four of them come from diverse backgrounds, they are all well-versed in every aspect related to being an SE (such as writing to our API, testing integrations, and providing workflow recommendations to users), so each of them can step into another person’s role at any given time if needed.

Troy Liljedahl, the Solutions Engineer Manager, started his career at Backblaze as a support technician. He transitioned over to the Solutions Engineering team because he wanted to pursue his passion for helping people, but in a more technical manner. He explained, “The solutions engineer job is the best of both worlds—not only am I getting to help customers succeed, but I am also able to do that by being a technical resource for them.”

Troy trains and manages the three other solutions engineers. Udara Gunawardena, one of the SEs, worked in a technical role at an Apple store during his college days. After graduating and becoming a full stack engineer, he realized that he missed interacting with users. He discovered solutions engineering and quickly made a career switch. Now, his programming background helps him with a number of tasks such as assisting customers who are trying to write to our API.

Another one of our solutions engineers joined the team around the same time as Udara. She completed her master’s in engineering management and worked in sales at a SaaS company. Now, her sales expertise helps her communicate with customers, solve their technical issues, and build strong relationships with them.

Rounding out the team is Mike Farace, who, unlike his peers, works with us remotely, stationed all the way over in Ukraine. Although he may be physically distant, his relationship with Backblaze is one of the oldest. He worked with the founders of Backblaze from 2003 to 2006 while he was an IT architect at MailFrontier. He was the first contractor for Backblaze and set up the VPN so that the five founders could remotely access the testing servers they were using in Brian Wilson’s apartment. Throughout the years, he was involved in various Backblaze projects, and when the right fit came along, he moved into a full-time role as a solutions engineer.

Finding the Best Solutions for You

Broadly speaking, solutions engineers are responsible for understanding how our product works on its own and in integrations with other products, making recommendations based on that knowledge, and helping customers implement personalized workflows. Because Backblaze B2 Cloud Storage works with hundreds of different integrations, our SEs have extensive problem solving capacity. For example, they helped AK Productions use Flexify.IO to transfer their data from Google Cloud to Backblaze B2.

The SEs also have a wide understanding of the best solutions that work for different industries—the right solution for a media and entertainment customer may be different than the best one for an IT professional.

When teams begin working with new customers, they work to educate them on the different tools that are available. If a customer has a particular integration in mind, SEs can tell them about the benefits/disadvantages of using one integration over another and help them find the solution that matches their needs. Some topics that SEs often address are lifecycle rule settings, encryption, and SSO. They also help customers think through potential issues that they may not have yet considered.

One of the more technical aspects of the SE role is helping customers write to our API. Udara explained that this is his favorite part of the job: “The really creative stuff happens when people are trying to write to our API. Some people might be trying to stream video from surveillance cameras dynamically to Backblaze B2 whereas another company will have game streamers trying to save their video captures onto our cloud storage product. There’s a gamut of ways in which people use Backblaze B2 and that makes it really exciting.”

SEs also make sure they understand the full scope of a customer’s needs. They ask questions, even if they think they know the answer. Troy explained, “Although there will be patterns between customers, we truly look at every customer as unique and every setup as unique.” The team also does a great job of speaking the language of the customers—while some clients may have a technical background, others may not. Regardless of the customer’s technical expertise, SEs can give them a high-level overview of the solution without making them feel overwhelmed.

Another important aspect of the SE’s role is their ability to work cross-functionally with the other teams at Backblaze. Apart from the Sales team, they work closely with the Engineering team, especially when helping major clients. Udara said, “We have to communicate and collaborate effectively with the Engineering team to increase performance and to solve the issues that our clients may have.”

On the other hand, when SEs are working with smaller customers, they collaborate with the Customer Support team. Udara further explained, “The Customer Support team has seen the entire gamut of use cases, especially for Backblaze Computer Backup. They’re a great resource to us because they have the answers to even the smallest issues.”

The SEs are as passionate as they are proficient at their jobs. Mike particularly loves the problem solving aspect of the job because it feels like a puzzle. He explained, “When you have a puzzle, you have to figure out what it looks like now, what it’s supposed to look like, and how to get it there. On top of that, when helping users, you’re dealing with different constraints like the customer’s budget and technology. It’s always a different puzzle to solve, which makes the job exciting.”

Testing Partner Integrations

When SEs have free time between talking to customers, they act as quality assurance for the Sales team by testing integrations—both those that could potentially work with Backblaze B2 and those that are currently being used. While doing so, they simultaneously document every step and when the integration testing is complete, they work with Customer Support to write a knowledge base article. These articles are available online for customers as a help guide so that they can learn how to use Backblaze B2 with our integration partners.

Back in October 2019, one of our customers wanted to use Backblaze B2 with EditShare Flow, which is a media asset management (MAM) all-in-one workflow solution that allows video editors to collaborate in the cloud. One of our solutions engineers, who had just joined the team, took on the responsibility of testing the integration. This partnership could potentially allow creative professionals to edit their videos on EditShare Flow and store their content on Backblaze B2.

Since the SE was new to the team and still learning our products, she worked with her teammates to learn the different terms involved, such as “MAM” and “metadata.” Once she learned more about EditShare Flow and its intricacies, she was ready to start the integration testing. She pulled the metadata from a NAS device, then pushed the data from that NAS device to Backblaze B2. She also tested the upload speed and ensured that the interface was user-friendly.

While testing the integration, she noted each step for setting it up which would later help her when writing a knowledge base article about the integration. After the integration testing was complete, she sent the customer all the documentation he needed to use Backblaze B2 with EditShare Flow. The customer was happy and if he had any further questions, he could always reach out to the SE.

When a company applies to be featured as an integration partner on our website, Mike tests the integration to ensure that the product can be paired successfully with Backblaze B2. Sometimes, companies that are already on the website make changes to their software, so Mike tests those products as well. He and the rest of the team also proactively test current integrations to verify that they are working and to identify any areas of improvement.

Demo Environment

SEs have a demo environment to show prospective customers how different features within Backblaze Computer Backup can play out. Mike maintains this demo environment, which is set up with pre-created accounts and consists of a large VMware server with over 140 machines running. He creates different demos ranging from ones that show how to add a new computer to those showing how to restore files on behalf of a user. There are even some computers in the environment that are in an error state, so that customers can see what that might look like, as well.

How Can You Become a Solutions Engineer?

If becoming a solutions engineer sounds like your ideal career, Troy offered the following advice, “The biggest skill that I look for when hiring a solutions engineer is the ability to ask the right questions. We need to identify a customer’s problem in order to offer them the right solution. Another quality I look for is the ability to talk in the same language as a customer, whether that means giving someone a high-level technical overview of the product or simplifying concepts so that it’s easier for a customer to understand.”

Troy continued, saying that although candidates need not be programmers, they should feel comfortable reading code and potentially writing scripts. However, this varies between different companies. At Backblaze, the role of an SE strikes a balance between being technical and customer service-oriented, but at other companies, the role may be more technical or more customer support-focused.

Reach Out!

If you are interested in learning more about how you could use Backblaze for your organization, click here to contact Sales and begin your journey with Backblaze! We look forward to hearing from you.

The post We’ve Got The Solutions (Engineers) For You appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Cloud University—Online and in Session Today

Post Syndicated from Janet Lafleur original https://www.backblaze.com/blog/cloud-university-online-and-in-session-today/

When the COVID-19 pandemic forced many of us to shelter in place at home, it seemed like a good opportunity to learn new things: Baking sourdough bread, sure! Tackling 2,000-piece puzzles, great. Hosting virtual birthday parties on Zoom for three dozen family members? Got it! It was fun (mostly), but it quickly became clear that what seemed like a pause was actually the beginning of a whole new way of living and doing business.

And for anyone responsible for media workflows, turning to online learning during this time quickly became a clear requirement. The NAB Show, the media and entertainment industry’s huge event every April, cancelled its live gathering in Las Vegas and relaunched it as an online experience a month later.

During normal times, the show is everyone in the industry’s opportunity to learn what the newest essential solutions for their workflows might be. In 2020, this wasn’t possible, despite the fact that learning how to use cloud-based tools to enable work is no longer a long-term strategic “hope for”—it’s a must-have.

After the cancellation of trade shows and other in-person events, our marketing and sales teams got together to decide how they could reach our NAB audience in the absence of face-to-face discussions at our booth on the NAB Show floor. The answer was Cloud University: an ongoing series of webinars highlighting powerful, cloud-enabled workflow solutions built with over a dozen partners, including ProMAX, CatDV, Cloudflare, iconik, and more. This series of free courses features live demos, tips, and best practices on topics like remote collaboration, content delivery, cloud migration, and workflow automation, with more to come each week.

To sign up for the next installment of Cloud University, you can go here. To be informed about upcoming classes, you can sign up for our media and entertainment newsletter here.

Cloud University: Welcome to Your Class Schedule

You can always head to our Cloud University page to stay up to date with the latest classes, but we’ll include digests of past classes here, as well as callouts to additional information that might be useful to you, and some key takeaways to scan so you can tell what webinar might be most effective for you.

We hope to see you at our next online class, and of course, if there’s anything you’d like us to cover at Cloud University, don’t hesitate to say so in the comments!

DEMO LABORATORY: Turnkey Cloud Backup, Sync, and Restore with QNAP NAS

Register for May 19th

Want your NAS to do more than just store content? A NAS solution with built-in data protection tools can safeguard your content, too. Join this live demo and learn how to:

  • Determine when to use backup vs. sync functions based on your workflow
  • Enable your users to restore files and folders without any IT support
  • Deploy a hybrid cloud—for the best of local and off-premise storage—in minutes
  • Select the right NAS solution for your environment

CONTENT MONETIZATION 101: Unlock More Revenue from Existing Content with GrayMeta

Register for May 21st

How can you monetize content quicker by discovering information hidden inside your media files? Attend this webinar to learn how to:

  • Use machine learning and AI to extract metadata from content scattered across SAN, NAS, and cloud storage
  • Uncover actionable insights from words, sounds, faces, and logos detected in video and other assets
  • Build time-saving workflows that identify, move, and transform content automatically
  • Make your content monetization workflow more efficient to shorten time to revenue

WORKFLOW AUTOMATION 101: Fast and Foolproof Production Workflows with CatDV

Register for May 28th

How much faster could you finish projects if your workflow adapted to new demands without costly, time-consuming development? Attend this event to learn how you can automate:

  • High-volume ingest and content processing using cloud compute and storage
  • Proxy creation, custom tagging, and metadata extraction without undue compute costs
  • Transcode and upload of files to playout servers as soon as clips are marked “approved”
  • Review and approval process, and delivery to platforms like YouTube or Vimeo

DEMO LABORATORY: Effortless Multi-Office File Exchange for Real-Time Collaboration with Synology

Register for June 9th

How much easier would your life be if your team members at different sites could share large files efficiently, with everything protected in the cloud? Attend this live demo to learn how to:

  • Set up cross-office collaboration with faster, simpler local access
  • Reduce bandwidth consumption by syncing files on multiple NAS devices just once
  • Generate shared links, files, or folders to deliver content to external clients
  • Efficiently and easily back up and sync all your NAS devices to the cloud

CONTENT DISTRIBUTION 101: Ultra-Fast Worldwide Content Delivery with Cloudflare

Start Learning Now

Are you ready to take your content worldwide? You, too, can have a simple, fast, incredibly cost-effective solution for your website or content service up and running in minutes. Attend this event to learn how to:

  • Improve end-user experience with global caching and optimization
  • Eliminate egress fees for content movement
  • Compare the cost savings of Backblaze and Cloudflare vs. S3 and CloudFront
  • Quickly build a pilot workflow at no cost

CONTENT AGILITY 101: Unlocking Immediate Content Value for Remote Teams & Partners with Imagen

Start Learning Now

Do you need to unlock value in your existing assets and keep your remote teams productive? You can watch this recording to learn:

  • How broadcaster, sports, and corporate video creative teams can manage, distribute, and deliver content remotely
  • How Imagen’s monetization and delivery features integrate with Backblaze B2 Cloud Storage
  • How a sports video customer quickly pivoted from live events using existing content
  • How Carleton College deployed a student content solution in only five days

REMOTE PRODUCTION 101: Full-Resolution Editing Environments for Distributed Teams with ProMAX

Start Learning Now

Do your video editing teams need to work remotely on short notice, without sacrificing access to content stored on the main shared system? Watch this class to learn how you can:

  • Rapidly deploy remote content servers that deliver full-resolution content to editors
  • Keep remote and main studio content in sync, even with limited remote bandwidth
  • Protect the entire content library and add flexible content delivery with cloud storage
  • Reintegrate the remote environments back into the main shared system, ready to deploy to the next site

DEMO LABORATORY: Truly Painless Backup and Sync with GoodSync

Start Learning Now

Do you need rock-solid backup and synchronization of your critical systems and high-value folders, but aren’t sure where to start? Watch this class to learn how to set up reliable backup and sync in minutes, and pick up best practices along the way. Topics covered include:

  • GoodSync overview for Mac, Windows, Linux, and NAS systems
  • One-minute setup—configuring GoodSync with Backblaze B2 Cloud Storage
  • Five different ways to initiate backup and synchronization jobs
  • Best practices with real-world examples from our customers

DEMO LABORATORY: Cloud-to-Cloud Data Migration with Flexify.IO

Start Learning Now

Want your cloud storage spending to go further, but concerned about the cost or complexity of moving between clouds? Fear not. In this class, you’ll learn how Flexify.IO and Backblaze can help you:

  • Easily and inexpensively transfer data from one cloud provider to another
  • Eliminate downtime during cloud-to-cloud migration
  • Choose the right cloud storage to meet your workflow needs

DEMO LABORATORY: A Hybrid Cloud Approach to Media Management with iconik

Start Learning Now

In their move toward cloud workflows, content owners are looking for solutions that manage content stored on-premises seamlessly with content in the cloud. Backblaze partner iconik built their smart media management system with this hybrid cloud approach in mind. With iconik, you don’t have to wait for all your content to be in the cloud before you and your creative team can take advantage of their cloud-based platform. In this class, iconik expert Mike Szumlinski details how you can:

  • Get started with cloud-based media management without migrating any content
  • Search and preview all your content from any device, anywhere
  • Add collaborators on the fly, including view-only licenses at no charge
  • Instantly ingest content stored in Backblaze B2 Cloud Storage into iconik

The post Cloud University—Online and in Session Today appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Media Stats 2019: Top Takeaways From iconik’s New Report

Post Syndicated from Skip Levens original https://www.backblaze.com/blog/media-stats-2019-top-takeaways-from-iconiks-new-report/

Recently, the team at iconik, a popular cloud-based content management and collaboration app, released a stats-driven look at how their business has grown over the past year. Given that we just released our Q1 Hard Drive Stats, we thought now was a good time to salute our partners at iconik for joining us in sharing business intelligence to help our industries grow and progress.

Their report is a fascinating look inside a disruptive business that is a major driver of growth for Backblaze B2 Cloud Storage. With that in mind, we wanted to share our top takeaways from their report and highlight key trends that will dramatically impact businesses soon—if they haven’t already.

Takeaway 1: Workflow Applications in the Cloud Unlock Accelerated Growth

iconik doubled all assets in the final quarter of 2019 alone.

Traditional workflow apps thrive in the cloud when paired with active, object storage.

We’ve had many customers adopt iconik with Backblaze B2, including Everwell, Fin Films, and Complex Networks, among several others. Each of these customers not only converted quickly to an agile, cloud-enabled workflow, but also immediately grew their use of cloud storage as the capacity it unlocked fueled new business. As such, it’s no surprise that iconik is growing fast, doubling all assets in Q4 2019 alone.

iconik is a prime example of an application that was traditionally installed on physical servers and storage in a facility. A longtime frustration with such systems is trying to ‘right-size’ the amount of server horsepower and storage to allocate to the system. Given how quickly content grows, making the wrong storage choice could be incredibly costly, or incredibly disruptive to your users as the system ‘hits the wall’ of capacity and the storage needs to be expanded frequently.

By moving the entire application to the cloud, users get the best of all worlds: a responsive and immersive application that keeps them focused on collaboration and production tasks, protection for the entire content library while keeping it immediately retrievable, and seamless growth to any size needed without any disruptions.

And these are only the benefits of moving your storage to the cloud. Almost every other application in your workflow that traditionally needs on-site servers and storage can be shifted to the cloud in the same way, bringing benefits like “pay-as-you-use-it” cost models, access from everywhere, and the ability to extend features with other cloud-delivered services like transcoding, machine learning, AI services, and more. (Our own B2 Cloud Storage service just launched S3 Compatible APIs, which opens the door to many more solutions for diverse workflows.)

Takeaway 2: Now, Every Company Is a Media Company

41% of iconik’s customer base is not from traditional media and entertainment entities.

Every company benefits by leveraging the power of collaboration and content management in their business.

Every company generates massive amounts of rich content, including graphics, video, product and sales literature, training videos, social media clips, and more. And every company fights ‘content sprawl’ as documents are duplicated, stored on different departments’ servers, and multiplied into competing versions. iconik makes it easy to keep that content organized and to ensure your entire organization has access to the up-to-the-minute version of all of it; businesses like these now account for 41% of iconik’s customers.

Even if your company is not an ad agency, or involved in film and television production, thinking and moving like a content producer and organizing around efficient and collaborative storytelling can transform your business. By doing so, you will immediately improve how your company creates, organizes, and updates the content that carries your image and story to your end users and customers. The end result is faster, more responsive, and cleaner messaging to your end users.

Takeaway 3: Solve For Video First

Video is 17.67% of all assets in iconik—but 78.36% of storage used.

Make sure your workflow tools and storage are optimized for video first to head off future scaling challenges.

Despite being a small proportion of the content in iconik’s system, video takes up the most storage. And while most customers have large libraries of HD or even SD content today, 4K video is rapidly gaining ground as it becomes the default resolution.

Video files have traditionally been the hardest element of a workflow to balance. Most shared storage systems can serve several editors working on HD streams, but only one or two 4K editors. So a system that proves that it can handle larger video files seamlessly will be able to scale as these resolution sizes continue to grow.

If you’re evaluating changes in your content production workflow, make sure that it can handle 4K video sizes and above, even if you’re predominantly managing HD content today.

Takeaway 4: Hybrid Cloud Needs to Be Transparent

47% of content stored locally, 53% in cloud storage.

Great solutions transparently bridge on-site and cloud storage, giving you the best features of each.

iconik’s report calls out where the assets it manages are stored—whether on-site or in the cloud. But the story behind the numbers reveals a deeper message.

Where assets are stored as part of a hybrid cloud solution is a bit more complex. Assets in heavy use may exist only locally, others might be stored both on local storage and in the cloud, and the least-used assets might exist only in the cloud. Then there are the many customers who choose to forgo local storage completely and work only with content stored in the cloud.

While that may sound complex, the power of iconik’s implementation is that users don’t need to know about all that complexity, and shouldn’t have to. iconik keeps a single reference to each asset no matter how many copies there are or where they are stored. Creative users simply use the solution as their interface as they move their content through production, internal approval, and handoff.

Meanwhile, admin users can easily make decisions about shifting content to the cloud, or move content back from cloud storage to local storage. This means that current projects are quickly retrieved from local storage, then when the project is finished the files can move to the cloud, freeing up space on local storage for other active projects.

For customers working with Backblaze B2, the cloud storage expands to whatever size is needed on a simple, transparent pricing model. And it is fully active; in other words, content is immediately retrievable within the iconik interface. In this way it functions as a “live” archive, as opposed to offline content archives like LTO tape libraries or cold storage clouds, which can require days for file retrieval. Using ‘active’ cloud storage like Backblaze B2 thus eases the admin’s decision-making about what to keep and where to keep it. With transparent cloud storage, they have the insight needed to effectively scale their data.

Looking into Your (Business) Future

iconik’s report confirms a number of trends we’ve been seeing as every business comes to terms with the full potential and benefits of adopting cloud-based solutions:

  • The dominance of video content.
  • The need for transparent reporting and visibility of the location of data.
  • The fact that we’re all in the media business now.
  • And that cloud storage will unlock unanticipated growth.

Given all we can glean from this first report, we can’t wait for the next one.

But don’t take our word for it: dig into their numbers and let us and iconik know what you think. Tell us how these takeaways might help your business in the coming year, or where we might have missed something. We hope to see you in the comments.

The post Media Stats 2019: Top Takeaways From iconik’s New Report appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze Hard Drive Stats Q1 2020

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/backblaze-hard-drive-stats-q1-2020/

Backblaze Drive Stats Q1 2020

As of March 31, 2020, Backblaze had 132,339 spinning hard drives in our cloud storage ecosystem spread across four data centers. Of that number, there were 2,380 boot drives and 129,959 data drives. This review looks at the Q1 2020 and lifetime hard drive failure rates of the data drive models currently in operation in our data centers and provides a handful of insights and observations along the way. In addition, near the end of the post, we review a few 2019 predictions we posed a year ago. As always, we look forward to your comments.

Hard Drive Failure Stats for Q1 2020

At the end of Q1 2020, Backblaze was using 129,959 hard drives to store customer data. For our evaluation we remove from consideration those drives that were used for testing purposes and those drive models for which we did not have at least 60 drives (see why below). This leaves us with 129,764 hard drives. The table below covers what happened in Q1 2020.

Backblaze Q1 2020 Annualized Hard Drive Failure Rates

Notes and Observations

The Annualized Failure Rate (AFR) for Q1 2020 was 1.07%. That is the lowest AFR for any quarter since we started keeping track in 2013. In addition, the Q1 2020 AFR is significantly lower than the Q1 2019 AFR which was 1.56%.

During this quarter 4 (four) drive models, from 3 (three) manufacturers, had 0 (zero) drive failures. None of the Toshiba 4TB and Seagate 16TB drives failed in Q1, but both models had fewer than 10,000 drive days during the quarter, so a small change in the number of failures swings the AFR widely. For example, if just one Seagate 16TB drive had failed, the AFR would be 7.25% for the quarter. Similarly, the Toshiba 4TB drive AFR would be 4.05% with just one failure in the quarter.

By contrast, both of the HGST drives with 0 (zero) failures in the quarter have a reasonable number of drive days, so their AFRs are less volatile. If the 8TB model had 1 (one) failure in the quarter, the AFR would only be 0.40%, and the 12TB model would have an AFR of just 0.26% with 1 (one) failure for the quarter. In both cases, the 0% AFR for the quarter is impressive.

There were 195 drives (129,959 minus 129,764) that were not included in the list above because they were used as testing drives or because we did not have at least 60 drives of a given model. For example, we have 20 Toshiba 16TB drives (model: MG08ACA16TA), 20 HGST 10TB drives (model: HUH721010ALE600), and 20 Toshiba 8TB drives (model: HDWF180). When we report quarterly, yearly, or lifetime drive statistics, models with fewer than 60 drives are not included in the calculations or graphs. We use 60 drives as a minimum because there are 60 drives in all newly deployed Storage Pods.

That said, all the data from all of the drive models, including boot drives, is included in the files which can be accessed and downloaded on our Hard Drive Test Data webpage.

Computing the Annualized Failure Rate

Throughout our reports we use the term Annualized Failure Rate (AFR). The word “annualized” here means that regardless of the period of observation (month, quarter, etc.), the failure rate is transformed into an annual measurement. For a given group of drives (i.e., model, manufacturer, etc.), we compute the AFR for a period of observation as follows:

AFR = (Drive Failures / (Drive Days / 366)) * 100

Where:

  • Drive Failures is the number of drives that failed during the period of observation.
  • Drive Days is the number of days all of the drives being observed were operational during the period of observation.
  • There are 366 days in 2020; in non-leap years we use 365.

Example: Compute the AFR for drive model BB007 for the last six months, given:

  • There were 28 drive failures during the period of observation (six months).
  • There were 6,000 hard drives at the end of the period of observation.
  • All of the drives of drive model BB007 combined for 878,400 days of operation during the period of observation (six months).

AFR = (28 / (878,400 / 366)) * 100 = (28 / 2,400) * 100 = 1.17%

For the six month period, drive model BB007 had an annualized failure rate of 1.17%.
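To make the arithmetic concrete, here’s a minimal Python sketch of the drive days method using the BB007 example above (the numbers come straight from the example; the function name is ours):

    def afr_drive_days(drive_failures, drive_days, days_in_year=366):
        """Annualized failure rate (%) using the drive days method."""
        return drive_failures / (drive_days / days_in_year) * 100

    # Drive model BB007: 28 failures over 878,400 drive days in a leap year.
    print(f"{afr_drive_days(28, 878_400):.2f}%")  # prints 1.17%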

But What About Drive Count?

Some of you may be wondering where “drive count” fits into this formula. It doesn’t, and that bothers some folks. After all, wouldn’t it be easier to calculate the AFR as:

AFR = (Drive Failures / Drive Count) * (366 / days in period of observation) * 100

Let’s go back to our example in the previous paragraph. There were 6,000 hard drives in operation at the end of the period of observation; doing the math:

AFR = (28 / 6000) * (366 / 183) * 100 = (0.00467) * (2) * 100 = 0.93%

Using the drive count method, model BB007 had a failure rate of 0.93%. The reason for the difference is that Backblaze is constantly adding and subtracting drives. New Backblaze Vaults come online every month; new features like S3 compatibility rapidly increase demand; migration replaces old, low capacity drives with new, higher capacity drives; and sometimes there are cloned and temp drives in the mix. The environment is very dynamic. The drive count on any given day over the period of observation will vary. When using the drive count method, the failure rate is based on the day the drives were counted. In this case, the last day of the period of observation. Using the drive days method, the failure rate is based on the entire period of observation.

In our example, the following table shows the drive count as we added drives over the six month period of observation:

Drive Count Table

When you total up the number of drive days, you get 878,400, but the drive count at the end of the period of observation is 6,000. The drive days formula responds to the change in the number of drives over the period of observation, while the drive count formula responds only to the count at the end.

The failure rate of 0.93% from the drive count formula is significantly lower, which is nice if you are a drive manufacturer, but not correct for how drives are actually integrated and used in our environment. That’s why Backblaze chooses to use the drive days method as it better fits the reality of how our business operates.
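To see the gap for yourself, here’s a companion Python sketch of the drive count method for the same example (again, the function name is ours); it reproduces the 0.93% figure that understates the rate the drive days method reports:

    def afr_drive_count(drive_failures, drive_count, days_observed, days_in_year=366):
        """Annualized failure rate (%) using the end-of-period drive count."""
        return (drive_failures / drive_count) * (days_in_year / days_observed) * 100

    # Same BB007 example: 6,000 drives at the end of a 183-day period.
    print(f"{afr_drive_count(28, 6_000, 183):.2f}%")  # prints 0.93%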

Predictions from Q1 2019

In the Q1 2019 Hard Drive Stats review we made a few hard drive-related predictions of things that would happen by the end of 2019. Let’s see how we did.

Prediction: Backblaze will continue to migrate out 4TB drives and will have fewer than 15,000 by the end of 2019: we currently have about 35,000.

  • Reality: 4TB drive count as of December 31, 2019: 34,908.
  • Review: We were too busy adding drives to migrate any.

Prediction: We will have installed at least twenty 20TB drives for testing purposes.

  • Reality: We have zero 20TB drives.
  • Review: We have not been offered any 20TB drives to test or otherwise.

Prediction: Backblaze will go over one exabyte (1,000 petabytes) of available cloud storage. We are currently at about 850 petabytes of available storage.

Prediction: We will have installed, for testing purposes, at least 1 HAMR based drive from Seagate and/or 1 MAMR drive from Western Digital.

  • Reality: Not a sniff of HAMR or MAMR drives.
  • Review: Hopefully by the end of 2020.

In summary, I think I’ll go back to my hard drive statistics and leave the prognosticating to soothsayers and divining rods.

Lifetime Hard Drive Stats

The table below shows the lifetime failure rates for the hard drive models we had in service as of March 31, 2020. The reporting period is from April 2013 through March 31, 2020. All of the drives listed were installed during this timeframe.

Backblaze Lifetime Hard Drive Failure Rates

The Hard Drive Stats Data

The complete data set used to create the information used in this review is available on our Hard Drive Test Data webpage. You can download and use this data for free for your own purpose. All we ask are three things: 1) You cite Backblaze as the source if you use the data, 2) You accept that you are solely responsible for how you use the data, and 3) You do not sell this data to anyone—it is free.

If you just want the summarized data used to create the tables and charts in this blog post you can download the ZIP file containing the MS Excel spreadsheet.
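If you’d rather compute failure rates directly from the raw daily snapshots, a short pandas sketch like the one below will get you started. It assumes the general layout of the published data set (one row per drive per day, with model and failure columns) and a hypothetical local folder name; adjust the path and columns to match the files you download:

    import glob
    import pandas as pd

    # Each daily snapshot has one row per drive, including columns such as
    # 'date', 'serial_number', 'model', 'capacity_bytes', and 'failure'.
    frames = [pd.read_csv(path, usecols=["model", "failure"])
              for path in glob.glob("drive_stats_q1_2020/*.csv")]
    df = pd.concat(frames, ignore_index=True)

    stats = df.groupby("model").agg(drive_days=("failure", "size"),
                                    failures=("failure", "sum"))
    stats["afr_pct"] = stats["failures"] / (stats["drive_days"] / 366) * 100

    # Rough proxy for the report's 60-drive minimum: 60 drives for a 91-day quarter.
    print(stats[stats["drive_days"] >= 60 * 91].sort_values("afr_pct"))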

Good luck and let us know if you find anything interesting.

The post Backblaze Hard Drive Stats Q1 2020 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze B2 Cloud Storage Now Has S3 Compatible APIs

Post Syndicated from Gleb Budman original https://www.backblaze.com/blog/backblaze-b2-s3-compatible-api/

Backblaze S3 Compatible APIs

In 2015, we kept hearing the same request. It went something like: “I love your computer backup service, but I also need a place to store data for other reasons—backing up servers, hosting files, and building applications. Can you give me direct access to your storage?” We listened, and we built Backblaze B2 Cloud Storage.

To build on my own words from the time, “It. Was. HUGE.” B2 Cloud Storage fundamentally changed the trajectory of our company. Over just the past two years, we’ve added more customer data than we did in our entire first decade. We now have customers in over 160 countries and they’ve entrusted us with more than an exabyte of data.

Brands like American Public Television, Patagonia, and Verizon’s Complex Networks—alongside more than 100,000 other customers—use Backblaze B2 to back up and archive their data; host their files online; offload their NAS, SAN, and other storage systems; replace their aging tape infrastructure; and serve as the store for the applications they’ve built. Many of them have told us how the low cost enabled them to do what they otherwise couldn’t have, and how the simplicity let them do it quickly.

I’m proud that we’ve been able to help customers by offering the most affordable download prices in the industry, making it easy to migrate from the cloud oligarchy, and offering unprecedented choice in how our customers use their data.

Today, I’m thrilled to announce the public beta launch of our most requested feature: S3 Compatible APIs for B2 Cloud Storage.

If you have a B2 Cloud Storage account, you can start immediately.

If you need one, create an account today.
It’s free and easy, and you’ll get 10GB of free storage as well.

If you’ve wanted to use B2 Cloud Storage but your application or device didn’t support it, or you’ve wanted to move to cloud storage in general but couldn’t afford it, you can now start instantly with the new S3 Compatible APIs for Backblaze B2.

Welcome to Backblaze B2 via our S3 Compatible APIs

In practical terms, this launch means that you can now instantly plug into the Backblaze B2 Cloud Storage service by doing little more than pointing your data to a new destination. There’s no need to write new code, no change in workflow, and no downtime. Our S3 Compatible APIs are ready to use, and our solutions engineering team is ready to help you get started.
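As a rough illustration of what “pointing your data to a new destination” looks like, here’s a minimal Python sketch using boto3. The endpoint URL, bucket name, and credentials below are placeholders; you’d substitute the S3 endpoint shown for your bucket in the B2 web console and one of your own application keys:

    import boto3

    # The same S3 client you'd use with AWS, aimed at B2's S3 Compatible API.
    # Endpoint, bucket, and credentials are placeholders, not real values.
    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.us-west-002.backblazeb2.com",  # your bucket's region
        aws_access_key_id="<your-key-id>",
        aws_secret_access_key="<your-application-key>",
    )

    # Upload a file, then list what's in the bucket.
    s3.upload_file("backup.tar.gz", "my-b2-bucket", "archives/backup.tar.gz")
    for obj in s3.list_objects_v2(Bucket="my-b2-bucket").get("Contents", []):
        print(obj["Key"], obj["Size"])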

And our solutions engineering team isn’t alone. We have a number of launch partners leveraging our S3 Compatible APIs so you can use B2 Cloud Storage. These partners include IBM Aspera, Quantum, and Veeam.

Official Launch Partners

Cinnafilm, IBM Aspera, Igneous, LucidLink, Marquis, Masstech, Primestream, Quantum, Scale Logic, Storage Made Easy, Studio Network Solutions, Veeam, Venera Technologies, Vidispine, Xendata. These companies join a list of more than 100 other software, hardware, and cloud companies already offering Backblaze B2 to support their customers’ cloud storage needs.

Challenging the Cloud Oligarchy

For too long, cloud storage has been an oligarchy, leaving customers with choices that all look the same: expensive, opaque, complicated. We pride ourselves on being simple, reliable, and affordable. As Inc. magazine put it recently, B2 Cloud Storage is “Everything Amazon’s AWS Isn’t.”

With B2 Cloud Storage S3 Compatible APIs, we’re making the choice for something different a no-brainer. While there are all new ways to use Backblaze B2, our pricing is unchanged. We remain the affordability leader, and our prices have nothing to hide. With Backblaze, customers don’t need to choose what data they actually use—all data is instantly available.

While we’re opening this new pathway to B2 Cloud Storage, it doesn’t mean we’re changing anything about the service you already love. Our Native APIs are still performant, elegant, and supported. We’re just paving the way for more of you to see what it feels like when a cloud storage solution works for you.

We hope you like it.

The post Backblaze B2 Cloud Storage Now Has S3 Compatible APIs appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

How Hackers Can Help: Backblaze and the Ethical Hackers on HackerOne

Post Syndicated from Ramya Ramamoorthy original https://www.backblaze.com/blog/how-hackers-can-help-backblaze-and-the-ethical-hackers-on-hackerone/

A white hat hacker.

Backblaze is responsible for a huge amount of customer, company, and employee data—in fact, we recently announced that we have more than an exabyte under management. With a huge amount of data, however, comes a huge amount of responsibility to protect that data.

This is why our security team works tirelessly to protect our systems. One of the ways in which they safeguard the data we’ve been entrusted with is by working alongside hackers. But these aren’t just any hackers…

Sometimes Hackers Can Be Good

Although it may sound odd at first, hackers have helped us discover and resolve a number of issues in our systems. This is thanks to Backblaze’s collaboration with HackerOne, a platform that crowdsources white hat hackers to test products and alert companies of security issues when they find them. In return, companies award the hacker a bounty. Bounty amounts are carefully outlined based on severity ratings, so when a vulnerability is discovered, the payout follows that structure.

Backblaze + HackerOne

Tim Nufire, Chief Security and Cloud Officer at Backblaze, created the company’s HackerOne program back in March 2015. One of the best things a company of our size can do is incentivize hackers around the world to look at our site to help ensure a secure environment. We can’t afford to onboard several hundred hackers as full-time employees, so by running a program like this, we are leveraging the talent of a very broad, diverse group of researchers, all of whom believe in security and are willing to test our systems in an effort to earn a bounty.

How HackerOne Works

When a hacker finds an issue on our site, backup clients, or any other public endpoint, they file a ticket with our security team. The team reviews the ticket and once they have triaged and confirmed that it is a real issue, they pay the hacker a bounty which depends on the severity of the find. The team then files an internal ticket with the engineering team. Once the engineers fix the issue, the security team will check to make sure that the problem was resolved.

To be extra cautious, the team gets a second set of eyes on the issue by asking the hacker to ensure that the vulnerability no longer exists. Once they agree everything is correct and give us the green light, the issue is closed. If you’re interested in learning even more about this process, check out Backblaze’s public bounty page, which offers even more information on our response efficiency, program statistics, policies, and bounty structure.

Moving from Private to Public

Initially, our program was private, which meant that we only worked with hackers we invited into our program. But in April 2019, our team took the program public. This meant that anyone could join our HackerOne program, find security issues, and earn a bounty.

The reasoning behind our decision to make the program public was simple: the more people we encourage to hack our site, the faster we can find and fix problems. And that’s exactly what happened at Backblaze. Thanks to the good guys on HackerOne, we are one step ahead of the bad guys.

Some Issues We Resolve, Some We Contest

Let’s take a look at some examples as we work through two ‘classes’ of bugs typically reported by hackers.

One class of bugs that hackers find is Cross-Site Request Forgery (CSRF) attacks. CSRF attacks attempt to trick users into making unwanted changes, such as disabling security features on a website they’re logged into. CSRF attacks specifically target a user’s settings, not their data, since the attacker has no way to see the response to the malicious request. To resolve issues like this, we make changes like adding the SameSite attribute to our cookies, among other techniques. Problem solved!
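For a sense of what that kind of fix looks like, here’s a minimal Python sketch using Flask (chosen purely for illustration; the attribute itself is set on the session cookie, whatever the framework):

    from flask import Flask, make_response

    app = Flask(__name__)

    @app.route("/login")
    def login():
        resp = make_response("logged in")
        # SameSite stops the browser from attaching this cookie to
        # cross-site requests, which is exactly what CSRF relies on.
        resp.set_cookie("session", "<token>", samesite="Strict",
                        secure=True, httponly=True)
        return resp

    if __name__ == "__main__":
        app.run()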

But sometimes making changes on our end isn’t the right response. Another class of “vulnerabilities” that hackers are quick to point out is “Information Disclosure” related to software versions or other system components. However, Backblaze does not see this as a vulnerability. “Security through obscurity” is not good security, so we intentionally make information like this visible and encourage hackers to use it to find holes in our defenses. It’s our belief that, by being as open with our community as possible, we’re more secure than we would be by hiding details about how our systems are configured.

We call attention to these two examples specifically because they underline one of the most interesting aspects of working with HackerOne: deciding when something is truly an issue that needs fixing, and when it is not.

Help Us Decide!

HackerOne has proven to be a great resource to scale our security efforts, but we’re missing one thing: a capable new team member to lead this program at Backblaze! Yes, we are hiring an Application Security Manager.

Among other interesting tasks, whoever fills the role will be responsible for triaging and prioritizing the issues identified through the HackerOne platform. This is a new role for us which was identified as a must-have by our security team because Backblaze is growing quickly.

Security has been our top priority since day one, but as our company scales and the amount of data that we store increases, we need someone who can help us navigate that growth. As Tim Nufire points out, “Growth makes us a bigger target, so we need a stronger defense.”

The Application Security Manager will not only have the opportunity to apply their security knowledge, but they will also have the unique chance to shape a security team at a growing, successful, and sustainable company. We think that sounds pretty exciting.

Who We Are Looking For

If you are someone who has years of experience in the security field, but hasn’t had the chance to take charge and lead a team, then this is your opportunity! We are looking for someone who is an expert in application layer security and is willing to teach us what we don’t know.

We need someone who is not afraid to roll up their sleeves and get to work even when there is no clear direction given. There are a couple of technical skills we hope the new hire will have (like experience with Burp Suite), but the most important qualities are being hands-on and having organizational management skills. This is because the Application Security Manager will formulate strategy and build a roadmap for the team moving forward. If that excites you as much as it excites us, feel free to send your resume to jobscontact@backblaze.com or apply online here.

And of course, we are always looking for more white hat hackers to test our site. If you can’t join us in the office, then join us on HackerOne to help discover and resolve potential vulnerabilities. We look forward to hearing from you!

The post How Hackers Can Help: Backblaze and the Ethical Hackers on HackerOne appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.