
Move Even Your Largest Archives to B2 with Fireball and Archiware P5

Post Syndicated from Skip Levens original https://www.backblaze.com/blog/archiware-p5-cloud-backup/

Archiware P5 and Fireball

Backblaze B2’s reliability, scalability, and affordable, “pay only for what you use” pricing means that it’s an increasingly popular storage option for all phases of content production, and that’s especially true for media archiving.

By shifting storage to B2, you can phase out hard-to-manage and expensive local backup storage and clear space on your primary storage. Having all of your content in a single place — and instantly available — can transform your production and keep you focused on the creative process.

Fireball Rapid Ingest to Speed Your First Migration to Backblaze B2

Once you sign up for Backblaze B2, one tool that can speed an initial content migration tremendously is Backblaze’s Fireball rapid ingest service. As part of the service, Backblaze ships you a 70TB storage system. You then copy all the content you want in B2 to the Fireball system, all at local network speeds. Once you ship the system back to Backblaze, your content is quickly moved into your B2 account, a process far faster than uploading those files over the internet.

Setting Up Your Media Archive

Since manually moving files to archive and backing up project folders can be very time-consuming, many customers choose software like Archiware P5 to manage this automatically. In P5’s interface you can choose files to add to archive libraries, restore individual files from B2 to your local storage, and even browse all of your archived content on B2 with thumbnail previews.

However, many media and entertainment customers have terabytes and terabytes of content in “archive” — that is, project files and content not needed for a current production, but necessary to keep nearby, ready to pull into a new production.

They’d love to get that content into their Backblaze B2 account and then manage it with an archive, sync, and backup solution like Archiware P5. But the challenge facing too many is how to get all of these terabytes up to B2 through the office’s existing bandwidth. Once the large initial archive is loaded, the incrementals aren’t a problem, but getting years of backlog pushed up efficiently is.

For anyone facing that challenge, we’re pleased to announce the Archiware P5 Fireball Integration. Our joint solution provides any customer with an easy way to get all of their archives loaded into their B2 account without having to worry about bandwidth bottlenecks.

Archiware P5 Fireball Integration

A backup and archive manager like Archiware P5 is a great way to get your workflow under control and automated while ensuring that your content is safely and reliably stored. By moving your archives offsite, you get the highest levels of data protection while keeping your data immediately available for use anytime, anywhere.

With the newest release, Archiware P5 can archive directly to Fireball at fast, local network speeds. Then, once your Fireball content has been uploaded to your Backblaze account, a few clicks are all that is needed to point Archiware at your Backblaze account as the new location of your archive.

Finally, you can clear out those closets of hard drives and tape sets!

Archiware P5 to B2 workflow

Archiware P5 can now archive directly to Fireball at local network speeds; once the content is uploaded, your archives are linked to their new locations in your B2 account. With a few clicks you can get your entire archive uploaded to the B2 cloud without suffering any downtime or bandwidth issues.

For detailed information about configuring Archiware to archive directly to Fireball:

For more information about Backblaze B2 Fireball Rapid Ingest Service:

Archiware on Synology and QNAP NAS Devices

Archiware, NAS and B2

Archiware P5 can also now run directly on several Synology, QNAP, and G-Tech NAS systems to archive and move content to your Backblaze B2 account over the internet.

With its most recent releases, Archiware supports several NAS devices from QNAP, Synology, and G-Tech as P5 clients or servers.

The P5 software is installed as an application from the NAS vendor’s app store and runs directly on the NAS system itself without having to install additional hardware.

This means that all of your offices or departments with these NAS systems can now fully participate in your sync, archive, and backup workflows, and each of them can archive off to your central Backblaze B2 account.

For more information:

Archiware plus Backblaze: A Complete Front-to-Back Media Solution

Archiware P5, Fireball, and Backblaze B2 are all important parts of a great backup, archive, and sync plan. By getting all of your content into archive and B2, you’ll know that it’s highly protected, instantly available for new production workflows, and also readily discoverable through thumbnail and search capability.

With the latest version of P5 you not only have your entire production and backup workflows managed; with Fireball, you can move even your largest and hardest-to-move archives safely and quickly into B2 as well!

For more information about the P5 Software Suite: Archiware P5 Software Suite

And to order a Fireball as part of our Rapid Ingest Service, start here: Backblaze B2 Fireball


You might also be interested in reading our recent guest post written by Marc N. Batschkus of Archiware about how to save time, money, and gain peace of mind with an archive solution that combines Backblaze B2 and Archiware P5.

Creating a Media Archive Solution with Backblaze B2 and Archiware P5

 


iconik and Backblaze — The Cloud Production Solution You’ve Always Wanted

Post Syndicated from Skip Levens original https://www.backblaze.com/blog/iconik-and-backblaze-cloud-production-solution/

Cantemo iconik Plus Backblaze B2 for Media Cloud Production

Many of our customers are archiving media assets in Backblaze B2: long-running television productions, media distributors, AR/VR video creators, corporate video producers, houses of worship, and many more.

They are emptying their closets of USB hard drives, clearing off RAID arrays, and migrating LTO tapes to cloud storage. B2 has proven to be the least expensive storage for their media archives, while keeping those archives online and accessible. Gone are the days of Post-its, clipboards, and cryptic drive labels determining whether old video footage can be found or not. Migrating archives from one form of storage to another will no longer suck up weeks and weeks of time.

So now that their archives are limitless, secure, always active, and available, the next step is making them actionable.

Our customers have been asking us — how can I search across all of my archives? Can I preview clips before I download the hi-res master, or share portions of the archive with collaborators around the world? Why not use the latest AI tools to intelligently tag my footage with metadata?

To meet all of those needs and more, we are excited to announce that Cantemo’s iconik cloud media management service now officially supports Backblaze B2.

iconik — A Media Management Service

iconik is an affordable and simple-to-use media management service that can read a Backblaze B2 bucket full of media and make it actionable. Your media assets are findable, sortable with full previews, and ready to pull into a new project or even right into your editor, such as Adobe Premiere, instantly.

Cantemo iconik user interface

iconik — Cantemo’s new media management service with AI features to find, sort, and even suggest assets for your project across your entire library

As a true media management service, iconik is pay-as-you-go, transparently priced per user, per month. There are no minimum purchases, no servers to buy, and no large licensing fees to pay. To use iconik, all your users need is a web browser.

iconik Pricing

To get an idea of what “priced per user” might look like, most organizations will need at least one administrative user ($89/month); standard users ($49/month), who can organize content, create workflows, and ingest new media; and browse-only users ($19/month), who can search and download what they need. There’s also a “share-only” level with no monthly charge that lets you incorporate customer and reviewer comments. This should accommodate teams of all kinds and sizes.
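To make the per-user arithmetic concrete, here is a minimal sketch in Python. The tier prices are the figures quoted above; the team composition is purely hypothetical.

```python
# Illustrative only: monthly iconik license cost for a hypothetical team,
# using the per-user prices quoted above.
TIER_PRICE = {"admin": 89, "standard": 49, "browse": 19, "share": 0}  # USD/user/month

team = {"admin": 1, "standard": 3, "browse": 5, "share": 10}  # hypothetical headcount

monthly_licenses = sum(TIER_PRICE[tier] * count for tier, count in team.items())
print(f"License cost: ${monthly_licenses}/month")  # 89 + 3*49 + 5*19 = $331/month
```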

Best of all, iconik is intelligent about how it uses storage, and while iconik charges small consumption fees for proxy storage, bandwidth, etc., they have found that for customers that bring media from Backblaze B2 buckets, consumption charges should be less than 5% of the monthly bill for user licenses.

As part of their launch promotion, if you get started in October, Cantemo will give Backblaze customers a $300 getting started credit!

You can sign up and get started here using offer code BBB22018.

Everwell’s Experience with iconik and Backblaze

One of the first customers to adopt iconik with Backblaze is Everwell, a video production company. Everwell creates a constant stream of videos for medical professionals to show in their waiting rooms. Rather than continuously buying upgrades for an in-house asset management system and local storage, Everwell uses iconik to shift their production to the cloud for all of their users. The new solution allows Everwell to manage their growing library of videos as new content constantly comes online, and to kick off longer-form productions with full access to all the assets they need across a fast-moving team that can be anywhere their production takes them.

collage of Everwell video images

Everwell is a fast-growing medical content developer for healthcare providers

To speed up their deployment of iconik, Everwell started with Backblaze’s data ingestion service, Fireball. Everwell copied their content to Fireball, and once back in the Backblaze data center, the data from Fireball was quickly added directly to Everwell’s B2 buckets. iconik could immediately start ingesting the content in place and make it available to every user.

Learn more about Backblaze B2 Fireball

With iconik and Backblaze, Everwell dramatically simplified their workflow as well, collapsing several critical workflow steps into one. For example, by uploading source files to Backblaze B2 as soon as they’re shot, Everwell not only reduces the need to stage local production storage at every site, they ingest and archive in a single step. Every user can immediately start work on their part of the project.

“The ‘everyone in the same production building’ model didn’t work for us any longer as our content service grew, with more editors and producers checking in content from remote locations that our entire team needed to use immediately. With iconik and Backblaze, we have what feels like the modern cloud-delivered production tool we’ve always wanted.”

— Loren Goldfarb, COO, Everwell

See iconik in Action at NAB NYC October 17-18

NAB Show New York - Media In Action October 17-18 2018

Backblaze is at NAB New York. Meet us there!

We’re excited to bring you several chances to see iconik and Backblaze working together.

The first is the NAB New York show, held October 17-18 at the Javits Center. iconik will be shown by Professional Video Technology in Booth N1432, directly behind Backblaze, Booth N1333.

Have you signed up for NAB NY yet? You can still receive a free exhibits pass by entering Backblaze’s Guest Code NY8842.

And be sure to schedule a meeting with the Backblaze team at NAB by signing up on our calendar.

Attend the iconik and B2 Webinar on November 20

Soon after NAB NY, Backblaze and iconik will host a webinar demonstrating the solution, titled “3 Steps to Making Your Cloud Media Archive ‘Active’ With iconik and Backblaze B2.” The webinar will be presented on November 20 and will be available on demand afterward. Be sure to sign up for that too!

3 Steps Demo with: iconik and Backblaze B2 Cloud Storage

Sign up for the iconik/B2 Webinar

Don’t Miss the iconik October Launch Promotion

The demand for creative content is growing exponentially, putting more demands on your creative team. With iconik and B2, you can make all of your media instantly accessible within your workflows while adopting an infinitely scalable, pay-only-for-what-you-use storage solution.

To take advantage of the iconik October launch promotion and receive $300 free credit with iconik, sign up using the BBB22018 code.


Backblaze and Cloudflare Partner to Provide Free Data Transfer

Post Syndicated from Gleb Budman original https://www.backblaze.com/blog/backblaze-and-cloudflare-partner-to-provide-free-data-transfer/

 Backblaze B2 Free Data Transfer to Cloudflare

Today we are announcing that, effective immediately, Backblaze B2 customers can download data stored in B2 to Cloudflare for zero transfer fees. This happens automatically once Cloudflare is configured to distribute your B2 files. This means that Backblaze B2 can now be used as an origin store for the Cloudflare CDN and edge network, providing customers enhanced performance and access to their content stored on B2. The result is that customers can save up to 75% versus Amazon S3 when storing their content in the cloud and delivering it worldwide.

The zero B2 transfer fees are available to all Cloudflare customers using any plan. Cloudflare customers can also use paid add-ons such as Argo and Workers to enhance the routing and security of the B2 files being delivered over the Cloudflare CDN. To implement this service, Backblaze and Cloudflare have directly connected, thereby allowing near-instant data transfers from B2 to Cloudflare.

Backblaze has prepared a guide on “Using Backblaze B2 storage with Cloudflare.” This guide provides step-by-step instructions on how to set up Backblaze B2 with Cloudflare to take advantage of this program.

The Bandwidth Alliance

The driving force behind the free transfer program is the Bandwidth Alliance. Backblaze and Cloudflare are two of the founding members of this group of forward-thinking cloud and networking companies that are committed to providing the best and most cost-efficient experience for our mutual customers. Additional founding members of the Bandwidth Alliance include Automattic (WordPress), DigitalOcean, IBM Cloud, Microsoft Azure, Packet, and other leading cloud and networking companies.

How Companies Can Leverage the Bandwidth Alliance

Below are examples of how Bandwidth Alliance partners can work together to save customers money on data transfer fees.

Hosting Website Assets

Whether you are a professional webmaster or just run a few homegrown sites, you’ve lived the frustration of having a slow website. Over the past few years these challenges have become more acute as video and other types of rich media have become core to the website experience. This new content has also translated to higher storage and bandwidth costs. That’s where Backblaze B2 and Cloudflare come in.

diagram of zero cost data transfer from Backblaze B2 to Cloudflare CDN

Customers can store their videos, photos, and other assets in Backblaze B2’s pay-as-you-go cloud storage and serve the site with Cloudflare’s CDN and edge services. The result is an amazingly affordable cloud-based solution that dramatically improves website performance and reliability. And customers pay each service for what it does best.

“I am extremely happy with my experience serving html/css/js and over 17 million images from B2 via Cloudflare Workers. Page load time has been great and costs are minimal.”

— Jacob Hands, Lead Developer, FactorioMaps.com

Media Content Distribution

The ability to download content from B2 cloud storage to the Cloudflare CDN for zero transfer cost is just the beginning. A company needing to distribute media can now store original assets in Backblaze B2, send them to a compute service to transcode and transmux them, and forward the finished assets to be served up by Cloudflare. Backblaze and Packet previously announced zero transfer fees between Backblaze B2 storage and Packet compute services. This enables customers to store data in B2 at 1/4th the price of competitive offerings and then process that data for transcoding, AI, data analysis, and more inside of Packet without worrying about data transfer fees. Packet is also a member of the Bandwidth Alliance and will deliver content to Cloudflare for zero transfer fees as well.

diagram of zero cost data transfer flow from Backblaze B2 to Packet Compute to Cloudflare CDN

Process Now, Distribute Later

A variation of the example above is for a company to store the originals in B2, transcode and transmux the files in Packet, then put those versions back into B2, and finally serve them up via Cloudflare. All of this is done with zero transfer fees between Backblaze, Packet, and Cloudflare. The result is that all originals and transmuxed versions are stored at 1/4th the price of other storage and served up efficiently via Cloudflare.

diagram of data transfer flow between B2 to Packet back to B2 to Cloudflare

In all cases you pay only for the services you use, not for the cost of moving data between those services. This results in a predictable and affordable cost for a given project using industry-leading, best-of-breed services.

Moving Forward

The members of the Bandwidth Alliance are committed to enabling the best and most cost-efficient cloud services when it comes to working with data stored in the cloud. Backblaze has committed to a transfer fee of $0 to move content from B2 to either Cloudflare or Packet. We think that’s a great step in the right direction. And if you are a cloud provider, let us know if you’d be interested in taking a step like this one with Backblaze.


Backblaze B2 API Version 2 Beta is Now Open

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/backblaze-b2-api-version-2-beta-is-now-open/

cloud storage workflow image

Since B2 cloud storage was introduced nearly 3 years ago, we’ve been adding enhancements and new functionality to the B2 API, including capabilities like CORS support and lifecycle rules. Today, we’d like to introduce the beta of version 2 of the B2 API, which formalizes rules on application keys, provides a consistent structure for all API calls returning information about files, and cleans up outdated request parameters and returned data. All version 1 B2 API calls will continue to work as is, so no changes are required to existing integrations and applications.

The API Versions section of the B2 documentation on the Backblaze website provides the details on how the V1 and V2 APIs differ, but in the meantime here’s an overview of the what, why, and how of the V2 API.

What Has Changed Between the B2 Cloud Storage Version 1 and Version 2 APIs?

The most obvious difference between a V1 and V2 API call is the version number in the URL. For example:

https://apiNNN.backblazeb2.com/b2api/v1/b2_create_bucket

https://apiNNN.backblazeb2.com/b2api/v2/b2_create_bucket

In addition, a V2 API call may have different required request parameters and/or required response data. For example, the V2 version of b2_hide_file always returns accountId and bucketId, while V1 returns only accountId.
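To illustrate the versioned URL scheme, here is a minimal sketch in Python. It is not an official client: the endpoint and token would come from a prior b2_authorize_account call, and the bucket ID and file name are placeholders.

```python
import requests

# Placeholders; in practice these come from b2_authorize_account.
API_URL = "https://api001.backblazeb2.com"
AUTH_TOKEN = "<authorizationToken>"

def b2_call(operation: str, payload: dict, version: str = "v2") -> dict:
    """POST a B2 API operation; the version segment is the only URL change."""
    resp = requests.post(f"{API_URL}/b2api/{version}/{operation}",
                         headers={"Authorization": AUTH_TOKEN}, json=payload)
    resp.raise_for_status()
    return resp.json()

# Same operation under either version; per the note above, the V2 response
# of b2_hide_file includes bucketId, which the V1 response omits.
hidden = b2_call("b2_hide_file",
                 {"bucketId": "<bucketId>", "fileName": "video.mov"})
```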

The documentation for each API call will show whether there are any differences between API versions for a given API call.

No Change is Required For V1 Applications

With the introduction of V2 of the B2 API there will be V1 and V2 versions for every B2 API call. All applications using V1 API calls will continue to work with no change in behavior. In some cases, a given V2 API call will be different from its companion V1 API call as noted in the B2 API documentation. For the remaining API calls a given V1 API call and its companion V2 call will be the same, have identical parameters, return the same data, and have the same errors. This provides a B2 developer the flexibility to choose how to upgrade to the V2 API.

Obviously, if you want to use the functionality associated with a V2 API version, then you must use the V2 API call and update your code accordingly.

One last thing: beginning today, if we create a new B2 API call it will be created in the current API version (V2) and most likely will not be created in V1.

Standardizing B2 File Related API Calls

As requested by many B2 developers, the V2 API now uses a consistent structure for all API calls returning information about files. To enable this, some V2 API calls return additional fields.

Restricted Application Keys

In August we introduced the ability to create restricted application keys using the B2 API. This capability allows an account owner to restrict who, how, and when the data in a given bucket can be accessed. It also changed the functionality of multiple B2 API calls, such that a user could create a restricted application key that could break a 3rd party integration with Backblaze B2. We subsequently updated the affected V1 API calls so they could continue to work with existing 3rd party integrations.

The V2 API fully implements the expected behavior when it comes to working with restricted application keys. The V1 API calls continue to operate as before.

Here is an example of how the V1 API and the V2 API will act differently as it relates to restricted application keys.

Set-up

  • The B2 account owner has created 2 public buckets, “Backblaze_123” and “Backblaze_456”
  • The account owner creates a restricted application key that allows the user to read the files in the bucket named “Backblaze_456”
  • The account owner uses the restricted application key in an application that uses the b2_list_buckets API call

In Version 1 of the B2 API

  • Action: The account owner uses the restricted application key (for bucket Backblaze_456) to access/list all the buckets they own (2 public buckets).
  • Result: The results returned are just for Backblaze_456 as the restricted application key is just for that bucket. Data about other buckets is not returned.

While this result may seem appropriate, the data returned did not match the question asked, i.e. list all buckets. V2 of the API ensures the data returned is responsive to the question asked.

In Version 2 of the B2 API

  • Action: The account owner uses the restricted application key (for bucket Backblaze_456) to access/list all the buckets they own (2 public buckets).
  • Result: A “401 unauthorized” error is returned, as the request for access to “all” buckets does not match the restricted application key, e.g. bucket Backblaze_456. To achieve the desired result, the account owner can specify the name of the bucket being requested in the API call so that it matches the restricted application key, as sketched below.
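Here is a minimal sketch of that difference in Python, with hypothetical credentials. It assumes, per the note above, that V2’s b2_list_buckets accepts a bucket name to scope the request to what the restricted key allows.

```python
import requests

# Placeholders: obtained from b2_authorize_account with the restricted key.
API_URL, TOKEN, ACCOUNT_ID = "https://api001.backblazeb2.com", "<token>", "<accountId>"

def list_buckets(version: str, **extra) -> requests.Response:
    return requests.post(f"{API_URL}/b2api/{version}/b2_list_buckets",
                         headers={"Authorization": TOKEN},
                         json={"accountId": ACCOUNT_ID, **extra})

# V1: succeeds, but silently returns only Backblaze_456.
v1 = list_buckets("v1")

# V2: the same unscoped request is rejected outright...
v2 = list_buckets("v2")
assert v2.status_code == 401  # unauthorized: the key cannot see "all" buckets

# ...so name the one bucket the key is allowed to access.
v2_scoped = list_buckets("v2", bucketName="Backblaze_456")
```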

Cleaning up the API

There are a handful of API calls in V2 where we dropped fields that were deprecated in V1 of the B2 API but still returned. So in V2 (a small compatibility accessor is sketched after this list):

  • b2_authorize_account: The response no longer contains minimumPartSize. Use partSize and absoluteMinimumPartSize instead.
  • b2_list_file_names: The response no longer contains size. Use contentLength instead.
  • b2_list_file_versions: The response no longer contains size. Use contentLength instead.
  • b2_hide_file: The response no longer contains size. Use contentLength instead.
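If an integration must handle both versions during migration, a small accessor can smooth over the renamed field; a sketch, not an official helper:

```python
def content_length(file_info: dict) -> int:
    """Size of a file from a B2 file-info response.

    V2 responses carry contentLength; older V1 responses carried the
    now-dropped size field. Prefer the V2 name, fall back for V1 data.
    """
    if "contentLength" in file_info:
        return file_info["contentLength"]
    return file_info["size"]  # V1 only
```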

Support for Version 1 of the B2 API

As noted previously, V1 of the B2 API continues to function. There are no plans to stop supporting V1. If at some point in the future we do deprecate the V1 API, we will provide advance notice of at least one year before doing so.

The B2 Java SDK and the B2 Command Line Tool

Neither the B2 Java SDK nor the B2 Command Line Tool currently supports Version 2 of the B2 API. Both are being updated and will support the V2 API when it exits beta and goes GA. Both of these tools, and more, can be found in the Backblaze GitHub repository.

More About the Version 2 Beta Program

We introduced Version 2 of the B2 API as beta so that developers can provide us feedback before V2 goes into production. With every B2 integration being coded differently, we want to hear from as many developers as possible. Give the V2 API a try and if you have any comments you can email our B2 beta team at b2beta@backblaze.com or contact Backblaze B2 support. Thanks.


Moving Tape Content to Backblaze Fireball with Canister

Post Syndicated from Skip Levens original https://www.backblaze.com/blog/moving-tape-content-to-cloud-storage/


Canister for Fireball: LTO tape to Backblaze B2 migration made 'drag and drop' easy
If you shoot video on the run and wrangle video from multiple sources, you know that reliably offloading files from your camera carts, storage cards, or pluggable SSDs can be a logistical challenge. All of your source files need to be copied over, verified, and backed up before you can begin the rest of your post-production work. It’s arguably the most critical step in your post-production workflow.

Knowing how critical this step is, videographers and data wranglers alike have long relied on an app for Mac and Windows called Hedge to take charge of their file copy and verification needs.


Hedge source and target progress

Hedge for Mac and Windows — drag and drop source file copy and verify tool

With an intuitive drag and drop interface, Hedge makes it simple to select your cards, disks, or other sources, identify your destination drives, then copy and verify using a custom “Fast Lane” engine to speed transfers dramatically. You can log when copies were completed, and even back up to multiple destinations in the same action, including your local SAN, NAS, or Backblaze Fireball, then on to your Backblaze B2 cloud storage.

But How Do You “Data-Wrangle” Tape Content to the Cloud?

But what if you have content, backup sets, or massive media archives on LTO tape?

You may find yourself in one of these scenarios:

  • You may have “inherited” an older LTO tape system that is having a hard time keeping up with your daily workflow, and you aren’t ready to sign up for more capital expense and support contracts.
  • You may have valuable content “stuck” on tape that you can’t easily access, and you want it in the cloud for content monetization workflows that would overwhelm your tape system.
  • Your existing tape based workflow is working fine for now, but you want to get all of that content into the cloud quickly to get ready for future growth and new customers with a solution similar to Hedge.

While many people decide to move tape workflows to cloud for simple economic reasons, having all of that content securely stored in the cloud means that the individual files and entire folders can be instantly pulled into workflows and directly shared from Backblaze B2 with no need for copying, moving, restoring, or waiting.

For more information about how Backblaze B2 can replace LTO solutions, including an LTO calculator:  Backblaze LTO Replacement Calculator

Whichever scenario fits your need, getting tape content into the cloud involves moving a lot of content at once, and in a perfect world it would be as easy as dragging and dropping that content from tape into Backblaze B2!

Meet Canister for Fireball

To meet this exact need, the team that developed Hedge has created an “LTO tape content to Fireball” solution called Canister for Fireball.

Fireball is Backblaze’s solution to help you quickly get massive amounts of data into Backblaze B2 Cloud Storage. When you sign up for the service, Backblaze sends you a 70TB Fireball that is yours to use for 30 days. Simply attach it to your local network and copy content over to the device at the speed of your local network. You’re free to fill up and send in your Fireball device as many times as needed. When Backblaze receives your Fireball with your files, all of the content is ingested directly into Backblaze’s data centers and appears in your Backblaze B2 online storage.

Backblaze B2 Fireball Rapid Ingest Service

Canister for Fireball makes it incredibly easy to move your content and archives from your tape device to your Backblaze B2 Fireball. With an intuitive interface similar to Hedge, Canister copies over and verifies files read from your tapes.

Using Canister with B2

flow chart for moving data from tape to the cloud

Insert LTO tapes in your tape system and Canister for Fireball will move them to your Backblaze B2 Fireball for rapid ingest into your B2 Cloud Storage


Canister for Fireball user interface

Select from any tape device with LTO media…

Canister data progression screenshot

…and watch the files on the tape copy and verify to your Backblaze B2 Fireball

Here’s how the solution works:

Steps to Migrate Your LTO Content to the Cloud with Canister for Fireball

  1. Order a Fireball system: As part of the signup step you will choose a B2 bucket that you’d like your Fireball content moved to.
  2. Connect your Fireball system to your network, making sure that the workstation that connects to your tape device can also mount the storage volume presented by your Backblaze Fireball.
  3. Install Canister for Fireball on your Mac workstation.
  4. Connect your tape device. Any tape system that can read your tapes and mount them as an LTFS volume will work. Canister will automatically mount tapes inside the app for you.
  5. Launch Canister for Fireball. You can now select the tape device volume as your source, the Fireball as your target, and copy the files over to your Fireball.
  6. Repeat as needed until you have copied and verified all of your tapes securely to your Fireball. You can fill and send in your Fireball as many times as needed during your 30 day period. (And you can always extend your loaner period.)

LTFS, or Linear Tape File System, is an industry-adopted way to make the contents of an entire tape cartridge available as if it were a single volume of files. Typically, the tape stores a list of the files and their locations at the beginning, or header, of the tape. When a tape is read in your tape device, that directory section is read and the tape system presents it to you as a volume of files and folders. Say you want to copy an individual file from that LTFS volume to your desktop. The tape spools out to wherever that file is stored, reads the entire stretch of tape containing that file, then finally copies it to your desktop. It can be a very slow process indeed, which is why many people choose to store content in cloud storage like Backblaze B2, where they get instant access to every file.

Now — Put Your LTO Tape Ingest Plan Into Action

If you have content on tape that needs to get into your Backblaze B2 storage, Canister for Fireball and a Backblaze B2 Fireball are the perfect solution.

Canister for Fireball can be licensed for 30 days of use for $99, including priority support. The full version is $199. If you decide to upgrade from the 30-day license, you’ll pay only the difference to the full version.

Get more information about Canister for Fireball

And of course, make sure that you’ve ordered your Fireball:

Order a Backblaze B2 Fireball

Now with your content and archives no longer “trapped” on tape, you can browse them in your asset manager, share links directly from Backblaze B2, and have your content ready to pull into new content creation workflows by your team located anywhere in the world.


LTO versus Cloud Storage: Choosing the Model That Fits Your Business

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/lto-vs-cloud-storage-vs-hybrid/

Choose Your Solution: Cloud Storage, LTO, Hybrid Cloud Storage/LTO

Years ago, when I did systems administration for a small company, we used RAID 1 for in-house data redundancy and an LTO tape setup for offsite data backup. Yes, the LTO cataloging and versioning were a pain, so was managing the tapes, and sometimes a tape would be unreadable, but the setup worked. And given there were few affordable alternatives out there at the time, you lived and died with your tapes.

Over the last few years, cloud storage has emerged as a viable alternative to using LTO for offsite backups. Improvements in network speed coupled with lower costs are a couple of the factors that have changed the calculus of cloud storage. To see if enough has changed to make cloud storage a viable competitor to LTO, we’ll start by comparing the current and ongoing cost of LTO versus cloud storage and then dig into assumptions underlying the cost model. We’ll finish up by reviewing the pluses and minuses of three potential outcomes: switching to cloud storage, staying with LTO, or using a hybrid LTO/cloud storage solution.

Comparing the Cost of LTO Versus Cloud Storage

Cost calculators for comparing LTO to Cloud Storage have a tendency to be very simple or very complex. The simple ones generally compare hardware and tape costs to cloud storage costs and neglect things like personnel costs, maintenance costs, and so on. In the complex models you might see references to the cost of capital, interest on leasing equipment, depreciation, and the tax implications of buying equipment versus paying for a monthly subscription service.

The Backblaze LTO vs Cloud calculator is somewhere in between. The underlying model takes into account many factors, which we’ll get into in a moment, but if you are a Fortune 500 company with a warehouse full of tape robots, this model is not for you.

Calculator: LTO vs B2

To use the Backblaze calculator you enter:

  1. the amount of Existing Data you have on LTO tape
  2. the amount of data you expect to add in a given year
  3. the amount of incremental data you backup each day

Then you can use the slider to compare your total cost from 1 to 10 years. You can run the model as many times as you like under different scenarios.
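For intuition only, here is a deliberately crude sketch of such a comparison in Python. It is not the Backblaze model: the LTO figures are placeholder assumptions (apart from the $3,120/year pickup cost noted below), and it ignores compression, retention, personnel, and the other factors discussed next.

```python
# Toy model only, not the Backblaze calculator. LTO figures are placeholders;
# the real model adds compression, retention, personnel cost, and more.
B2_PRICE_PER_GB_MONTH = 0.005  # B2's published price at the time of writing

def b2_cost(existing_tb: float, added_tb_per_year: float, years: int) -> float:
    stored_gb, total = existing_tb * 1000, 0.0
    for _ in range(years * 12):          # bill month by month as data grows
        total += stored_gb * B2_PRICE_PER_GB_MONTH
        stored_gb += added_tb_per_year * 1000 / 12
    return total

def lto_cost(existing_tb: float, added_tb_per_year: float, years: int,
             drive=6000, tape=100, tape_tb=12, pickup_per_year=3120) -> float:
    tapes = (existing_tb + added_tb_per_year * years) / tape_tb  # LTO-8 native
    return drive + tapes * tape + pickup_per_year * years

for years in (3, 5, 10):
    print(f"{years} yrs:  B2 ${b2_cost(100, 25, years):>9,.0f}"
          f"   LTO ${lto_cost(100, 25, years):>9,.0f}")
```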

Assumptions Behind the Model

To see the assumptions that were made in creating the model, start on the LTO Replacement page and scroll down past the LTO vs. B2 calculator. Click on the following text to display the “Cost and Operational Assumptions” page.

+ See details on Cost and Operational Assumptions

Let’s take a few minutes to review some of the most relevant points and how they affect the cost numbers reported:

  • LTO Backup Model: We used the Grandfather-Father-Son (GFS) model. There are several others, but this was the most prevalent. If you use the “Tower of Hanoi” model, for example, it uses fewer tapes and would lower the total LTO cost by some amount.
  • Data Compression: We assumed a 2-1 compression ratio for the data stored on the LTO tapes. If your data is principally video or photos, you will most likely not use compression. As such, film studios and post-production houses will need to double the cost of the total LTO solution to compensate for the increased number of tapes, the increased number of LTO tape units, and increased personnel costs.
  • Data Retention: We used a 30 day retention period as this is common in the GFS model. If you keep your incremental tapes/data for 2 weeks, then you would lower the number of tapes needed for incremental backups, but you would also lower the amount of incremental data you keep in the cloud storage system.
  • Tape Units: There are a wide variety of LTO tape systems. You can increase or decrease the total LTO cost based on the systems you are using. For example, suppose you are considering the purchase of an LTO tape system that reads/writes up to 5 tapes simultaneously. That system is more expensive and has higher maintenance costs, but it also means you would have to purchase fewer tape units.
  • LTO-8 Tape Units: We used LTO-8 tape units as they are the currently available LTO system most likely to be around in 10 years.
  • Tape Migration: We made no provision for migration from an unsupported LTO version to a supported LTO version. During the next 10 years, many users with older LTO systems will likely have to migrate to newer systems, as LTO only supports 2 generations back and is currently offering a new generation every 2 years.
  • Pickup Cost: The cost of having your tapes picked up so they are offsite. This cost can vary widely based on geography and service level. Our assumption of the cost is $60 per week or $3,120/year. You can adjust the LTO total cost according to your particular circumstances.
  • Network Cost: Using cloud storage requires that you have a reasonable amount of network bandwidth available. The number we used is incremental to your existing monthly cost for bandwidth. Network costs vary widely, so depending on your circumstance you can increase or decrease to the total cost of the cloud storage solution.
  • Personnel Cost: This is the total cost of what you are paying someone to manage and operate your LTO system. This raises or lowers the cost of both the LTO and cloud storage solutions at the same rate, so adjusting this number doesn’t affect the comparison, just the total values for each.
  • Time Savings Versus LTO: With a cloud storage solution, there are no tapes or tape machines to deal with. This saves a significant amount of time for the person managing the backup process. Increasing this value will increase the cost of the cloud storage solution relative to the LTO solution.

As hinted at earlier, we don’t consider the cost of capital, depreciation, etc. in our calculations. The general model is that a company purchases a number of LTO systems and the cost is spread over a 10 year period. After 10 years a replacement unit is purchased. Other items such as tapes and equipment maintenance are purchased and expensed as needed.

Choosing a Data Backup Model

We noted earlier the three potential outcomes when evaluating LTO versus cloud storage for data backup: switching to cloud storage, staying with LTO, or using a hybrid LTO/cloud storage solution. Here’s a look at each.

Switching to Cloud Storage

After using the calculator you find cloud storage is less expensive for your business or organization versus LTO. You don’t have a large amount of existing data, 100 terabytes for example, and you’d rather get out of the tape business entirely.

Your first challenge is to move your existing data to the cloud — quickly. One solution is the Backblaze B2 Fireball data transfer service. You can move up to 70 TB of data each trip from your location to Backblaze in days. This saves your bandwidth and saves time as well.

As the existing data is being transferred to Backblaze, you’ll want to select a product or service to move your daily generated information to the cloud on a regular basis. Backblaze has a number of integration partners that perform data backup services to Backblaze B2.

Staying with LTO

After using the calculator you find cloud storage is less expensive, but you are one of those unlucky companies that can’t get reasonably priced bandwidth in their area. Or perhaps, the new LTO-8 equipment you ordered arrived minutes before you read this blog post. Regardless, you are destined to use LTO for at least a while longer. Tried and true, LTO does work and has the added benefit of making the person who manages the LTO setup nearly indispensable. Still, when you are ready, you can look at moving to the hybrid model described next.

Hybrid LTO/Cloud Storage model

In practice, many organizations that use LTO for backup and archive often store some data in the cloud as well, even if haphazardly. For our purposes, Hybrid LTO/Cloud Storage is defined as one of the following:

  1. Date Hybrid: All backups and archives from before the cutover date remain stored on LTO; everything from the cutover date forward is stored in cloud storage.
  2. Classic Hybrid: All of the incremental backups are stored in cloud storage and all full backups and archives are stored on LTO.
  3. Type Hybrid: All data of a given type, say employee data, is stored on LTO, while all customer data is stored in cloud storage. We see this hybrid use case occur as a function of convenience and occasionally compliance, although some regulatory requirements such as GDPR may not be accommodated by LTO solutions.

You can imagine there being other splits, but in essence, there may be situations where keeping the legacy system going in some capacity for some period of time is the prudent business option.

If you have a large tape library, it can be almost paralyzing to think about moving to the cloud, even if it is less expensive. Being open to the hybrid LTO/cloud model is a way to break the task down into manageable steps. For example, solutions like Starwind VTL and Archiware P5 allow you to start backing up to the cloud with minimal changes to your existing tape-based backup schemes.

Many companies that start down the hybrid road typically begin with moving their daily incremental files to the cloud. This immediately reduces the amount of “tape work” you have to do each day and has the added benefit of making the files readily available should they need to be restored. Once a company is satisfied that the cloud-based backups of their daily incremental files are under control, they can consider whether or not they need to move the rest of their data to the cloud.

Will Cloud Storage Replace LTO?

At some point, the LTO tapes you have will need to be migrated to something else as the equipment to read your old tapes will become outdated, then unsupported, and finally unavailable. Users with LTO 4 and, to some degree, LTO 5 are already feeling this pain. To migrate all of that data from your existing LTO system to LTO version “X,” cloud storage, or something else, will be a monumental task. It is probably a good idea to start planning for that now.

In summary, many people will find that they can now choose cloud storage over LTO as an affordable way to store their data going forward. But, having a hybrid environment of both LTO and cloud storage is not only possible, it is a practical way to reduce your overall backup cost while maximizing your existing LTO investment. The hybrid model creates an improved operational environment and provides a pathway forward should you decide to move exclusively to storing your data in the cloud at some point in the future.


How to Leverage Your Amazon S3 Experience to Code the Backblaze B2 API

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/how-to-code-backblaze-b2-api-interface/

Going from S3 to learning Backblaze B2

We wrote recently about how the Backblaze B2 and Amazon S3 APIs are different. What we neglected to mention was how to bridge those differences so a developer can create a B2 interface if they’ve already coded one for S3. John Matze, Founder of BridgeSTOR, put together his list of things to consider when leveraging your S3 API experience to create a B2 interface. Thanks John.   — Andy
BackBlaze B2 to Amazon S3 Conversion
by John Matze, Founder of BridgeSTOR

BackBlaze B2 Cloud Storage Platform has developed into a real alternative to the Amazon S3 online storage platform with the same redundancy capabilities but at a fraction of the cost.

Sounds great — sign up today!

Wait. If you’re an application developer, it doesn’t come free. The Backblaze REST API is not compatible with the Amazon S3 REST API. That is the bad news. The good news: it includes almost the entire set of functionality, so converting from S3 to B2 can be done with minimal work once you understand the differences between the two platforms.

This article will help you shortcut the process by describing the differences between B2 and S3.

  1. Endpoints: AWS has a standard endpoint of s3.amazonaws.com which redirects to the region where the bucket is located, or you may send requests directly to the bucket by a region endpoint. B2 does not have regions, but does have an initial endpoint called api.backblazeb2.com. Every application must start by talking to this endpoint. B2 also requires two other endpoints: one for uploading an object and another for downloading an object. The upload endpoint is generated on demand when uploading an object, while the download endpoint is returned during the authentication process and may be saved for download requests.
  2. Host: Unlike Amazon S3, B2 requires the host token in the HTTP header; requests without it will fail.
  3. JSON: Unlike S3, which uses XML, all B2 calls use JSON. Some API calls require data to be sent on the request. This data must be in JSON, and all APIs return JSON as a result. Fortunately, the amount of JSON required is minimal or none at all. We just built a JSON request when required and made a simple JSON parser for returned data.
  4. Authentication: Amazon currently has two major authentication mechanisms with complicated hashing formulas. B2 simply uses the industry standard “HTTP basic auth” algorithm. It takes only a few minutes to get up to speed on this algorithm (see the sketch after this list).
  5. Keys: Amazon has the concept of an access key and a secret key. B2 has the equivalent, with the access key being your key ID (your account ID) and the secret key being the application key (returned from the website).
  6. Bucket ID: Unlike S3, almost every B2 API requires a bucket ID. There is a special list bucket call that will display bucket IDs by bucket name. Once you find your bucket name, capture the bucket ID and save it for future API calls.
  7. Head Call: The bottom line — there is none. There is, however, a list_file_names call that can be used to build your own HEAD call. Parse the JSON returned values and create your own HEAD call.
  8. Directory Listings: B2 directories again have the same functionality as S3, but with a different API format. Again the mapping is easy: marker is startFileName, prefix is prefix, max-keys is maxFileCount, and delimiter is delimiter. The big difference is how B2 handles markers. The Amazon S3 nextmarker is literally the next marker to be searched; the B2 nextmarker is the last file name that was searched, so the next listing will include that last marker name again. Your routines must parse out the name or your listing will show the marker twice. That’s a difference, but not a difficult one.
  9. Uploading an object: Uploading an object in B2 is quite different from S3. S3 just requires you to send the object to an endpoint and they will automatically place the object somewhere in their environment. In the B2 world, you must request a location for the object with an API call and then send the object to the returned location. The first API will send you a temporary key and you can continue to use this key for one hour without generating another, with the caveat that you have to monitor for failures from B2. The B2 environment may become full or some other issue may require you to request another key.
  10. Downloading an Object: Downloading an object in B2 is really easy. There is a download endpoint that is returned during the authentication process and you pass your request to that endpoint. The object is downloaded just like Amazon S3.
  11. Multipart Upload: Finally, multipart upload. The beast in S3 is just as much of a beast in B2. Again the good news is there is a one-to-one mapping.
    a. Multipart Init: The equivalent initialization returns a fileid. This ID will be used for future calls.
    b. Multipart Upload: Similar to uploading an object, you will need to get the API location to place the part. So use the fileid from “a” above and call B2 for the endpoint to place the part. Another difference is the upload also requires the payload to be hashed with a SHA1 algorithm. Once done, simply pass the SHA1 and the part number to the URL and the part is uploaded. This SHA1 component is equivalent to an etag in the S3 world, so save it for later.
    c. Multipart Complete: Like S3, you will have to build a return structure for each part. B2 of course requires this structure to be in JSON, but like S3, B2 requires the part number and the SHA1 (etag) for each part.
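To ground items 4, 6, and 9, here is a minimal end-to-end sketch in Python. The credentials, bucket name, and file name are hypothetical, and a real integration would add the error handling and upload-key expiry monitoring described above.

```python
import hashlib
import requests

KEY_ID, APP_KEY = "<accountId>", "<applicationKey>"  # hypothetical credentials

# Item 4: plain HTTP basic auth. The response carries the API and download
# endpoints (item 1) plus an authorization token for subsequent calls.
auth = requests.get("https://api.backblazeb2.com/b2api/v1/b2_authorize_account",
                    auth=(KEY_ID, APP_KEY)).json()
api_url, token = auth["apiUrl"], auth["authorizationToken"]

# Item 6: look up the bucket ID by bucket name and save it for later calls.
buckets = requests.post(f"{api_url}/b2api/v1/b2_list_buckets",
                        headers={"Authorization": token},
                        json={"accountId": auth["accountId"]}).json()["buckets"]
bucket_id = next(b["bucketId"] for b in buckets if b["bucketName"] == "my-bucket")

# Item 9: request an upload location, then send the object there with its SHA1.
up = requests.post(f"{api_url}/b2api/v1/b2_get_upload_url",
                   headers={"Authorization": token},
                   json={"bucketId": bucket_id}).json()
data = open("video.mov", "rb").read()
requests.post(up["uploadUrl"],
              headers={"Authorization": up["authorizationToken"],
                       "X-Bz-File-Name": "video.mov",
                       "Content-Type": "b2/x-auto",
                       "X-Bz-Content-Sha1": hashlib.sha1(data).hexdigest()},
              data=data).raise_for_status()
```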

What Doesn’t Port

We found almost everything we required easily mapped from S3 to B2 except for a few issues. To be fair, BackBlaze is working on the following in future versions.

  1. Copy Object doesn’t exist: This could cause some issues with applications for copying or renaming objects. BridgeSTOR has a workaround for this situation so it wasn’t a big deal for our application.
  2. Directory Objects don’t exist: Unlike Amazon, where an object whose name ends with a “/” is considered a directory, this does not port to B2. There is an undocumented object name that B2 applications use called .bzEmpty. Numerous 3rd party applications, including BridgeSTOR, treat an object ending with .bzEmpty as a directory name. This is also important for the directory listings described above. If you choose to use this method, you will be required to replace the “.bzEmpty” with a “/” (a minimal mapping is sketched below).
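A minimal sketch of that mapping (the .bzEmpty convention is, as noted, undocumented):

```python
SUFFIX = ".bzEmpty"

def s3_dir_key_to_b2(name: str) -> str:
    """S3 marks a directory with a trailing '/'; the B2 convention appends .bzEmpty."""
    return name + SUFFIX if name.endswith("/") else name

def b2_key_to_s3_dir(name: str) -> str:
    """Reverse the mapping when presenting B2 listings as S3-style keys."""
    return name[:-len(SUFFIX)] if name.endswith("/" + SUFFIX) else name

assert s3_dir_key_to_b2("photos/") == "photos/.bzEmpty"
assert b2_key_to_s3_dir("photos/.bzEmpty") == "photos/"
```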

In conclusion, you can see the B2 API is different from the Amazon S3 API, but as far as functionality goes they are basically the same. For us, it looked like a large task at first, but once we took the time to understand the differences, porting to B2 was not a major job for our application. We created an S3-to-B2 shim in a week, followed by a few extra weeks of testing and bug fixes. I hope this document helps in your S3 to B2 conversion.

— John Matze, BridgeSTOR


Creating a Media Archive Solution with Backblaze B2 and Archiware P5

Post Syndicated from Skip Levens original https://www.backblaze.com/blog/creating-a-media-archive-solution/

Backblaze B2 Cloud Storage + Archiware P5 = 7 Ways to Save

B2 + P5 = 7 Ways to Save Time, Money and Gain Peace of Mind with an Archive Solution of Backblaze B2 and Archiware P5

by Dr. Marc M. Batschkus, Archiware

This week’s guest post comes to us from Marc M. Batschkus of Archiware, who is well-known to media and entertainment customers, and is a trusted authority and frequent speaker and writer on data backup and archiving.

— Editor

Archiving has been around almost forever.

Roman "Archivum" where scrolls were stored for later reference.

The Romans used the word “archivum” for the building that stored scrolls no longer needed for daily work. Since then, files have replaced scrolls, but the process has stayed the same and so today, files that are no longer needed for daily production can be moved to an archive.

Backup and Archive

Backblaze and Archiware complement each other in accomplishing this and we’ll show you how to get the most from this solution. But before we look at the benefits of archiving, let’s take a step back and review the difference between backup and archive.

A backup of your production storage protects your media files by replicating the files to a secondary storage. This is a cyclical process, continually checking for changed and new files, and overwriting files after the specified retention time is reached.

Archiving, on the other hand, is a data migration: moving files that are no longer needed for daily production to (long-term) storage, yet keeping them easily retrievable. This way, all completed productions are collected in one place and kept for later reference, compliance, and re-use.

Think of backup as a spare tire, archive as winter tires

Think of BACKUP as a spare tire, in case you need it, and ARCHIVE as a stored set of tires for different needs.

To use an analogy:

  • Think of backup as the spare tire in the trunk.
  • Think of archive as the winter tires in the garage.

Both are needed!

Editor’s note: For more insight on “backup vs archive” have a look at What’s the Diff: Backup vs Archive.

Building a Media Archive Solution with Archiware P5 and Backblaze B2

Now that the difference between backup and archive is clear, let’s have a look at what an archive can do to make your life easier.

Archiware archive catalog transferring to B2 cloud storage

Archiware P5 can be your interface to locate and manage your files, with Backblaze B2 as your ready storage for all of those files

P5 Archive connects to Backblaze B2 and offers the interface for locating files.

B2 + P5 = 7 Ways to Save Time and Money and Gain Peace-of-Mind

  1. Free up expensive production storage
  2. Archive from macOS, Windows, and Linux
  3. Browse and search the archive catalog with thumbnails and proxies
  4. Re-use, re-purpose, reference and monetize files
  5. Customize the metadata schema to fit your needs and speed up search
  6. Reduce backup size and runtime by moving files from production storage
  7. Protect precious assets from local disaster and for the long-term (no further migration/upgrade needed)

Archive as Mini-MAM

The “Mini-MAM” features of Archiware P5 help you to browse and find files easier than ever. Browse the archive visually using the thumbnails and proxy clips in the archive catalog. Search for specific criteria or a combination of criteria such as location or description.

Since P5 Archive lets you easily expand and customize metadata fields and menus, you can build the individual metadata schema that works best for you.

Technical metadata (e.g. camera type, resolution, lens) can be automatically imported from the file header into the metadata fields of P5 archive using a script.

The archive becomes the file memory of the company, saving time and energy, because now there is only one place to browse and search for files.

Mini MAM screenshot

Archiware as “Mini-MAM” — thumbnails, proxies, even metadata, all within Archiware P5

P5 offers maximum flexibility and supports all storage strategies, be it cloud, disk or tape and any combination of the above.

For more information on archiving with Archiware: Archiving with Archiware P5

For macOS, P5 Archive offers integration with the Finder and Final Cut Pro X via the P5 Archive App. For more information on integrated archiving with Final Cut Pro X: macOS Finder and Final Cut Pro X Integrated Archiving

You can start building an archive immediately with Backblaze B2 cloud storage because it allows you to do this without any additional storage hardware and upfront investment.

Backblaze B2 is the Best of Cloud

  • ✓  Saves investment in storage hardware
  • ✓  Access from anywhere
  • ✓  Storage on demand
  • ✓  Perpetual storage – no migration or upgrade of hardware
  • ✓  Financially advantageous (OPEX vs CAPEX)
  • ✓  Best price in its category

Backblaze B2 offers flexible access so that the archive can be accessed from several physical locations with no storage hardware needing to be moved.

P5 Archive supports consumable files as an archive format. This makes individual files accessible even if P5 Archive is not present at the other location, which opens up a whole new world of possibilities for collaborative workflows that were not possible before.

Save Money with OPEX vs CAPEX

CAPEX vs. OPEX

CAPital EXpenditures are the money companies spend to purchase major physical goods that will be used for more than one year. Examples in our field are investments in hardware such as storage and servers.

OPerating EXpenses are the costs for a company to run its business operations on a daily basis. Examples are rent and monthly cost for cloud storage like B2.

By using Backblaze B2, companies can avoid CAPEX and instead make monthly payments only for the cloud storage they actually use, while also saving on maintenance and migration costs. Furthermore, migrating files to B2 makes expansion of high-performance, costly production storage unnecessary. Over time this alone can make the archive pay for itself.

Now that you know how to profit from archiving with Archiware P5 and Backblaze B2, let’s look at the steps to build the best archive for you.

Connecting B2 cloud storage screenshot

Backblaze B2 is already a built-in option in P5 and works with P5 Archive and P5 Backup.

For detailed setup and best practice see:

Cloud Storage Setup and Best Practice for Archiware

Steps in Planning a Media Archive

Depending on the size of the archive, the number of people working with it, and the number of files being archived, planning can be extremely important. Thinking ahead and asking the right questions ensures that the archive later delivers the value it was built for.

Including people that will configure, operate, and use the system guarantees a high level of acceptance and avoids blind spots in your planning.

  1. Define users: who administers, who uses and who archives?
  2. Decide and select: what goes into the archive, and when?
  3. Which metadata are needed to describe the data (i.e., what will be searched for)?
  4. Storage environment: on what operating system, hardware, software, infrastructure, interfaces, network, and medium will data be archived?
  5. What security requirements should be fulfilled: off-site storage, duplication, storage duration, test cycles of media, generation migration, etc.?
  6. Retrieval:
    • Who searches?
    • With what criteria?
    • Who is allowed to restore?
    • On what storage?
    • For what use?

Metadata is the key to the archive and enables complex searches for technical and descriptive criteria.

Naming Conventions or “What’s in a File Name?”

The most robust metadata you can have is the file name. It can travel through different operating systems and file systems, and it is the only metadata that is available all the time, independent of any database, catalog, MAM system, application, or other mechanism that can keep or read metadata. With it, someone can instantly make sense of a file that gets isolated, left over, misplaced, or transferred to another location. Building a solid and intelligent naming convention for media files is therefore crucial. Consistency is the key to useful metadata, metadata is a solid foundation for the workflow, for searching, and for sharing files with other parties, and the file name is the starting point.
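
To make that concrete, here is a minimal sketch of enforcing a naming convention in code. The convention itself (PROJECT_YYYYMMDD_SCENE_TAKE_CAMERA.ext) is a made-up example; the point is that file names are generated and validated by one rule rather than typed by hand.

import re

# Hypothetical convention: PROJECT_YYYYMMDD_SCENE_TAKE_CAMERA.ext
PATTERN = re.compile(
    r"^(?P<project>[A-Z0-9]{3,8})_"
    r"(?P<date>\d{8})_"
    r"(?P<scene>S\d{2})_"
    r"(?P<take>T\d{2})_"
    r"(?P<camera>[A-Z])\.(?P<ext>mov|mxf|wav)$"
)

def build_name(project, date, scene, take, camera, ext):
    return f"{project}_{date}_S{scene:02d}_T{take:02d}_{camera}.{ext}"

def is_valid(filename):
    return PATTERN.match(filename) is not None

print(build_name("DOCU24", "20180815", 3, 1, "A", "mov"))  # DOCU24_20180815_S03_T01_A.mov
print(is_valid("DOCU24_20180815_S03_T01_A.mov"))           # True
print(is_valid("final_FINAL_v2.mov"))                      # False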

Wrapping Up

There is much more that can make a media archive extremely worthwhile and efficient. For further tips on planning and implementation, I’ve made this free eBook available.

eBook:  Data Management, Backup and Archive for Media Professionals — How to Protect Valuable Video Data in All Stages of the Workflow by Marc M. Batschkus

Start looking into the benefits an archive can bring you today. There is a 30-day fully featured trial license for Archiware P5 that can be combined with the Backblaze B2 free trial storage.

Trial License:  About Archiware P5 and 30-Day Trial

And of course, if you’re not already a Backblaze B2 customer, sign up instantly at the link below.

B2 Cloud Storage:  Instant Signup

— Dr. Marc M. Batschkus, Archiware

The post Creating a Media Archive Solution with Backblaze B2 and Archiware P5 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

cPanel Backup to B2 Cloud Storage

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/cpanel-backup-to-b2-cloud-storage/

laptop on a desk with a cup of coffee, cell phone, and iPad

Anyone who’s managed a business or personal website is likely familiar with cPanel, the control panel that provides a graphical interface and tools that simplify the process of managing a website. IT professionals who’ve managed hosting servers might know cPanel’s big brother, WHM (Web Host Manager), which is used by server administrators to manage large web hosting servers and cPanels for their customers.

cPanel Dashboard and WHM Dashboard

Just as with any other online service, backup is critically important to safeguard user and business data from hardware failure, accidental loss, or unforeseen events. Both cPanel and WHM support a number of applications for backing up websites and servers.

JetApps’s JetBackup cPanel App

One of those cPanel applications is JetApps’s JetBackup, which supports backing up data to a number of destinations, including local, remote SSH, remote FTP, and public cloud services. Backblaze B2 Cloud Storage was added as a backup destination in version 3.2. Web hosts that support JetBackup for their cPanel and WHM users include Clook, FastComet, TMDHosting, Kualo, Media Street, ServerCake, WebHost.UK.net, MegaHost, MonkeyTree Hosting, and CloudBunny.

cPanel with JetBackup app

JetBackup configuration for B2

Directions for configuring JetBackup with B2 are available on their website.

Note:  JetBackup version 3.2+ supports B2 cloud storage, but that support does not currently include incremental backups. JetApps has told us that incremental backup support will be available in an upcoming release.

Interested in more B2 Support for cPanel and WHM?

JetBackup support for B2 was added to JetBackup because their users asked for it. Users have been vocal in asking vendors to add cPanel/WHM support for backing up to B2 in forums and online discussions, as evidenced on cPanel.net and elsewhere — here, here, and here. The old axiom that the squeaky wheel gets the grease is true when lobbying vendors to add B2 support — the best way to have B2 directly supported by an app is to express your interest directly to the backup app provider.

Other Ways to Back Up Website Data to B2

When a dedicated backup app for B2 is not available, some cPanel users are creating their own solutions using the B2 Command Line Interface (CLI), while others are using Rclone to back up to B2.

B2 CLI example:

#!/bin/bash
b2 authorize_account ACCOUNTID APIKEY
b2 sync --noProgress /backup/ b2://STORAGECONTAINER/

Rclone example:

rclone copy /backup backblaze:my-server-backups --transfers 16

Those with WordPress websites have other options for backing up their sites, which we highlighted in a post, Backing Up WordPress.

Having a Solid Backup Plan is What’s Important

If you’re using B2 for cPanel backup, or are using your own backup solution, please let us know what you’re doing in the comments.

The post cPanel Backup to B2 Cloud Storage appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

The B2 Developers’ Community

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/object-storage-developer-community/

Developers at Work Using Object Storage

When we launched B2 Cloud Storage in September of 2015, we were hoping that the low cost, reliability, and openness of B2 would result in developers integrating B2 object storage into their own applications and platforms.

We’ve continually strengthened and encouraged the development of more tools and resources for the B2 developer community. These resources include APIs, a Command-Line tool, a Java SDK, and code examples for Swift and C++. Backblaze recently added application keys for B2, which enable developers to restrict access to B2 data and control how an application interacts with that data.

An Active B2 Developer Community

It’s three years later and we are happy to see that an active developer community has sprung up around B2. Just a quick look at GitHub shows over 250 repositories for B2 code with projects in ten different languages that range from C# to Go to Ruby to Elixir. A recent discussion on Hacker News about a B2 Python Library resulted in 225 comments.

B2 coding languages - Java, Ruby, C#, Shell, PHP, R, JavaScript, C++, Elixir, Go, Python, Swift

What’s Happening in the B2 Developer Community?

We believe that the two major reasons for the developer activity supporting B2 are 1) the user demand for inexpensive and reliable storage, and 2) the ease of implementation of the B2 API. We discussed the B2 API design decisions in a recent blog post.

Sharing and transparency have been cornerstone values for Backblaze since our founding, and we believe openness and transparency breed trust and further innovation in the community. Since we ask customers to trust us with their data, we want our actions to show why we are worthy of that trust.

Here are Just Some of the Many B2 Projects Currently Underway

We’re excited about all the developer activity and all of the fresh and creative ways you are using Backblaze B2 storage. We want everyone to know about these developer projects so we’re spotlighting some of the exciting work that is being done to integrate and extend B2.

Rclone (Go) — In addition to being an open source command line program to sync files and directories to and from cloud storage systems, Rclone is being used in conjunction with other applications such as restic. See Rclone on GitHub, as well.

CORS (General web development) — Backblaze supports CORS for efficient cross-site media serving. CORS allows developers to store large or infrequently accessed files on B2 storage, and then refer to and serve them securely from another website without having to re-download the asset.

b2blaze (Python) — The b2blaze Python library for B2.

Laravel Backblaze Adapter (PHP) — Connect your Laravel project to Backblaze with this storage adapter, which includes token caching.

Wal-E (Postgres) — Continuous archiving to Backblaze for your Postgres databases.

Phoenix (Elixir) — File upload utility for the Phoenix web dev framework.

ZFS Backup (Go) — Backup tool to move your ZFS snapshots to B2.

Django Storage (Python) — B2 storage for the Python Django web development framework.

Arq Backup (Mac and Windows application) — Arq Backup is an example of a single developer, Stefan Reitshamer, creating and supporting a successful and well-regarded application for cloud backup. Stefan also is known for being responsive to his users.

Go Client & Libraries (Go) — Go is a popular language that is being used for a number of projects that support B2, including restic, Minio, and Rclone.

How to Get Involved as a B2 Developer

If you’re considering developing for B2, we encourage you to give it a try. It’s easy to implement and your application and users will benefit from dependable and economical cloud storage.

Start by checking out the B2 documentation and resources on our website. GitHub and other code repositories are also great places to look. If you follow discussions on Reddit, you could learn of projects in the works and maybe find users looking for solutions.

We’ve written a number of blog posts highlighting the integrations for B2. You can find those by searching for a specific integration on our blog or under the tag B2. Posts for developers are tagged developer.


If you have a B2 integration that you believe will appeal to a significant audience, you should consider submitting it to us. Those that pass our review are listed on the B2 Integrations page on our website. We’re adding more each week. When you’re ready, just review the B2 Integration Checklist and submit your application. We’re looking forward to showcasing your work!

Now’s a good time to join the B2 developers’ community. Jump on in — the water’s great!

P.S. We want to highlight and promote more developers working with B2. If you have a B2 integration or project that we haven’t mentioned in this post, please tell us what you’re working on in the comments.

The post The B2 Developers’ Community appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backing Up FreeNAS and TrueNAS to Backblaze B2

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/how-to-setup-freenas-cloud-storage/

FreeNAS and TrueNAS

Thanks to recent updates of FreeNAS and TrueNAS, backing up data to Backblaze B2 Cloud Storage is now available for both platforms. FreeNAS/TrueNAS v11.1 adds a feature called Cloud Sync, which lets you sync, move, or copy data to and from Backblaze B2.

What Are FreeNAS and TrueNAS?

FreeNAS and TrueNAS are two faces of a comprehensive NAS storage environment built on the FreeBSD OS and OpenZFS file system. FreeNAS is the open source and development platform, while TrueNAS is the supported and commercial product line offered by iXsystems.

FreeNAS logo

FreeNAS is for the DIY crowd. If you don’t mind working with bleeding-edge software and figuring out how to make your software and hardware work harmoniously, then FreeNAS could be a good choice for you.

TrueNAS logo

If you’re in a business or other environment with critical data, then a fully supported product like TrueNAS is likely the way you’ll want to go. iXsystems builds their TrueNAS commercial server appliances on the battle-tested, open source framework that FreeNAS and OpenZFS provide.

The software developed by the FreeNAS open source community forms the basis for both platforms, so we’ll talk specifically about FreeNAS in this post.

Working with FreeNAS

You can download FreeNAS directly from the open source project website, freenas.org. Once installed, FreeNAS is managed through a comprehensive web interface that is supplemented by a minimal shell console that handles essential administrative functions. The web interface supports storage pool configuration, user management, sharing configuration, and system maintenance.

FreeNAS web UI

FreeNAS supports Windows, macOS and Unix clients.

Syncing to B2 with FreeNAS

Files or directories can be synchronized to remote cloud storage providers, including B2, with the Cloud Sync feature.

Selecting Tasks ‣ Cloud Sync shows the screen below. This screen shows a single cloud sync called “backup-acctg” that “pushes” a file to cloud storage. The last run finished with a status of SUCCESS.

Existing cloud syncs can be run manually, edited, or deleted with the buttons that appear when a single cloud sync line is selected by clicking with the mouse.

FreeNAS Cloud Sync status

Cloud credentials must be defined before a cloud sync is created. One set of credentials can be used for more than one cloud sync. For example, a single set of credentials for Backblaze B2 can be used for separate cloud syncs that push different sets of files or directories.

A cloud storage area must also exist. With B2, these are called buckets and must be created before a sync task can be created.

After the credentials and receiving bucket have been created, a cloud sync task is created with Tasks ‣ Cloud Sync ‣ Add Cloud Sync. The Add Cloud Sync dialog is shown below.

FreeNAS Cloud Sync credentials

Cloud Sync Options

The options for Cloud Sync are described below.

Description (string): a descriptive name for this Cloud Sync
Direction (string): Push to send data to cloud storage, or Pull to retrieve data from cloud storage
Provider (drop-down menu): select the cloud storage provider; the list of providers is defined by Cloud Credentials
Path (browse button): select the directories or files to be sent for Push syncs, or the destinations for Pull syncs
Transfer Mode (drop-down menu):
  Sync (default): make files on the destination identical to those on the source; files removed from the source are removed from the destination (like rsync --delete)
  Copy: copy files from the source to the destination, skipping files that are identical (like rsync)
  Move: copy files from the source to the destination, deleting files from the source after the copy (like mv)
Minute (slider or minute selections): select Every N minutes and use the slider to choose a value, or select Each selected minute and choose specific minutes
Hour (slider or hour selections): select Every N hours and use the slider to choose a value, or select Each selected hour and choose specific hours
Day of month (slider or day of month selections): select Every N days of month and use the slider to choose a value, or select Each selected day of month and choose specific days
Month (checkboxes): the months when the Cloud Sync runs
Day of week (checkboxes): the days of the week when the Cloud Sync runs
Enabled (checkbox): uncheck to temporarily disable this Cloud Sync

Take care when choosing a Direction. Most of the time, Push will be used to send data to the cloud storage. Pull retrieves data from cloud storage, but be careful: files retrieved from cloud storage will overwrite local files with the same names in the destination directory.

Provider is the name of the cloud storage provider. These providers are defined by entering credentials in Cloud Credentials.

After the Provider is chosen, a list of available cloud storage areas from that provider is shown. With B2, this is a drop-down with names of existing buckets.

Path is the path to the directories or files on the FreeNAS system. On Push jobs, this is the source location for files sent to cloud storage. On Pull jobs, the Path is where the retrieved files are written. Again, be cautious about the destination of Pull jobs to avoid overwriting existing files.

The Minute, Hour, Day of month, Month, and Day of week fields permit creating a flexible schedule for when the cloud synchronization takes place.

Finally, the Enabled field makes it possible to temporarily disable a cloud sync job without deleting it.

FreeNAS Cloud Sync Example

This example shows a Push cloud sync which writes an accounting department backup file from the FreeNAS system to Backblaze B2 storage.

Before the new cloud sync was added, a bucket called “cloudsync-bucket” was created with the B2 web console for storing data from the FreeNAS system.

System ‣ Cloud Credentials ‣ Add Cloud Credential is used to enter the credentials for storage on a Backblaze B2 account. The credential is given the name B2, as shown in the image below:

FreeNAS Cloud Sync B2 credentials

Note on encryption: FreeNAS 11.1 Cloud Sync does not support client-side encryption of data and file names before syncing to the cloud, whether the destination is B2 or another public cloud provider. That capability will be available in FreeNAS v11.2, which is currently in beta.

Example: Adding Cloud Credentials

The local data to be sent to the cloud is a single file called accounting-backup.bin on the smb-storage dataset. A cloud sync job is created with Tasks ‣ Cloud Sync ‣ Add Cloud Sync.

The Description is set to “backup-acctg” to describe the job. This data is being sent to cloud storage, so this is a Push. The provider comes from the cloud credentials defined in the previous step, and the destination bucket “cloudsync-bucket” has been chosen.

The Path to the data file is selected.

The remaining fields are for setting a schedule. The default is to send the data to cloud storage once an hour, every day. The options provide great versatility in configuring when a cloud sync runs, anywhere from once a minute to once a year.

The Enabled field is checked by default, so this cloud sync will run at the next scheduled time.

The completed dialog is shown below:

FreeNAS Cloud Sync example

Dependable and Economical Disaster Recovery

In the event of an unexpected data-loss incident, the VMs, files, or other data stored in B2 from FreeNAS or TrueNAS are available for recovery. Having that data ready and available in B2 provides a dependable, easy, and cost-effective offsite disaster recovery solution.

Are you using FreeNAS or TrueNAS? What tips do you have? Let us know in the comments.

The post Backing Up FreeNAS and TrueNAS to Backblaze B2 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Minio as an S3 Gateway for Backblaze B2 Cloud Storage

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/how-to-use-minio-with-b2-cloud-storage/

Minio + B2

While there are many choices when it comes to object storage, the largest provider and the most recognized is usually Amazon’s S3. Amazon’s set of APIs to interact with their cloud storage, often just called “S3,” is frequently the first integration point for an application or service needing to send data to the cloud.

One of the more frequent questions we get is “how do I jump from S3 to B2 Cloud Storage?” We’ve previously highlighted many of the direct integrations that developers have built on B2: here’s a full list.

Another way to work with B2 is to use what is called a “cloud storage gateway.” A gateway is a service that acts as a translation layer between two services. In the case of Minio, it enables customers to take something that was integrated with the S3 API and immediately use it with B2.

Before going further, you might ask “why didn’t Backblaze just create an S3 compatible service?” We covered that topic in a recent blog post, Design Thinking: B2 APIs (& The Hidden Costs of S3 Compatibility). The short answer is that our architecture enables some useful differentiators for B2. Perhaps most importantly, it enables us to sustainably offer cloud storage at ¼ of the price of S3, which you will really appreciate as your application or service grows.

However, there are situations where a customer is already using the S3 APIs in their infrastructure and wants to understand all the options for switching to B2. For those customers, gateways like Minio can provide an elegant solution.

What is Minio?

Minio is an open source, multi-cloud object storage server and gateway with an Amazon S3 compatible API. Having an S3-compatible API means once configured, Minio acts as a gateway to B2 and will automatically and transparently put or get data into a Backblaze B2 account.

Backup, archive or other software that supports the S3 protocol can be configured to point at Minio. Minio internally translates all the incoming S3 API calls into equivalent B2 storage API calls, which means that all Minio buckets and objects are stored as native B2 buckets and objects. The S3 object layer is transparent to the applications that use the S3 API. This enables the simultaneous use of both Amazon S3 and B2 APIs without compromising any features.
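
Because the gateway speaks the S3 protocol, an existing S3 client can be pointed at Minio just by changing its endpoint. Here is a minimal sketch using the AWS SDK for Python (boto3); the endpoint, bucket, and credentials are placeholders for whatever you configured Minio with.

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9000",   # the local Minio gateway
    aws_access_key_id="b2_account_id",      # as configured in Minio
    aws_secret_access_key="b2_application_key",
)

# Standard S3 calls; Minio translates them into native B2 operations
for bucket in s3.list_buckets()["Buckets"]:
    print(bucket["Name"])

s3.upload_file("backup.tar", "my-bucket", "backups/backup.tar")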

Minio has become a popular solution, with over 113.7M Docker pulls. Minio implements the Amazon S3 v2/v4 API in the Minio client, the AWS SDK, and the AWS CLI.

Minio and B2

To try it out, we configured a MacBook Pro with a Docker container for the latest version of Minio. It was a straightforward matter to install the community version of Docker on our Mac and then install the container for Minio.

You can follow the instructions on GitHub for configuring Minio on your system.

In addition to using Minio with S3-compatible applications and creating new integrations using their SDK, one can use Minio’s Command-line Interface (CLI) and the Minio Browser to access storage resources.

Command-line Access to B2

We installed the Minio client (mc), which provides a modern CLI alternative to UNIX coreutils such as ls, cat, cp, mirror, diff, etc. It supports filesystems and Amazon S3 compatible cloud storage services. The Minio client is supported on Linux, Mac, and Windows platforms.

We used the command below to add the alias “myb2” to our host to make it easy to access our data.

mc config host add myb2 \
 http://localhost:9000 b2_account_id b2_application_key

Minio client commands

Once configured, you can use mc subcommands like ls, cp, mirror to manage your data.

Here’s the Minio client command to list our B2 buckets:

mc ls myb2

And the result:

Minio client

Browsing Your B2 Buckets

Minio Gateway comes with an embedded web based object browser that makes it easy to access your buckets and files on B2.

Minio browser

Minio is a Great Way to Try Out B2

Minio is designed to be straightforward to deploy and use. If you’re using an S3-compatible integration, or just want to try out Backblaze B2 using your existing knowledge of S3 APIs and commands, then Minio can be a quick solution to getting up and running with Backblaze B2 and taking advantage of the lower cost of B2 cloud storage.

The post Minio as an S3 Gateway for Backblaze B2 Cloud Storage appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Design Thinking: B2 APIs (& The Hidden Costs of S3 Compatibility)

Post Syndicated from Brian Wilson original https://www.backblaze.com/blog/design-thinking-b2-apis-the-hidden-costs-of-s3-compatibility/

API Functions - Authorize, Download, List, Upload

When we get asked, “why did Backblaze make its own set of APIs for B2?” the question behind the question is most often “why didn’t Backblaze just implement an S3-compatible interface for B2?”

Either is a totally reasonable question to ask. The quick answer to either question? So our customers and partners can move faster while simultaneously enabling Backblaze to sustainably offer a cloud storage service that is ¼ of the price of S3.

But, before we jump all the way to the end, let me step you through our thinking.

The Four Major Functions of Cloud Storage APIs

Throughout cloud storage, S3 or otherwise, APIs are meant to mainly provide access to four major underlying functions:

  • Authorization — providing account/bucket/file access
  • Upload — Sending files to the cloud
  • Download — Data retrieval
  • List — Data checking / selection / comparison

The comparison between B2 and S3 on the List and Download functions is, candidly, not that interesting. Fundamentally, we ended up having similar approaches when solving those challenges. If the detail is of interest, I’m happy to get into that on a later post or answer questions in the comments below.

Backblaze and Amazon did take different approaches to how each service handles Authorization. The five-step approach for S3 is well outlined here. B2’s architecture enables secure authorization in just two steps. My assertion is that a two-step architecture is ~60% simpler than a five-step approach. To understand what we’re doing, I’d like to introduce the concept of Backblaze’s “Contract Architecture.”

The easiest way to understand B2’s Contract Architecture is to deep dive into how we handle the Upload process.

Uploads (Load Balancing vs Contract Architecture)

The interface to upload data into Amazon S3 is actually a bit simpler than Backblaze B2’s API. But it comes at a literal cost. It requires Amazon to have a massive and expensive choke point in their network: load balancers. When a customer tries to upload to S3, she is given a single upload URL to use. For instance, http://s3.amazonaws.com/<bucketname>. This is great for the customer, as she can just start pushing data to the URL. But it requires Amazon to take that data and then, in a second step behind the scenes, find available storage space and push the data to that location. That second step creates a choke point because it requires high-bandwidth load balancers. That, in turn, carries a significant customer implication: load balancers cost significant money.

When we were creating the B2 APIs, we faced a dilemma — do we go a simple but expensive route like S3? Or is there a way to remove significant cost even if it means introducing some slight complexity? We understood that there are perfectly great reasons to go either way — and there are customers at either end of this decision tree.

We realized the expense savings could be significant; we know load balancers well. We use them for our Download capabilities. They are expensive, so, to run a sustainable service, we charge for downloads. Why B2 download pricing is 1¢/GB while Amazon S3’s starts at 9¢/GB is a subject we covered in a prior blog post.

Back to the B2 Upload function. With our existing knowledge of the “expensive” design, the next step was to understand the alternative path. We found a way to create significant savings by only introducing a modest level of complexity. Here’s how: When a “Client” wants to push data to the servers, it does not just start uploading data to a “well known URL” and have the SERVER figure out where to put the data. At the start, the Client contacts a “Dispatching Server” that has the job of knowing where there is optimally available space in a Backblaze data center.

The Dispatching Server (the API server answering the b2_get_upload_url call) tells the Client “there is space over on Vault-8329.” This next step is our magic. Armed with the knowledge of the open vault, the Client ends its connection with the Dispatching Server and creates a brand new request DIRECTLY to Vault-8329 (calling b2_upload_file or b2_upload_part). No load balancers involved! This is guaranteed to scale infinitely for very little overhead cost. A side note is that the client can continue to directly call b2_upload_file repeatedly from now on (without asking the dispatch server ever again), up until it gets the response indicating that particular vault is full. In other words, this does NOT double the number of network requests.

The “Contract” concept emanates from a simple truth: all APIs are contracts between two entities (machines). Since the Client knows exactly where to go and exactly what authorization to bring with it, it can establish a secure “contract” with the Vault specified by the Dispatching Server.[1] The modest complexity only comes into play if Vault-8329 fills up, gets too busy, or goes offline. In that case, the Client will receive either a 500 or 503 error as notification that the contract has been terminated (in effect, it’s a firm message that says “stop uploading to Vault-8329, it doesn’t have room for more data”). When this happens, the Client is responsible for going BACK to the Dispatching Server, asking for a new vault, and retrying the upload to a different vault. In the scenario where the Client has to go back to the Dispatching Server, the “two phase” process becomes more work for the Client versus S3’s singular “well known URL” architecture. Of course, this is all handled at the code level and is well documented. In effect, your code just needs to know that “if you receive a 500 or 503 error, just retry.” It’s free, it’s easy, and it will work.
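
For illustration, here is a minimal sketch of that contract flow using the documented b2_get_upload_url and b2_upload_file calls (Python with the requests library; the credentials, bucket ID, and error handling are simplified):

import hashlib
from urllib.parse import quote

import requests

def upload_with_retry(api_url, account_token, bucket_id, name, data, tries=5):
    for _ in range(tries):
        # Step 1: ask the Dispatching Server which vault has room
        r = requests.post(
            f"{api_url}/b2api/v2/b2_get_upload_url",
            headers={"Authorization": account_token},
            json={"bucketId": bucket_id},
        )
        r.raise_for_status()
        target = r.json()  # an uploadUrl plus an auth token for that vault

        # Step 2: contract directly with the vault (no load balancer)
        resp = requests.post(
            target["uploadUrl"],
            headers={
                "Authorization": target["authorizationToken"],
                "X-Bz-File-Name": quote(name),
                "Content-Type": "b2/x-auto",
                "X-Bz-Content-Sha1": hashlib.sha1(data).hexdigest(),
            },
            data=data,
        )
        if resp.status_code in (500, 503):
            continue  # contract terminated: go back for a new vault and retry
        resp.raise_for_status()
        return resp.json()
    raise RuntimeError("upload failed after retries")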

So while the Backblaze approach introduces some modest complexity, it can quickly and easily be reliably handled with code. Looking at S3’s approach, it is certainly simpler, but it results in three expensive consequences:

1) Expensive fixed costs. Amazon S3 has a single upload URL choke point that requires load balancers and extremely high bandwidth. Backblaze’s architecture does not require moving data around internally; this lets us use commodity 10 Gb/s connections that are affordable and will scale infinitely. Further, as discussed above, load balancing hardware is expensive. By removing it from our Upload system, we remove a significant cost center.

2) Expensive single URL availability issues. Amazon S3’s solution requires high availability of the single upload URL for massive amounts of data. The Contract concept from Backblaze works more reliably, but does add slight additional complexity when (rare) extra network round trips are needed.

3) Expensive, time consuming data copy needs (and “eventual consistency”). Amazon S3 requires the copying of massive amounts of data from one part of their network (the upload server) to wherever the data’s ultimate resting place will be. This is at the root of one of the biggest frustrations when dealing with S3: Amazon’s “eventual consistency.” It means that you can’t use your data until it has been pushed to all the places it needs to go. As the article notes, this is usually fast, but can be material amounts of time, anytime. The lack of predictability around access times is something anyone dealing with S3 is all too familiar with.

The B2 architecture offers what one could consider “strong consistency.” There are different definitions of that idea. Ours is that the client connects DIRECTLY with the correct final location for the data to land. Once our system has confirmed a write, the data has been written to enough places that we can guarantee that the data can be seen without delay.

Was our decision a good one? Customers will continue to vote on that, but it appears that the marginal complexity is more than offset by the fact that B2 is sustainable service offered at ¼ of S3’s price.

But Seriously, Why Aren’t You Just “S3 Compatible?”

The use of Object Storage requires some sort of interface. You can build it yourself by learning your vendor’s APIs or you can go through a third party integration. Regardless of what route you choose, somebody is becoming fluent in the vendor’s APIs. And beyond the difference in cost, there’s a reliability component.

This is a good time to clear up a common misconception. The S3 protocol is not a specification: it’s an API doc. Why does this matter? Because API docs leave many outcomes undocumented. For instance, when a developer uses S3’s list_files function, they cannot know what is going to happen just by reading the API docs. Compounding this issue is the sheer scope of the S3 API; it is huge and expanding. Systems that purport to be “S3 compatible” are unlikely to implement the full API, so they have to document whatever subset they implement. Once that is done, they have to work with integration partners and customers to communicate which subset they chose as “important.”

Ultimately, we have chosen to create robust documentation describing, among other things, the engineering specification (this input returns that output, here’s how B2 handles various error cases, etc).

With hundreds of integrations from third parties and hundreds of thousands of customers, it’s clear that our APIs have proven easy to implement. The reality is that the first time anyone implements cloud storage in an application, it can take weeks. The first move into the cloud can be particularly tough for legacy applications. But the marginal cloud implementation can be reliably completed in days, if not hours, when the documentation is clear and the APIs can be well understood. I’m pleased that we’ve been able to create a complete solution that is proving quite easy to use.

And it’s a big deal that B2 is free of the “load balancer problem.” It solves for a huge scaling issue. When we roll out new vaults in new data centers in new countries, the clients are contacting those vaults DIRECTLY (over whatever network path is shortest) and so there are fewer choke points in our architecture.

It all means that, over an infinite time horizon, our customers can rely on B2 as the most affordable, easiest to use cloud storage on the planet. And, at the end of the day, if we’re doing that, we’ve done the right thing.


[1] The Contract Architecture also explains how we got to a secure two step Authorization process. When you call the Dispatching Server, we run the authentication process and then give you a Vault for uploads and an Auth token. When you are establishing the contract with the Vault, the Auth token is required before any other operation can begin.

The post Design Thinking: B2 APIs (& The Hidden Costs of S3 Compatibility) appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

What’s New In B2: Application Keys + Java SDK

Post Syndicated from Yev original https://www.backblaze.com/blog/b2-application-keys/

B2 Application Keys

It’s been a few months since our last “What’s New In B2” blog post, so we wanted to highlight some goings on and also introduce a new B2 feature!

Reintroducing: Java SDK + Compute Partnerships

We wanted to highlight the official Backblaze B2 Java SDK which can be found in our GitHub repo. The official Java SDK came out almost a year ago, but we’ve been steadily updating it since then with help from the community.

We’ve also announced some Compute Partnerships, which give folks all the benefits of Backblaze B2’s low-cost cloud storage along with the computing capabilities of Packet and ServerCentral. Backblaze B2 is directly connected to these compute providers, which offers customers low latency and free data transfers with B2 Cloud Storage.

Application Keys

Application keys give developers more control over who can do what and for how long with their B2 data. We’ve had the B2 application key documentation out for a while, and we’re ready to take off the “coming soon” tag.

row of keys

What are Application Keys?

In B2, the main application key has root access to everything and essentially controls every single operation that can be done inside B2. With the introduction of additional application keys, developers now have more flexibility.

Application keys are scoped by three things: 1) what operations the key can perform, 2) which buckets and file paths inside B2 the key can access, and 3) how long the key remains valid. For example, you might use a “read-only” key that only has access to one B2 bucket. You’d use that read-only key in situations where you don’t actually need to write things to the bucket, only read or “display” them. Or, you might use a “write-only” key which can only write to a specific folder inside of a bucket. All of this leads to cleaner code with segmented operations, the keys essentially acting as firewalls should something go awry.
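
As a sketch of what this looks like against the B2 API, the snippet below creates a read-only key restricted to a single bucket via the documented b2_create_key call. The key name, capability list, and expiry here are illustrative choices, not requirements.

import requests

def create_read_only_key(api_url, auth_token, account_id, bucket_id):
    r = requests.post(
        f"{api_url}/b2api/v2/b2_create_key",
        headers={"Authorization": auth_token},
        json={
            "accountId": account_id,
            "keyName": "read-only-reports",
            "capabilities": ["listFiles", "readFiles"],  # list and read only
            "bucketId": bucket_id,                       # limit to one bucket
            "validDurationInSeconds": 7 * 24 * 3600,     # optional expiry
        },
    )
    r.raise_for_status()
    return r.json()  # contains the applicationKeyId and applicationKey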

Application keys dialog screenshot

Use Cases for Application Keys

One example of how you’d use an application key is for a standard backup operation. If you’re backing up an SQL database, you do not need to use your root-level key to do so. Simply creating a key that can only upload to a specified folder is good enough.

Another example is that of a developer building apps inside of a client. That developer would want to restrict access and limit privileges of each client to specific buckets and folders — usually based on the client that is doing the operation. Using more locked-down application keys limits the possibility that one rogue client can affect the entire system.

A final case could be a Managed Service Provider (MSP) who creates and uses a different application key for each client. That way, neither the client nor the MSP can accidentally access the files of another client. In addition, an MSP could have multiple application keys for a given client that define different levels of data access for given groups or individuals within the client’s organization.

We Hope You Like It

Are you one of the people that’s been waiting for application key support? We’d love to hear your use cases so sound off in the comments below with what you’re working on!

The post What’s New In B2: Application Keys + Java SDK appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Five Tips For Creating a Predictable Cloud Storage Budget

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/calculate-cost-cloud-storage/

Cloud Storage $$$, Transfer Rates $, Download Fees $$, Cute Piggy Bank $$$

Predicting your cloud storage cost should be easy. After all, there are only three cost dimensions: 1) storage (the rental for your slice of the cloud), 2) download (the fee to bring your data out of the cloud), and 3) transactions (charges for “stuff” you might do to your data inside the cloud). Yet, you probably know someone (you?) who was more than surprised when their cloud storage bill arrived. They have plenty of company: according to ZDNet, 37% of IT executives found their cloud storage costs to be unpredictable.

Here are five tips you can use when doing your due diligence on the cloud storage vendors you are considering. The goal is to create a cloud storage forecast that you can rely on each and every month.

Tip # 1 — Don’t Miscalculate Progressive (or is it Regressive?) Pricing Tiers

The words “Next” or “Over” on a pricing table are never a good thing.

Standard Storage Pricing Example

  • First 50 TB / Month $0.026 per GB
  • Next 450 TB / Month $0.025 per GB
  • Over 500 TB / Month $0.024 per GB

Those words mean there are tiers in the pricing table which, in this case, means you have to reach a specific level to get better pricing. You don’t get a retroactive discount — only the data above the minimum threshold enjoys the lower price.

The mistake sometimes made is calculating your entire storage cost based on the level for that amount of storage. For example, if you had 600 TB of storage, you could wrongly multiply as follows:

(600,000 x 0.024) = $14,400/month

When, in fact, you should do the following:

(50,000 x 0.026) + (450,000 x 0.025) + (100,000 x 0.024) = $14,950/month
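
If you’d rather not do the arithmetic by hand, a tiered calculation is a few lines of code. This sketch uses the example rates above; substitute your vendor’s actual tiers.

# Tiers: (size of tier in GB, price per GB per month); the last tier is unbounded
TIERS = [(50_000, 0.026), (450_000, 0.025), (float("inf"), 0.024)]

def monthly_storage_cost(stored_gb):
    cost, remaining = 0.0, stored_gb
    for tier_size, price in TIERS:
        in_tier = min(remaining, tier_size)
        cost += in_tier * price
        remaining -= in_tier
        if remaining <= 0:
            break
    return cost

print(monthly_storage_cost(600_000))  # 14950.0, not 600,000 x 0.024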

That was just for storage. Make sure you consider the tiered pricing tables for data retrieval as well.

Tip # 2 — Don’t Choose the Wrong Service Level

Many cloud storage providers offer multiple levels of service. The idea is that you can trade service capabilities for cost. If you don’t need immediate access to your files or don’t want data replication or eleven 9s of durability, there is a choice for you. Besides giving away functionality, there’s a bigger problem. You have to know what you are going to do with your data to pick the right service because mistakes can get very expensive. For example:

  • You choose a low cost service tier that normally takes hours or days to restore your data. What can go wrong? You need some files back immediately and you end up paying 10-20 times the cost to expedite your restore.
  • You choose one level of service and decide you want to upload some data to a compute-based application or to another region — features not part of your current service. The good news? You can usually move the data. The bad news? You are charged a transfer fee to move the data within the same vendor’s infrastructure because you didn’t choose the right service tier when you started. These fees often eradicate any “savings” you had gotten from the lower priced tier.

Basically, if your needs change as they pertain to the data you have stored, you will pay more than you expect to get all that straightened out.

Tip # 3 — Don’t Pay for Deleted Files

Some cloud storage companies have a minimum amount of time you are charged for storage for each file uploaded. Typically this minimum period is between 30 and 90 days. You are charged even if you delete the file before the minimum period. For example (assuming a 90 day minimum period), if you upload a file today and delete the file tomorrow, you still have to pay for storing that deleted file for the next 88 days.

This “feature” often extends to files deleted due to versioning. Let’s say you want to keep three versions of each file, with older versions automatically deleted. If the now-deleted versions were originally uploaded fewer than 90 days ago, you are charged for storing them for 90 days.

Using a typical backup scenario, let’s say you are using a cloud storage service to store your files and your backup program is set to a 30-day retention. That means you will be perpetually paying for an additional 60 days’ worth of storage (for files that were pruned at 30 days). In other words, you would be paying for a 90-day retention period even though you only keep 30 days’ worth of files.

Tip # 4 — Don’t Pay For Nothing

Some cloud storage vendors charge a minimum amount each month regardless of how little you have stored. For example, even if you only have 100 GB stored you get to pay like you have 1 TB (the minimum). This is the moral equivalent of a bank charging you a monthly fee if you don’t meet the minimum deposit amount.

Continuing on the theme of paying for nothing, be on the lookout for services that charge a minimum amount for each file stored, regardless of how small the file is, including zero bytes. For example, some storage services have a minimum file size of 128K. Any files smaller than that are counted as 128K for storage purposes. While the additional cost for even a couple of million zero-length files is trivial, you’re still being charged something for nothing.

Tip # 5 — Be Suspicious of the Fine Print

Misdirection is the art of getting you to focus on one thing so you don’t focus on other things going on. Practiced by magicians and some cloud storage companies, the idea is to get you to focus on certain features and capabilities without delving below the surface into the fine print.

Read the fine print, and as you stroll through the multi-page pricing tables and the linked pages of rules that shape how you can use a given cloud storage service, stop and ask, “what are they trying to hide?” If you find phrases like “we reserve the right to limit your egress traffic,” or “new users get a free usage tier for 12 months,” or “provisioned requests should be used when you need a guarantee that your retrieval capacity will be available when you need it,” take heed.

How to Build a Predictable Cloud Storage Budget

As we noted previously, cloud storage costs are composed of three dimensions: storage, download and transactions. These are the cost drivers for cloud storage providers, and as such are the most straightforward way for service providers to pass on the cost of the service to its customers.

Let’s start with data storage as it is the easiest for a company to calculate. For a given month data storage cost is equal to:

Current data + new data – deleted data

Take that total and multiply it by the monthly storage rate and you’ll get your monthly storage cost.

Computing download and transaction costs can be harder as these are variables you may have never calculated before, especially if you previously were using in-house or LTO-based storage. To help you out, below is a chart showing the breakdown of the revenue from Backblaze B2 Cloud Storage over the past 6 months.

% of Spend w/ B2

As you can see, download (2%) and transaction (3%) costs are, on average, minimal compared to storage costs. Unless you have reason to believe you are different, using these figures is a good proxy for your costs.

Let’s Give it a Try

Let’s start with 100 TB of original storage then add 10 TB each month and delete 5 TB each month. That’s 105 TB of storage for the first month. Backblaze has built a cloud storage calculator that computes costs for all of the major cloud storage providers. Using this calculator, we find that Amazon S3 would cost $2,205.50 to store this data for a month, while Backblaze B2 would charge just $525.10.

Using those numbers for storage, and assuming that storage will be 95% of your total bill (as noted in the chart above), you get a total monthly cost of $2,321.05 for Amazon S3, while Backblaze B2 will be $552.74 a month.

The chart below provides the breakdown of the expected cost.

Backblaze B2 Amazon S3
Storage $525.10 $2,205.50
Download $11.06 $46.22
Transactions $16.58 $69.33
Totals: $552.74 $2,321.05

Of course each month you will add and delete storage, so you’ll have to account for that in your forecast. Using the cloud storage calculator noted above, you can get a good sense of your total cost over the budget forecasting period.
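
A simple script can roll that forecast forward. The sketch below reuses the example above (100 TB to start, adding 10 TB and deleting 5 TB each month), B2’s $0.005/GB/month storage rate, and the 95% storage-share proxy from the chart; replace all three inputs with your own numbers.

RATE_PER_GB = 0.005      # B2 storage price per GB per month
current_gb = 100_000     # starting point from the example above

for month in range(1, 13):
    current_gb += 10_000 - 5_000                  # current + new - deleted
    total = (current_gb * RATE_PER_GB) / 0.95     # storage as ~95% of spend
    print(f"Month {month:2d}: {current_gb // 1000} TB stored -> ${total:,.2f}")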

Finally, you can use the Backblaze B2 storage calculator to address potential use cases that are outside of your normal operation. For example, you delete a large project from your storage or you need to download a large amount of data. Running the calculator for these types of actions lets you obtain a solid estimate for their effect on your budget before they happen and lets you plan accordingly.

Creating a predictable cloud storage forecast is key to taking full advantage of all of the value in cloud storage. Organizations like Austin City Limits, Fellowship Church, and Panna Cooking were able to move to the cloud because they could reliably predict their cloud storage cost with Backblaze B2. You don’t have to let pricing tiers, hidden costs and fine print stop you. Backblaze makes predicting your cloud storage costs easy.

The post Five Tips For Creating a Predictable Cloud Storage Budget appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Hard Drive Stats for Q2 2018

Post Syndicated from Andy Klein original https://www.backblaze.com/blog/hard-drive-stats-for-q2-2018/

Backblaze Drive Stats Q2 2018

As of June 30, 2018 we had 100,254 spinning hard drives in Backblaze’s data centers. Of that number, there were 1,989 boot drives and 98,265 data drives. This review looks at the quarterly and lifetime statistics for the data drive models in operation in our data centers. We’ll also take another look at comparing enterprise and consumer drives, get a first look at our 14 TB Toshiba drives, and introduce you to two new SMART stats. Along the way, we’ll share observations and insights on the data presented and we look forward to you doing the same in the comments.

Hard Drive Reliability Statistics for Q2 2018

Of the 98,265 hard drives we were monitoring at the end of Q2 2018, we removed from consideration those drives used for testing purposes and those drive models for which we did not have at least 45 drives. This leaves us with 98,184 hard drives. The table below covers just Q2 2018.

Backblaze Q2 2018 Hard Drive Failure Rates

Notes and Observations

If a drive model has a failure rate of 0%, it just means that there were no drive failures of that model during Q2 2018.

The Annualized Failure Rate (AFR) for Q2 is just 1.08%, well below the Q1 2018 AFR and our lowest quarterly AFR yet. That said, quarterly failure rates can be volatile, especially for models that have a small number of drives and/or a small number of Drive Days.

There were 81 drives (98,265 minus 98,184) that were not included in the list above because we did not have at least 45 of a given drive model. We use 45 drives of the same model as the minimum number when we report quarterly, yearly, and lifetime drive statistics. The use of 45 drives is historical in nature as that was the number of drives in our original Storage Pods.

Hard Drive Migrations Continue

The Q2 2018 Quarterly chart above was based on 98,184 hard drives. That was only 138 more hard drives than Q1 2018, which was based on 98,046 drives. Yet, we added nearly 40 PB of cloud storage during Q2. If we tried to store 40 PB on the 138 additional drives we added in Q2, then each new hard drive would have to store nearly 300 TB of data. While 300 TB hard drives would be awesome, the less awesome reality is that we replaced over 4,600 4 TB drives with nearly 4,800 12 TB drives.

The age of the 4 TB drives being replaced was between 3.5 and 4 years. In all cases their failure rates were 3% AFR (Annualized Failure Rate) or less, so why remove them? Simple: drive density — in this case, three times the storage in the same cabinet space. Today, four years of service is about the time when it makes financial sense to replace existing drives versus building out a new facility with new racks, etc. While there are several factors that go into the decision to migrate to higher density drives, keeping hard drives beyond that tipping point means we would be underutilizing valuable data center real estate.

Toshiba 14 TB drives and SMART Stats 23 and 24

In Q2 we added twenty 14 TB Toshiba hard drives (model: MG07ACA14TA) to our mix (not enough to be listed on our charts), but that will change as we have ordered an additional 1,200 drives to be deployed in Q3. These are 9-platter, helium-filled drives which use CMR/PMR (not SMR) recording technology.

In addition to being new drives for us, the Toshiba 14 TB drives also add two new SMART stat pairs: SMART 23 (Helium condition lower) and SMART 24 (Helium condition upper). Both attributes report normal and raw values, with the raw values currently being 0 and the normalized values being 100. As we learn more about these values, we’ll let you know. In the meantime, those of you who utilize our hard drive test data will need to update your data schema and upload scripts to read in the new attributes.
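
As a sketch of the schema update involved, the snippet below (Python with pandas) reads one daily stats file and pulls the new attribute columns for the Toshiba drives. It assumes the dataset’s usual smart_<id>_normalized / smart_<id>_raw column naming; adjust the file name and model string to your download.

import pandas as pd

df = pd.read_csv("2018-07-01.csv")

# New Helium attributes on the Toshiba 14 TB drives; absent on other models
helium_cols = ["smart_23_normalized", "smart_23_raw",
               "smart_24_normalized", "smart_24_raw"]
toshiba = df[df["model"] == "TOSHIBA MG07ACA14TA"]
present = [c for c in helium_cols if c in df.columns]
print(toshiba[["serial_number"] + present].head())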

By the way, none of the 20 Toshiba 14 TB drives have failed after 3 weeks in service, but it is way too early to draw any conclusions.

Lifetime Hard Drive Reliability Statistics

While the quarterly chart presented earlier gets a lot of interest, the real test of any drive model is over time. Below is the lifetime failure rate chart for all the hard drive models in operation as of June 30th, 2018. For each model, we compute its reliability starting from when it was first installed.

Backblaze Lifetime Hard Drive Failure Rates

Notes and Observations

The combined AFR for all of the larger drives (8 TB, 10 TB, and 12 TB) is only 1.02%. Many of these drives were deployed in the last year, so there is some volatility in the data, but we would expect this overall rate to decrease slightly over the next couple of years.

The overall failure rate for all hard drives in service is 1.80%. This is the lowest we have ever achieved, besting the previous low of 1.84% from Q1 2018.

Enterprise versus Consumer Hard Drives

In our Q3 2017 hard drive stats review, we compared two Seagate 8 TB hard drive models: one a consumer class drive (model: ST8000DM002) and the other an enterprise class drive (model: ST8000NM0055). Let’s compare the lifetime annualized failure rates from Q3 2017 and Q2 2018:

Lifetime AFR as of Q3 2017

    – 8 TB consumer drives: 1.1% annualized failure rate
    – 8 TB enterprise drives: 1.2% annualized failure rate

Lifetime AFR as of Q2 2018

    – 8 TB consumer drives: 1.03% annualized failure rate
    – 8 TB enterprise drives: 0.97% annualized failure rate

Hmmm, it looks like the enterprise drives are “winning.” But before we declare victory, let’s dig into a few details.

  1. Let’s start with drive days, the total number of days all the hard drives of a given model have been operational.
    – 8 TB consumer (model: ST8000DM002): 6,395,117 drive days
    – 8 TB enterprise (model: ST8000NM0055): 5,279,564 drive days
    Both models have a sufficient number of drive days and are reasonably close in their total number. No change to our conclusion so far.
  2. Next we’ll look at the confidence intervals for each model to see the range of possibilities within two deviations:
    – 8 TB consumer (model: ST8000DM002): range 0.9% to 1.2%
    – 8 TB enterprise (model: ST8000NM0055): range 0.8% to 1.1%
    The ranges are close, but multiple outcomes are possible. For example, the consumer drive could be as low as 0.9% and the enterprise drive could be as high as 1.1%. This doesn’t help or hurt our conclusion.
  3. Finally we’ll look at drive age — actually, average drive age to be precise. This is the average time in operational service, in months, of all the drives of a given model. We will start with the point in time when each drive model reached approximately the current number of drives. That way the addition of new drives (not replacements) will have a minimal effect.
    Annualized Hard Drive Failure Rates by Time
    When you constrain for drive count and average age, the AFR (annualized failure rate) of the enterprise drive is consistently below that of the consumer drive for these two drive models — albeit not by much.

Whether every enterprise model is better than every corresponding consumer model is unknown, but below are a few reasons you might choose one class of drive over another:

Enterprise
  • Longer warranty: 5 vs. 2 years
  • More features, e.g., PowerChoice technology
  • Faster reads and writes

Consumer
  • Lower price: up to 50% less
  • Similar annualized failure rate to enterprise drives
  • Uses less power

Backblaze is known to be “thrifty” when purchasing drives. When you purchase 100 drives at a time or are faced with a drive crisis, it makes sense to purchase consumer drives. When you start purchasing 100 petabytes’ worth of hard drives at a time, the price gap between enterprise and consumer drives shrinks to the point where the other factors come into play.

Hard Drives By the Numbers

Since April 2013, Backblaze has recorded and saved daily hard drive statistics from the drives in our data centers. Each entry consists of the date, manufacturer, model, serial number, status (operational or failed), and all of the SMART attributes reported by that drive. Currently there are over 100 million entries. The complete data set used to create the information presented in this review is available on our Hard Drive Test Data page. You can download and use this data for free for your own purpose. All we ask are three things: 1) you cite Backblaze as the source if you use the data, 2) you accept that you are solely responsible for how you use the data, and 3) you do not sell this data to anyone. It is free.
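
As a quick sketch of what you can do with the data, the snippet below computes a per-model annualized failure rate from a folder of daily files, using AFR = failures / (drive days / 365). Each row in the data is one drive-day. The folder name is a placeholder, and the “at least 45 drives” cutoff is only approximated.

import glob

import pandas as pd

frames = [pd.read_csv(f, usecols=["model", "failure"])
          for f in glob.glob("drive_stats_q2_2018/*.csv")]
rows = pd.concat(frames)

per_model = rows.groupby("model")["failure"].agg(drive_days="count", failures="sum")
per_model = per_model[per_model["drive_days"] >= 45 * 91]   # roughly 45 drives for a quarter
per_model["afr_pct"] = per_model["failures"] / (per_model["drive_days"] / 365) * 100
print(per_model.sort_values("afr_pct").head(10))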

If you just want the summarized data used to create the tables and charts in this blog post, you can download the ZIP file containing the MS Excel spreadsheet.

Good luck and let us know if you find anything interesting in the comments below or by contacting us directly.

The post Hard Drive Stats for Q2 2018 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze Durability is 99.999999999% — And Why It Doesn’t Matter

Post Syndicated from Brian Wilson original https://www.backblaze.com/blog/cloud-storage-durability/

Dials that go to 11

One of the most often talked about, but least understood, metrics in our industry is the concept of “data durability.” It is often talked about in that nearly everyone quotes some number of nines, and it is least understood in that no one tells you how they actually computed the number or what they actually mean by it.

It strikes us as odd that so much of the world depends on RAID and erasure encodings, yet the calculations behind them are not standardized or agreed upon. Different web calculators allow you to input some variables, but not the correct or most important ones. In almost all cases, they obscure the math behind how they spit out their final numbers. There are a few research papers, but hardly a consensus. There just doesn’t seem to be an agreed-upon standard calculation of how many “9s” are in the final result. We’d like to change that.

In the same spirit of transparency that leads us to publish our hard drive performance stats, open source our Reed-Solomon Erasure Code, and generally try to share as much of our underlying architecture as practical, we’d like to share our calculations for the durability of data stored with us.

We are doing this for two reasons:

  1. We believe that sharing, where practical, furthers innovation in the community.
  2. Transparency breeds trust. We’re in the business of asking customers to trust us with their data. It seems reasonable to demonstrate why we’re worthy of your trust.

11 Nines Data Durability for Backblaze B2 Cloud Storage

At the end of the day, the technical answer is “11 nines.” That’s 99.999999999%. Conceptually, if you store 1 million objects in B2 for 10 million years, you would expect to lose 1 file. There’s a higher likelihood of an asteroid destroying Earth within a million years, but that is something we’ll get to at the end of the post.

How to Calculate Data Durability

Amazon’s CTO put forth the X million objects over Y million years metaphor in a blog post. That’s a good way to think about it — customers want to know that their data is safe and secure.

When you send us a file or object, it is actually broken up into 20 pieces (“shards”). The shards overlap so that the original file can be reconstructed from any 17 of the original 20 pieces. We then store those pieces on different drives that sit in different physical places (we call those 20 drives a “tome”) to minimize the possibility of data loss. When one drive fails, we have processes in place to “rebuild” the data for that drive. So, to lose a file, four drives would have to fail before we had a chance to rebuild the first one.
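
To make the “any 17 of 20” idea concrete, here's a toy sketch of the underlying principle using polynomial interpolation over a prime field. It is an illustration only; our production Reed-Solomon code (open-sourced, as mentioned above) works over GF(2^8) and is far more efficient:

```python
import random

# Toy illustration of the "any 17 of 20" property. Not our production
# algorithm: real Reed-Solomon implementations work over GF(2^8).
P = 2**61 - 1            # a Mersenne prime; all arithmetic is modulo P
DATA, TOTAL = 17, 20     # shards needed to rebuild / shards stored

def interpolate(points, t):
    """Evaluate at t the unique degree < len(points) polynomial through points."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num = den = 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = num * (t - xm) % P
                den = den * (xj - xm) % P
        total = (total + yj * num * pow(den, P - 2, P)) % P  # Fermat inverse
    return total

# "Encode": the 17 data values define a polynomial (its values at x = 0..16);
# the 20 shards are that same polynomial's values at x = 100..119.
data = [ord(c) for c in "BACKBLAZE B2 TOME"]        # exactly 17 values
points = list(enumerate(data))
shards = [(x, interpolate(points, x)) for x in range(100, 100 + TOTAL)]

# "Rebuild": any 17 surviving shards reconstruct the data exactly.
survivors = random.sample(shards, DATA)             # three drives "failed"
recovered = [interpolate(survivors, x) for x in range(DATA)]
assert recovered == data
print("".join(map(chr, recovered)))                 # -> BACKBLAZE B2 TOME
```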

The math on calculating all this is extremely complex. Making it even more interesting, we debate internally whether the proper calculation methodology is the Poisson distribution (the probability of a number of events occurring over a continuous interval) or the binomial distribution (the probability of events across discrete trials). We spent a shocking amount of time debating this and believe that both arguments have merit. Rather than posit one absolute truth, we decided to publish the results of both calculations (spoiler alert: either methodology tells you that your files are safe with Backblaze).

The math is difficult to follow unless you have some facility with advanced statistics. We’ll forgive you if you want to skip the sections entirely; just click here.

Poisson Distribution

When dealing with the probability of X number of events occurring in a fixed period of time, a good place to start is the Poisson distribution.[1]

For inputs, we use the following assumptions:[2]

The average rebuild time to achieve complete parity for any given B2 object with a failed drive is 6.5 days.
A given file uploaded to Backblaze is split into 20 “shards” or pieces. The shards are distributed across multiple drives in such a way that any single drive can fail and the file remains fully recoverable — a file is not lost unless four drives fail in a given vault before they can be “rebuilt.” This rebuild is enabled through our Reed-Solomon Erasure Code. Once one drive fails, the other shards are used to “rebuild” the data on the original drive (creating, for all practical purposes, an exact clone of the original drive).

The rule of thumb we use is that for every 1 TB that needs to be rebuilt, one should allow one day. So a 12 TB drive would, on average, be rebuilt after 12 days. In practice, that number may vary based on a variety of factors, including, but not limited to, our team attempting to clone the failed drive before starting the rebuild process. Depending on whatever else is happening at a given time, a single failed drive may also not be addressed for a day. (Remember, a single drive failure has a dramatically different implication than a hypothetical third drive failure within a given vault — different situations call for different operational protocols.) For the purposes of this calculation, and a desire to provide simplicity where possible, we assumed an average lag of one day before we start the rebuild.

The annualized failure rate of a drive is 0.81%.
For the 60 days trailing the writing of this post, our average drive failure rate was 0.81%. Longtime readers of our blog will also note that hard drive failure rates in our environment have fluctuated over time. But we also factor in the availability of data recovery services including, but not limited to, those offered by our friends at DriveSavers. We estimate a 50% likelihood of full (100%) data recovery from a failed drive that’s sent to DriveSavers. That cuts the effective failure rate roughly in half, to 0.41%.

For our Poisson calculation, we use this formula:

Poisson Calculation

The values for the variables are:

  • Annual failure rate = 0.0041 per drive per year
  • Interval or “period” = 156 hours (6.5 days)
  • Lambda = ((0.0041 * 20) / ((365 * 24) / 156)) = 0.00146027397 for every interval or period
  • e = 2.7182818284
  • k = 4 (we want to know the probability of 4 “events” during this 156-hour interval)

Here’s what it looks like:

Poisson Calculation

Poisson calculation enumerated

If you’re following along at home, type this into an infinite precision calculator:[3]

(2.7182818284^(-0.00146027397)) * (((0.00146027397)^4)/(4*3*2*1))

The result for 4 simultaneous drive failures in 156 hours is 1.89187284e-13. That means the probability of it NOT happening in 156 hours is (1 – 1.89187284e-13), which equals 0.999999999999810812715 (12 nines).

But there’s a “gotcha.” You actually should calculate the probability of it not happening in any of the 56 “156-hour intervals” in a given year. That calculation is:

= (1 – 1.89187284e-13)^56
= (0.999999999999810812715)^56
= 0.99999999999 (11 “nines”)
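
If you want to check the arithmetic without an infinite precision calculator, here's the same computation as a short Python sketch (a sketch of the published calculation, not the exact code we run internally; double precision is adequate at this scale):

```python
import math

# The Poisson durability arithmetic from above, using this post's numbers.
afr = 0.0041                                  # effective annual failure rate per drive
drives = 20                                   # drives in a tome
period_hours = 156                            # the 6.5-day rebuild window
periods_per_year = (365 * 24) / period_hours  # about 56.15; the post uses 56

lam = (afr * drives) / periods_per_year       # expected failures per window
k = 4                                         # failures needed to lose a file

p4 = math.exp(-lam) * lam**k / math.factorial(k)
print(f"P(4 failures in one window) = {p4:.8e}")     # ~1.89e-13
print(f"Annual durability = {(1 - p4) ** 56:.15f}")  # 11 nines
```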

Yes, while this post claims that Backblaze achieves 11 nines worth of durability, at least one of our internal calculations comes out to 12 nines. Why go with 11 and not 12?

  1. There are different methodologies to calculate the number, so we are publishing the most conservative result.
  2. It doesn’t matter (skip to the end of this post for more on that).

Binomial Distribution

For those interested in getting into the full detail of this calculation, we made a public repository on GitHub. It’s our view on how to calculate the durability of data stored with erasure coding, assuming a failure rate for each shard, and independent failures for each shard.

First, some naming. We will use these names in the calculations:

  • S is the total number of shards (data plus parity)
  • R is the repair time for a shard in days: how long it takes to replace a shard after it fails
  • A is the annual failure rate of one shard
  • F is the failure rate of a shard in R days
  • P is the probability of a shard failing at least once in R days
  • D is the durability of data over R days: not too many shards are lost

With erasure coding, your data remains intact as long as you don’t lose more shards than there are parity shards. If you do lose more, there is no way to recover the data.

One of the assumptions we make is that it takes R days to repair a failed shard. Let’s start with a simpler problem and look at the data durability over a period of R days. For a data loss to happen in this time period, more shards than there are parity shards would have to fail.

We will use A to denote the annual failure rate of individual shards. Over one year, the chance that a shard will fail is evenly distributed over all of the R-day periods in the year. We will use F to denote the failure rate of one shard in an R-day period:

F = A * (R / 365)

The probability of failure of a single shard in R days is approximately F, when F is small. The exact value, from the Poisson distribution, is:

P = 1 - e^(-F)

Given the probability of one shard failing, we can use the binomial distribution’s probability mass function to calculate the probability of exactly n of the S shards failing:

P(exactly n of S fail) = C(S, n) * P^n * (1 - P)^(S - n)

We also lose data if more than n shards fail in the period. To include those outcomes, we can sum the above formula for n through S shards to get the probability of data loss in R days:

P(loss in R days) = sum for i = n to S of C(S, i) * P^i * (1 - P)^(S - i)

The durability in each period is the complement of that:

D = 1 - P(loss in R days)

Durability over the full year requires durability in every period, which is the product of the per-period durabilities:

D(year) = D^(365 / R)

And that’s the answer!

 

For the full calculation and explanation, including our Python code, please visit the GitHub repo:

https://github.com/Backblaze/erasure-coding-durability/blob/master/calculation.ipynb
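
For the impatient, here's a condensed sketch of that calculation in Python, using the numbers from this post. The notebook linked above is the authoritative version; in this sketch we take n, the loss threshold, to be one more than the number of parity shards:

```python
from math import comb, exp

# Condensed sketch of the binomial durability calculation. With 3 parity
# shards, losing 4 or more of the 20 shards loses the data.
S, parity = 20, 3    # total shards; shards you can lose and still recover
A = 0.0041           # annual failure rate of one shard
R = 6.5              # repair time for a shard, in days

F = A * R / 365                 # failure rate of a shard in R days
Pfail = 1 - exp(-F)             # P(shard fails at least once in R days)

n = parity + 1                  # more failures than parity shards = data loss
p_loss = sum(comb(S, i) * Pfail**i * (1 - Pfail)**(S - i)
             for i in range(n, S + 1))

D_period = 1 - p_loss           # durability over one R-day period
D_year = D_period ** (365 / R)  # product over all periods in a year
print(f"{D_year:.15f}")         # 11+ nines
```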

We’d Like to Assure You It Doesn’t Matter

For anyone in the data business, durability and reliability are very serious issues. Customers want to store their data and know it’s there to be accessed when it’s needed. Any relevant system in our industry must be designed with a number of protocols in place to ensure the safety of our customers’ data.

But at some point, we all start sounding like the guitar player for Spinal Tap. Yes, our nines go to 11. Where is that point? That’s open for debate. But somewhere around the 8th nine we start moving from practical to purely academic.[4] Why? Because at these probability levels, it’s far more likely that:

  • An armed conflict takes out data center(s).
  • Earthquakes / floods / pests / or other events known as “Acts of God” destroy multiple data centers.
  • There’s a prolonged billing problem and your account data is deleted.

That last one is particularly interesting. Any vendor selling cloud storage relies on billing its customers. If a customer stops paying, after some grace period, the vendor will delete the data to free up space for a paying customer.

Some customers pay by credit card. We don’t have the math behind it, but we believe there’s a greater than 1 in a million chance that the following events could occur:

  • You change your credit card provider. The credit card on file is invalid when the vendor tries to bill it.
  • Your email service provider thinks billing emails are SPAM. You don’t see the emails coming from your vendor saying there is a problem.
  • You do not answer phone calls from numbers you do not recognize; Customer Support is trying to call you from a blocked number; they are trying to leave voicemails but the mailbox is full.

If all those things are true, it’s possible that your data gets deleted simply because the system is operating as designed.

What’s the Point? All Hard Drives Will Fail. Design for Failure.

Durability should NOT be taken lightly. Backblaze, like all the other serious cloud providers, dedicates valuable time and resources to continuously improving durability. As shown above, we have 11 nines of durability. More importantly, we continually invest in our systems, processes, and people to make improvements.

Any vendor that takes the obligation to protect customer data seriously is deep into “designing for failure.” That requires building fault tolerant systems and processes that help mitigate the impact of failure scenarios. All hard drives will fail. That is a fact. So the question really is “how have you designed your system so it mitigates failures of any given piece?”

Backblaze’s architecture uses erasure coding to reliably store any given file in multiple physical locations (mitigating specific types of failures, like a faulty power strip). Backblaze’s business model is profitable and self-sustaining, and provides us with the resources and wherewithal to make the right decisions. We also make the decision to do things like publish our hard drive failure rates, our cost structure, and this post. And we have a number of ridiculously intelligent, hardworking people dedicated to improving our systems. Why? Because the obligation to protect your data goes far beyond the academic calculation of “durability” as defined by hard drive failure rates.

Eleven years in and counting, with over 600 petabytes of data stored from customers across 160 countries, and well over 30 billion files restored, we confidently state that our system has scaled successfully and is reliable. The numbers bear it out and the experiences of our customers prove it.

And that’s the bottom line for data durability.


[1] One aspect of the Poisson distribution is that it assumes the probability of failure is constant over time. Hard drives, in Backblaze’s environment, exhibit a “bathtub curve” for failures (a higher likelihood of failure when they are first turned on and at the forecasted end of usable life). While we ran various internal models to account for that, it didn’t have a practical effect on the calculation. In addition, there’s some debate to be had about what the appropriate model is — at Backblaze, hard drives are thoroughly tested before being put into our production system (affecting the theoretical extreme front end of the bathtub curve). Given all that, for the sake of a semblance of simplicity, we present a straightforward Poisson calculation.

[2] This is an area where we should emphasize the conceptual nature of this exercise. System design and reality can diverge.

[3] The complexity will break most standard calculators.

[4] Previously, Backblaze published its durability to be 8 nines. At the time, it reflected what we knew about drive failure rates and recovery times. Today, the failure rates are favorable. In addition, we’ve worked on and continue to innovate solutions around speeding up drive replacement time.

The post Backblaze Durability is 99.999999999% — And Why It Doesn’t Matter appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

B2 Quickstart Guide

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/b2-quickstart-guide/

B2 Quickstart Guide
If you’re ready to get started with B2 Cloud Storage, these tips and resources will quickly get you up and running with B2.

What can you do with B2, our low-cost, fast, and easy object cloud storage service?

  • Creative professionals can archive their valuable photos and videos
  • Back up, archive, or sync servers and NAS devices
  • Replace tape backup systems (LTO)
  • Host and serve text, photos, videos, and other web content
  • Build apps that demand affordable cloud storage

B2 cloud storage logo

If you haven’t created an account yet, you can do that here.

Just For Newbies

Are you new to B2? Here’s just what you need to get started.

Saving photos to the cloud

Developer or Old Hat at the Cloud?

If you’re a developer or more experienced with cloud storage, here are some resources just for you.

diagram of how to save files to the cloud
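
For example, here's a minimal sketch of the native B2 API flow (authorize, pick a bucket, upload one file) using Python and the third-party requests library. Credentials and file names are placeholders; see the B2 documentation for the authoritative details:

```python
import hashlib
import requests  # third-party; pip install requests

ACCOUNT_ID = "your-account-id"           # placeholders: use your own credentials
APPLICATION_KEY = "your-application-key"

# 1. Authorize: HTTP basic auth returns an API base URL and an auth token.
auth = requests.get(
    "https://api.backblazeb2.com/b2api/v1/b2_authorize_account",
    auth=(ACCOUNT_ID, APPLICATION_KEY),
).json()
api_url, token = auth["apiUrl"], auth["authorizationToken"]

# 2. List buckets and pick one to upload into.
buckets = requests.post(
    api_url + "/b2api/v1/b2_list_buckets",
    headers={"Authorization": token},
    json={"accountId": auth["accountId"]},
).json()["buckets"]
bucket_id = buckets[0]["bucketId"]

# 3. Get an upload URL, then POST the file bytes with a SHA1 checksum.
upload = requests.post(
    api_url + "/b2api/v1/b2_get_upload_url",
    headers={"Authorization": token},
    json={"bucketId": bucket_id},
).json()
data = open("photo.jpg", "rb").read()    # placeholder file name
requests.post(
    upload["uploadUrl"],
    headers={
        "Authorization": upload["authorizationToken"],
        "X-Bz-File-Name": "photo.jpg",
        "Content-Type": "b2/x-auto",
        "X-Bz-Content-Sha1": hashlib.sha1(data).hexdigest(),
    },
    data=data,
)
```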

Have a NAS You’d Like to Link to the Cloud?

Would you like to get more out of your Synology, QNAP, or Morro Data NAS? Take a look at these posts, which show you how to easily extend your local data storage to the cloud.

diagram of NAS to cloud backup

Looking for an Integration to Super Charge the Cloud?

We’ve blogged about the powerful integrations that work with B2 to provide solutions for a wide range of backup, archiving, media management, and computing needs. The links below are just a start. You can visit our Integrations page or search our blog for the integration that interests you.

diagram of cloud computing integrations

We hope that gets you started. There’s plenty more about B2 on our blog in the “Cloud Storage” category and in our Knowledge Base.

Didn’t find what you need to get started with B2? Ask away in the comments.

The post B2 Quickstart Guide appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Five Cool Multi-Cloud Backup Solutions You Should Consider

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/multi-cloud-backup-solutions/

Multi-Cloud backup

We came across Kevin Soltow’s blog, VMWare Blog, and invited him to contribute a post on a topic that’s getting a lot of attention these days: using multiple public clouds for storing data to increase data redundancy and also to save money. We hope you enjoy the post.
Kevin Soltow
Kevin lives in Zumikon, Switzerland, where he works as a storage consultant specializing in VMware technologies and storage virtualization. He graduated from the Swiss Federal Institute of Technology in Zurich, and now works on implementing disaster recovery and virtual SAN solutions.

Nowadays, it’s hard to find a company without a backup or DR strategy in place. An organization’s data has become one of its most important assets, so making sure it remains safe and available is a key priority. But does it really matter where your backups are stored? If you follow the “3-2-1” backup rule, you know the answer: you should have at least three copies of your data, two of which are local but on different media, and at least one copy offsite. That all sounds reasonable.

What about the devices to store your backup data?

Tapes — Tapes came first; they offer large capacity and can keep data for a long time, but unfortunately they are slow. Historically, they have been less expensive than disks.

Disks — Disks have great capacity, are more durable and faster than tapes, and are improving rapidly in capacity, speed, and cost-per-unit-stored.

In a previous post, Looking for the most affordable cloud storage? AWS vs Azure vs Backblaze B2, I looked at the cost of public cloud storage. With reasonably-priced cloud services available, cloud can be that perfect offsite option for keeping your data safe.

No cloud storage provider can guarantee 100% accessibility and security. Sure, they get close, with claims of 99-point-some-number-of-nines durability, but an unexpected power outage or disaster could knock out their services for minutes, hours, or longer. This happened last year to Amazon S3 when the service suffered a disruption. This year, S3 was down due to a power outage. Fortunately, Amazon did not lose its customers’ data. The key words in the 3-2-1 backup rule are at least one copy offsite. More is always better.

Keeping data in multiple clouds provides a clear advantage for reliability, and it can provide cost savings as well. Using multiple cloud providers can simultaneously provide geographically dispersed backups while taking advantage of the lower storage costs available from competitively priced cloud providers.

In this post, we take a closer look at solutions that support multiple public clouds and allow you to keep several backup copies in different and dispersed clouds.

The Backup Process

The backup process is illustrated in the figure below:

diagram of the backup process from local to cloud

Some solutions create backups and move them to the repository. Data is kept there for a while and then shifted to the cloud where it stays as long as needed.

In this post, I discuss dedicated software serving as a “data router,” in other words, software that moves data from a local repository to one or more public clouds.

software to send backups to cloud diagram
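
Conceptually, the “data router” is a simple fan-out loop. Here's a deliberately minimal Python sketch of the idea, with the destinations reduced to local directories so the example is self-contained; real tools speak the cloud APIs and add retries, encryption, scheduling, and so on. All paths are placeholders:

```python
import shutil
from pathlib import Path

# Minimal fan-out sketch: every backup that lands in a local repository
# is copied to several destinations. All paths are placeholders.
REPO = Path("/backups/repository")
DESTINATIONS = [Path("/mnt/cloud-a"), Path("/mnt/cloud-b")]

for backup in REPO.glob("*.bak"):
    for dest in DESTINATIONS:
        target = dest / backup.name
        if not target.exists():           # incremental: skip files already copied
            shutil.copy2(backup, target)  # copy2 preserves timestamps
```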

Let’s have a look at the options we have to achieve this.

1 — Rclone

When I considered solutions that let you back up your data to several clouds, Rclone and CloudBerry were the first that popped into my head. Rclone acts as a data mover, synchronizing your local repository with cloud-based object storage. You basically create a backup using something else (e.g., Veeam Backup & Replication), store it on-premises, and the solution sends it to several clouds. First developed for Linux, Rclone provides a command-line interface to sync files and directories between clouds.

OS Compatibility

The solution runs on all major operating systems via its command-line interface.

Cloud Support

The solution works with most popular public cloud storage platforms, such as Backblaze B2, Microsoft Azure, Amazon S3 and Glacier, Google Cloud Platform, etc.

Feature set

Rclone commands work wonderfully with any remote storage system, be it public cloud storage or just your backup server located somewhere else. It can also send data to multiple places simultaneously, though bi-directional sync does not work yet. In other words, changes you make to your files in the cloud do not affect their local copies. The synchronization process is incremental on a file-by-file basis. It should also be noted that Rclone preserves timestamps on files, which helps when searching for the right backup.

This solution provides two options for moving data to the cloud: sync and copy. The first mode, sync, moves backups to the cloud automatically as soon as they appear in the specified local directory. The second mode, copy, as its name suggests, only copies data from on-premises to the cloud; deleting your files locally won’t affect the ones stored in the cloud. There’s also an option to verify hash equality.

Learn more about Rclone: https://rclone.org/

2 — CloudBerry Backup

CloudBerry Backup is built on the backup technology the company developed for service providers and enterprise IT departments. Note that it’s full-fledged backup software, allowing you not only to move backups to the cloud but also to create them.

OS compatibility

CloudBerry is a cross-platform solution.

Cloud Support

So far, the solution can talk to Backblaze B2, Microsoft Azure, Amazon S3 and Glacier, Google Cloud Platform, and more.

Feature set

Designed for large IT departments and managed service providers, CloudBerry Backup provides some features that make the solution really handy for the big guys. It offers client customization up to and including complete rebranding of the solution.

Let’s look at the backup side of this thing. The solution allows backing up files and directories manually, or, if you prefer, syncing a selected directory to the root of the bucket. CloudBerry Backup can also schedule backups. Now, you won’t miss them! Another cool feature is backup job management and monitoring. Thanks to this feature, you are always aware of the backup processes on client machines.

The solution offers AES 256-bit end-to-end encryption to ensure your data safety.

Learn more about CloudBerry Backup: https://www.cloudberrylab.com/managed-backup.aspx

Read Backblaze’s blog post, How To Back Up Your NAS with B2 and CloudBerry.

3 — StarWind VTL

Some organizations still use a Virtual Tape Library (VTL) but want to sync their tape objects to the cloud as well.

OS compatibility

This product is available only for Windows.

Cloud Support

So far, StarWind VTL can talk to popular cloud storage platforms like Backblaze B2, AWS S3 and Glacier, and Microsoft Azure.

Feature set

The product has many features for anyone who wants to back up to the cloud. First, it can send data to the appropriate cloud tier, with subsequent automatic de-staging. This automation makes StarWind VTL really cool. Second, the product supports both on-premises and public cloud object storage. Third, StarWind VTL supports deduplication and compression, making storage utilization more efficient. The solution also allows client-side encryption.

StarWind supports standard LTO (Linear Tape-Open) protocols. This appeals to organizations that have LTO in place, since it allows them to adopt more scalable, cost-efficient cloud storage without having to update their internal backup infrastructure.

All operations in the StarWind VTL environment are done via the Management Console and the Web-Console, a web interface that makes VTL manageable from any browser.

Learn more about StarWind Virtual Tape Library: https://www.starwindsoftware.com/starwind-virtual-tape-library

Also, see Backblaze’s post on StarWind VTL: Connect Veeam to the B2 Cloud: Episode 2 — Using StarWind VTL

4 — Duplicati

Duplicati was designed from scratch for online backups. It can send your data directly to multiple clouds or use local storage as a backend.

OS compatibility

It is free and compatible with Windows, macOS, and Linux.

Cloud Support

So far, the solution talks to Backblaze B2, Amazon S3, Mega, Google Cloud Storage, and Microsoft Azure.

Feature set

Duplicati has some awesome features. First, the solution is free, and notably, its team does not restrict free use of the software, even for commercial purposes. Second, Duplicati employs decent encryption, compression, and deduplication, making your storage more efficient and safe. Third, the solution adds timestamps to your files, so you can easily find a specific backup. Fourth, the backup scheduler helps make users’ lives simpler. Now, you won’t miss a backup!

What makes this piece of software special and really handy is backup content verification. Indeed, you never know whether a backup is good until you actually restore from it. Thanks to this feature, you can pinpoint any problems before it is too late.

Duplicati is managed via a web interface, making it accessible from anywhere and on any platform.

Learn more about Duplicati: https://www.duplicati.com/.

Read Backblaze’s post on Duplicati: Duplicati, a Free, Cross-Platform Client for B2 Cloud Storage.

5 — Duplicacy

Duplicacy supports popular public cloud storage services. Apart from the cloud, it can use SFTP servers and NAS boxes as backends.

OS compatibility

The solution is compatible with Windows, Mac OS X, and Linux.

Cloud Support

Duplicacy can offload data to Backblaze B2, Amazon S3, Google Cloud Storage, Microsoft Azure, and more.

Feature set

Duplicacy not only routes your backups to the cloud but also creates them. Note that each backup created by this solution is incremental, yet each is treated as a full snapshot, allowing simpler restoration, deletion, and transition of backups between storage sites. Duplicacy can send your files to multiple cloud storage services and uses strong client-side encryption. Another cool thing about this solution is its ability to give multiple clients simultaneous access to the same storage.

Duplicacy has a comprehensive GUI that features one-page configuration for quick backup scheduling and managing retention policies. If you are a command-line interface fan, you can manage Duplicacy via the command line.

Learn more about Duplicacy: https://duplicacy.com/

Read Backblaze’s Knowledge Base article, How to upload files to B2 with Duplicacy.

So, Should You Store Your Data in More Than One Cloud?

Undoubtedly, keeping a copy of your data in the public cloud is a great idea and enables you to comply with the 3-2-1 backup rule. By going beyond that and adopting a multi-cloud strategy, you can save money while gaining additional data redundancy and security by having data in more than one public cloud service.

As I’ve covered in this post, there are a number of wonderful backup solutions that can talk to multiple public cloud storage services. I hope this article proves useful to you and you’ll consider employing one of the reviewed solutions in your backup infrastructure.

Kevin Soltow

The post Five Cool Multi-Cloud Backup Solutions You Should Consider appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

What’s the Diff: VMs vs Containers

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/vm-vs-containers/

What's the Diff: Containers vs VMs

Both VMs and containers can help get the most out of available computer hardware and software resources. Containers are the new kids on the block, but VMs have been, and continue to be, tremendously popular in data centers of all sizes.

If you’re looking for the best solution for running your own services in the cloud, you need to understand these virtualization technologies, how they compare to each other, and the best uses for each. Here’s our quick introduction.

Basic Definitions — VMs and Containers

What are VMs?

A virtual machine (VM) is an emulation of a computer system. Put simply, it makes it possible to run what appear to be many separate computers on hardware that is actually one computer.

The operating systems (“OS”) and their applications share hardware resources from a single host server, or from a pool of host servers. Each VM requires its own underlying OS, and the hardware is virtualized. A hypervisor, or a virtual machine monitor, is software, firmware, or hardware that creates and runs VMs. It sits between the hardware and the virtual machine and is necessary to virtualize the server.

Since the advent of affordable virtualization technology and cloud computing services, IT departments large and small have embraced virtual machines (VMs) as a way to lower costs and increase efficiencies.

Virtual Machine System Architecture Diagram

VMs, however, can take up a lot of system resources. Each VM runs not just a full copy of an operating system, but a virtual copy of all the hardware that the operating system needs to run. This quickly adds up to a lot of RAM and CPU cycles. That’s still economical compared to running separate actual computers, but for some applications it can be overkill.

That led to the development of containers.

Benefits of VMs

  • All OS resources available to apps
  • Established management tools
  • Established security tools
  • Better known security controls
Popular VM Providers

What are Containers?

With containers, instead of virtualizing the underlying computer like a virtual machine (VM), just the OS is virtualized.

Containers sit on top of a physical server and its host OS — typically Linux or Windows. Each container shares the host OS kernel and, usually, the binaries and libraries, too. Shared components are read-only. Sharing OS resources such as libraries significantly reduces the need to reproduce operating system code, and means that a server can run multiple workloads with a single operating system installation. Containers are thus exceptionally “light” — they are only megabytes in size and take just seconds to start. In contrast, VMs take minutes to start and are an order of magnitude larger than an equivalent container.

In contrast to VMs, all that a container requires is enough of an operating system, supporting programs and libraries, and system resources to run a specific program. What this means in practice is that you can put two to three times as many applications on a single server with containers as you can with a VM. In addition, with containers you can create a portable, consistent operating environment for development, testing, and deployment.

Containers System Architecture Diagram

Types of Containers

Linux Containers (LXC) — The original Linux container technology is Linux Containers, commonly known as LXC. LXC is a Linux operating system level virtualization method for running multiple isolated Linux systems on a single host.

Docker — Docker started as a project to build single-application LXC containers, introducing several changes to LXC that make containers more portable and flexible to use. It later morphed into its own container runtime environment. At a high level, Docker is a Linux utility that can efficiently create, ship, and run containers.

Benefits of Containers

  • Reduced IT management resources
  • Reduced size of snapshots
  • Quicker spin-up of apps
  • Reduced and simplified security updates
  • Less code to transfer, migrate, and upload for workloads
Popular Container Providers

Uses for VMs vs Uses for Containers

Both containers and VMs have benefits and drawbacks, and the ultimate decision will depend on your specific needs, but there are some general rules of thumb.

  • VMs are a better choice for running apps that require all of the operating system’s resources and functionality, when you need to run multiple applications on servers, or have a wide variety of operating systems to manage.
  • Containers are a better choice when your biggest priority is maximizing the number of applications running on a minimal number of servers.
What’s the Diff: VMs vs. Containers

  • VMs are heavyweight; containers are lightweight.
  • VMs deliver limited performance; containers deliver native performance.
  • Each VM runs in its own OS; all containers share the host OS.
  • The host OS can differ from the guest OS; the host OS and container OS are the same.
  • VMs start up in minutes; containers start up in milliseconds.
  • VMs use hardware-level virtualization; containers use OS-level virtualization.
  • VMs allocate the full memory they require; containers require less memory space.
  • VMs are fully isolated and hence more secure; containers offer process-level isolation and hence are less secure.

For most, the ideal setup is likely to include both. With the current state of virtualization technology, the flexibility of VMs and the minimal resource requirements of containers work together to provide environments with maximum functionality.

If your organization is running a large number of instances of the same operating system, then you should look into whether containers are a good fit. They just might save you significant time and money over VMs.

Are You Using VMs, Containers, or Both?

We will explore this topic in greater depth in subsequent posts. If you are using VMs or containers, we’d love to hear from you about what you’re using and how you’re using them.

The post What’s the Diff: VMs vs Containers appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.