Tag Archives: Cloud Storage

How to Build a Multi-cloud Tech Stack for Streaming Media

2021-06-15 Molly Clancy

Post Syndicated from Molly Clancy original https://www.backblaze.com/blog/how-to-build-a-multi-cloud-tech-stack-for-streaming-media/

Backblaze and Kanopy - Thoughtful Entertainment

In most industries, one lost file isn’t a big deal. If you have good backups, you just have to find it, restore it, and move on—business as usual. But in the business of streaming media, one lost file can cause playback issues, compromising the customer experience.

Kanopy, a video streaming platform serving more than 4,000 libraries and 45 million library patrons worldwide, previously used an all-in-one video processing, storage, and delivery provider to manage their media, but reliability became an issue after missing files led to service disruptions.

We spoke with Kanopy’s Chief Technology Officer, Dave Barney, and Lead Video Software Engineer, Pierre-Antoine Tible, to understand how they restructured their tech stack to achieve reliability and three times more redundancy with no increase in cost.

Kanopy: Like Netflix for Libraries
Describing Kanopy as “Netflix for libraries” is an accurate comparison until you consider the number of videos they offer: Kanopy has 25,000+ titles under management, many thousands more than Netflix. Founded in 2008, Kanopy pursued a blue ocean market in academic and, later, public libraries rather than competing with Netflix. The libraries pay for the service, offering patrons free access to films that can’t be found anywhere else in the world.

Kanopy Display Imagery — Kanopy provides thoughtful entertainment that bridges cultural boundaries, sparks discussion, and expands worldviews.

Streaming Media Demands Reliability

In order for a film to be streamed without delays or buffering, it must first be transcoded—broken up into smaller, compressed files known as “chunks.” A feature-length film may translate to thousands of five to 10-second chunks, and losing just one can cause playback issues that disrupt the viewing experience. Pierre-Antoine described a number of reliability obstacles Kanopy faced with their legacy provider:

The provider lost chunks, disabling HD streaming.
The CDN said the data was there—but user complaints made it clear it wasn’t.
Finding source files and re-transcoding them was costly in both time and resources.
The provider didn’t back up data. If the file couldn’t be located in primary storage, it was gone.

Preparing for a Cloud to Cloud Migration

For a video streaming service of Kanopy’s scale, a poor user experience was not acceptable. Nor was operating without a solid plan for backups. To increase reliability and redundancy, they took steps to restructure their tech stack:

First, Kanopy moved their data out of their legacy provider and made it S3 compatible. Their legacy provider used its own storage type, so Pierre-Antoine and Kanopy’s development team wrote a script themselves to move the data to AWS, where they planned to set up their video processing infrastructure.

Next, they researched a few solutions for origin storage, including Backblaze B2 Cloud Storage and IBM. Kanopy streams out 15,000+ titles each month, which would incur massive egress fees through Amazon S3, so it was never an option. Both Backblaze B2 and IBM offered an S3 compatible API, so the data would have been easy to move, but using IBM for storage meant implementing a CDN Kanopy didn’t have experience with.

Then, they ran a proof of concept. Backblaze proved more reliable and gave them the ability to use their preferred CDN, Cloudflare, to continue delivering content around the globe.

Finally, they completed the migration of production data. They moved data from Amazon S3 to Backblaze B2 using Backblaze’s Cloud to Cloud Migration service, moving 150TB in less than three days.

Kanopy team at lunch — Where’s the popcorn? The Kanopy team takes a break.

Building a Tech Stack for Streaming Media

Kanopy’s vendor-agnostic, multi-cloud tech stack provides them the flexibility to use integrated, best-of-breed providers. Their new stack includes:

IBM Aspera to receive videos from contract suppliers like Paramount or HBO.
AWS for transcoding and encryption and Deep Glacier for redundant backups.
Flexify.IO for ongoing data transfer.
Backblaze B2 for origin storage.
Cloudflare for CDN and edge computing.

The Benefits of a Multi-cloud, Vendor-agnostic Tech Stack

The new stack offers Kanopy a number of benefits versus their all-in-one provider:

Since Backblaze is already configured with Cloudflare, data stored on Backblaze B2 automatically feeds into Cloudflare’s CDN. This allows content to live in Backblaze B2, yet be delivered with Cloudflare’s low latency and high speed.
Benefitting from the Bandwidth Alliance, Kanopy pays $0 in egress fees to transfer data from Backblaze to Cloudflare. The Bandwidth Alliance is a group of cloud and networking companies that discount or waive data transfer fees for shared customers.
Egress savings coupled with Backblaze B2’s transparent pricing allowed Kanopy to achieve redundancy at the same cost as their legacy provider.

Scaling a Streaming Media Platform With Backblaze B2

Though reliability was a main driver in Kanopy’s efforts to overhaul their tech stack, looking forward, Dave sees their new system enabling Kanopy to scale even further. “We’re rapidly accelerating the amount of content we onboard. Had reliability not become an issue, cost containment very quickly would have. Backblaze and the Bandwidth Alliance helped us attain both,” Dave attested.

“We’re rapidly accelerating the amount of content we onboard. Had reliability not become an issue, cost containment very quickly would have. Backblaze and the Bandwidth Alliance helped us attain both.”
—Dave Barney, Chief Technology Officer, Kanopy

The post How to Build a Multi-cloud Tech Stack for Streaming Media appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

A Cloud Storage Experiment to Level Up Chia Farming

2021-06-10 Troy Liljedahl

Post Syndicated from Troy Liljedahl original https://www.backblaze.com/blog/experimenting-cloud-storage-for-chia-mining/

Backblaze B2 Cloud Storage

Anyone that pays attention to the hard drive market as closely as Backblaze does already knows all about the rapid rise in popularity of Chia—the “green” alternative to Bitcoin—and the effect it’s having on global drive supplies. If you haven’t heard about Chia, check out the short note below for more info. But this post is geared for the many Chia farmers out there who’ve already delved into farming and are now facing empty shelves as they seek out storage solutions for their plots.

With this shortage in mind, our team set out to explore an experimental solution that would allow for farming of Chia plots stored in B2 Cloud Storage. We’re happy to announce that it is now possible for Chia farmers to store and farm their plots in Backblaze B2.

So if you’re looking to participate in the Chia sensation without spending a lot on hard-to-find, high-capacity hard drives, there is now an innovative way to get started with an affordable, scalable cloud solution.

Chia vs. Bitcoin: Chia is a new cryptocurrency that employs a proof-of-space algorithm. As opposed to the proof-of-work algorithm supporting Bitcoin—which is both CPU and energy intensive—Chia was developed to minimize energy consumption. The result is a storage intensive form of blockchain. If you’d like to learn more, we recommend going to the source.

The Keys to Winning Chia Challenges

Chia plots are not just lying idle. The Chia network regularly issues matching challenges and quality checks. The quality checks are important to succeed at, but the challenges—where one of every 512 plots is challenged every 10 minutes—are why you’re farming.

If one of your plots is selected for a match challenge, you need to fetch a “full proof” to collect a reward, which requires around 64 hard drive seeks and delivery of the full proof to the rest of the peer-to-peer network in less than 30 seconds, before Chia “time lords” move the blockchain further along.

This presents two problems that might keep you up at night if you’re trying to farm Chia:

Problem 1: Where to store plots at scale.
Given that the current estimated network space occupied by Chia plots is 20 exabytes (and growing exponentially!), chance dictates that just one of your plots will emerge as the winner once in about 96 years. It’s like waiting a lifetime for an ear of corn—not fun. So you want to have a lot of plots to improve your odds—but you need somewhere to keep them that you can afford and that can grow with your farming.
Problem 2: Administering the complexity of scaling storage.
If you solve the storage problem, then you also need a way to quickly and reliably make all of the plots available to be read and quickly presented to the network when you win a challenge. You’ll need to be able to administer that complexity every second of every day for as long as you want to be a farmer. If you wait 96 years for a single ear of corn, it would be a bummer to miss harvest day.

These are the keys to winning the match challenge: Attaining scale and capably administering it.

The Status Quo: Individual Chia Farmers Using HDDs for Storage

For a 7200 RPM HDD with an approximately 10ms read latency, getting a quality check or a full proof takes around 70ms per qualifying plot. Because the Chia kernel caches the first seven reads, the HDD only must perform the 64 seeks when issued a challenge.

If an 18TB drive—which can hold 166 plots at 108GB per plot (using k=32)—is lucky enough to contain a plot that is that magical “one in 512,” the HDD is reasonably fast in performing the necessary read operations, because Chia was designed to use HDDs for Plot Farming. But HDDs can only perform one of those operations at a time, so a desktop must perform the operations sequentially. Even if you’re using an SSD, you still must perform the operations in series. Again, this isn’t an issue for individual drives since HDDs and SSDs are able to perform the operations very quickly within the allotted time frame.

But, even for those lucky enough to find a supply of readily available 18TB drives that haven’t been marked up twice, providing storage for the number of plots a Chia Farmer needs to ensure a reasonable chance for success is going to be labor and capital intensive.

rows of Backblaze storage pods

How to Use Cloud Storage to Scale Your Plots

Chia software was not designed to allow farming with public cloud object storage, and the first tests we ran on Chia plots stored in B2 Cloud Storage proved this out: taking minutes, not the 30 seconds necessary to pass the quality check in time. Unlike with a local storage solution, where quality check reads can be cached by the kernel, performance is degraded in a cloud storage setup to an extent that it affects users’ success rate of winning challenges.

Backblaze B2 Cloud Storage provides object storage, which stores data in discrete objects, negating the need for any nested or hierarchical file structure. This makes B2 Cloud Storage ideal for scaling and use as an origin store, but as a standalone product, object storage is not suited for storing Chia plots. Without caching optimizations to improve performance and a way to read plots concurrently, B2 Cloud Storage would not effectively serve the Chia farming use case. But B2 Cloud Storage is designed to take advantage of parallel operations, or threads, offering some advantages over a standard physical drive if set up correctly for this use case (cough* I wrote about threads here! cough*).

Our team thought it would be interesting to build a tool providing a Chia use case workaround for four compelling reasons:

First: Because the Backblaze Storage Cloud provides both of the keys for successful Chia farming: There is no provisioning necessary and Chia Farmers can upload new plots at speed and scale. The Backblaze Storage Cloud cares for nearly 500 billion files with exceptional durability and availability.
Second: The cost of storing Chia plots in Backblaze B2 is financially compelling at $5/TB/month. According to Chia Calculator, using B2 Cloud Storage to store plots would be profitable, depending on the network space growth rate and the current price of Chia coin.
Third: A Tiger Team of SEs and engineers, including myself, thought it would make for an interesting and useful (and fun) experiment.
Finally: The same team believed we could enable Chia farming of plots stored in B2 Cloud Storage by cracking the code of how to parallelize operations in Chia.

With this in mind, our Tiger Team set out to get to work. A tool to mount Backblaze B2 as a filesystem was necessary since Chia doesn’t natively support the Backblaze B2 Native or S3 Compatible APIs. After some testing, our team settled on B2_fuse since our engineers who would be working on this already had some familiarity with the source code.

After deciding on B2_fuse, our engineers added a prefetch algorithm to cache the reads to address the kernel issue mentioned above. This would aid performance, but with reads still carried out one at a time on HDD, there was room for additional improvement. Obviously performing operations in parallel would greatly grow the success rate, and after doing some digging one of our engineers found a PR (pull request) that added parallelized reads and had not yet been merged into the Chia project.

With the caching optimizations in B2_fuse and the added functionality of parallelized reads, the proof time for a Chia plot stored in B2 Cloud Storage was reduced to seconds. This provides for uploading Chia plots to Backblaze B2 and presenting them to the Chia network for farming without the need for an expensive server in a datacenter.

Our successful tests were carried out using a compute instance running in a US West region with a Backblaze B2 account that is also in the US West region. Give it a shot and you could be staring at a whole field of metaphorical crops—all ready for whenever the “one in 512” challenge arrives.

If you want to try this solution, set up a Backblaze B2 account now, and get the updated version of B2_fuse (or contribute to the project) along with instructions on how to get the PR with parallized reads here: https://github.com/Backblaze-B2-Samples/b2fs4chia.

Because this support is experimental and the Backblaze team knows a lot of Chia Farmers will be excited to try it out, we ask that farmers limit their storage of Chia plots to 100 TB at this time, or contact our sales team to discuss anything larger.

Happy farming!

The post A Cloud Storage Experiment to Level Up Chia Farming appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze Terraform Provider Changes the Game for Avisi

2021-05-28 Molly Clancy

Post Syndicated from Molly Clancy original https://www.backblaze.com/blog/backblaze-terraform-provider-changes-the-game-for-avisi/

Backblaze + Avisi Apps

Recently, we announced that Backblaze B2 Cloud Storage published a provider to the Terraform registry to support developers in their infrastructure as code (IaC) efforts. With the Backblaze Terraform provider, you can provision and manage B2 Cloud Storage resources directly from a Terraform configuration file.

Today’s post grew from a comment in our GitHub repository from Gert-Jan van de Streek, Co-founder of Avisi, a Netherlands-based software development company. That comment sparked a conversation that turned into a bigger story. We spoke with Gert-Jan to find out how the Avisi team practices IaC processes and uses the Backblaze Terraform provider to increase efficiency, accuracy, and speed through the DevOps lifecycle. We hoped it might be useful for other developers considering IaC for their operations.

What Is Infrastructure as Code?

IaC emerged in the late 2000s as a response to the increasing complexity of scaling software developments. Rather than provisioning infrastructure via a provider’s user interface, developers can design, implement, and deploy infrastructure for applications using the same tools and best practices they use to write software.

Provisioning Storage for “Apps That Fill Gaps”

The team at Avisi likes to think about software development as a sport. And their long-term vision is just as big and audacious as an Olympic contender’s—to be the best software development company in their country.

Gert-Jan co-founded Avisi in 2000 with two college friends. They specialize in custom project management, process optimization, and ERP software solutions, providing implementation, installation and configuration support, integration and customization, and training and plugin development. They built the company by focusing on security, privacy, and quality, which helped them to take on projects with public utilities, healthcare providers, and organizations like the Dutch Royal Notarial Professional Organization—entities that demand stable, secure, and private production environments.

They bring the same focus to product development, a business line Gert-Jan leads where they create “apps that fill gaps.” He coined the tagline to describe the apps they publish on the Atlassian and monday.com marketplaces. “We know that a lot of stuff is missing from the Atlassian and monday.com tooling because we use it in our everyday life. Our goal in life is to provide that missing functionality—apps to fill gaps,” he explained.

Avisi application platforms - Confluence, Jira, Bitbucket, Monday.com, GitLab — Avisi’s applications fill the gaps in popular project management solutions.

With multiple development environments for each application, managing storage becomes a maintenance problem for sophisticated DevOps teams like Avisi’s. For example, let’s say Gert-Jan has 10 apps to deploy. Each app has test, staging, and production environments, and each has to be deployed in three different regions. That’s 90 individual storage configurations, 90 opportunities to make a mistake, and 90 times the labor it takes to provision one bucket.

Infrastructure in Sophisticated DevOps Environments: An Example

10 apps x three environments x three regions = 90 storage configurations

Following DevOps best practices means Avisi writes reusable code, eliminating much of the manual labor and room for error. “It was really important for us to have IaC so we’re not clicking around in user interfaces. We need to have stable test, staging, and production environments where we don’t have any surprises,” Gert-Jan explained.

Terraform vs. CloudFormation

Gert-Jan had already been experimenting with Terraform, an open-source IaC tool developed by HashiCorp, when the company decided to move some of their infrastructure from Amazon Web Services (AWS) to Google Cloud Platform (GCP). The Avisi team uses Google apps for business, so the move made configuring access permissions easier.

Of course, Amazon and Google don’t always play nice—CloudFormation, AWS’s proprietary IaC tool, isn’t supported across the platforms. Since Terraform is open-source, it allowed Avisi to implement IaC with GCP and a wide range of third-party integrations like StatusCake, a tool they use for URL monitoring.

Backblaze B2 + Terraform

Simultaneously, when Avisi moved some of their infrastructure from AWS to GCP, they resolved to stand up an additional public cloud provider to serve as off-site storage as part of a 3-2-1 strategy (three copies of data on two different media, with one off-site). Gert-Jan implemented Backblaze B2, citing positive reviews, affordability, and the Backblaze European data center as key decision factors. Many of Avisi’s customers reside in the European Union and are often subject to data residency requirements that stipulate data must remain in specific geographic locations. Backblaze allowed Gert-Jan to achieve a 3-2-1 strategy for customers where data residency in the EU is top of mind.

When Backblaze published a provider to the Terraform registry, Avisi started provisioning Backblaze B2 storage buckets using Terraform immediately. “The Backblaze module on Terraform is pure gold,” Gert-Jan said. “It’s about five lines of code that I copy from another project. I configure it, rename a couple variables, and that’s it.”

Real-time Storage Sync With Terraform

Gert-Jan wrote the cloud function to sync between GCP and Backblaze B2 in Clojure, a functional programming language, running on top of Node.js. Clojure compiles to Javascript, so it runs in Java environments as well as Node.js or browser environments, for example. That means the language is available on the server side as well as the client side for Avisi.

The cloud function allowed off-site tiering to be almost instantaneous. Now, every time a file is written, it gets picked up by the cloud function and transferred to Backblaze in real time. “You need to feel comfortable about what you deploy and where you deploy it. Because it is code, the Backblaze Terraform provider does the work for me. I trust that everything is in place,” Gert-Jan said.

Avisi meeting room — The Avisi team at work.

Easier Lifecycle Rules and Code Reviews

In addition to reducing manual labor and increasing accuracy, the Backblaze Terraform provider makes setting lifecycle rules to comply with control frameworks like the General Data Protection Regulations (GDPR) and SOC 2 requirements much simpler. Gert-Jan configured one reusable module that meets the regulations and can apply the same configurations to each project. In a SOC 2 audit or when savvy customers want to know how their data is being handled, he can simply provide the code for the Backblaze B2 configuration as proof that Avisi is retaining and adequately encrypting backups rather than sending screenshots of various UIs.

Using Backblaze via the Terraform provider also streamlined code reviews. Prior to the Backblaze Terraform provider, Gert-Jan’s team members had less visibility into the storage set up and struggled with ecosystem naming. “With the Backblaze Terraform provider, my code is fully reviewable, which is a big plus,” he explained.

Simplifying Storage Management

Embracing IaC practices and using the Backblaze Terraform provider specifically means Gert-Jan can focus on growing the business rather than setting up hundreds of storage buckets by hand. He saves about eight hours per environment. Based on the example above, that equates to 720 hours saved all told. “Terraform and the Backblaze module reduced the time I spend on DevOps by 75% to just a couple of hours per app we deploy, so I can take care of the company while I’m at it,” he said.

If you’re interested in stepping up your DevOps game with IaC, set up a bucket in Backblaze B2 for free and start experimenting with the Backblaze Terraform provider.

The post Backblaze Terraform Provider Changes the Game for Avisi appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

MSP360 and Backblaze: When Two Panes Are Greater Than One

2021-03-05 Molly Clancy

Post Syndicated from Molly Clancy original https://www.backblaze.com/blog/msp360-and-backblaze-when-two-panes-are-greater-than-one/

IT departments are tasked with managing an ever-expanding suite of services and vendors. With all that, a solution that offers a “single pane of glass” can sound like sweet relief. Everything in one place! Think of the time savings! Easy access. Consolidated user management. Centralized reporting. In short, one solution to rule them all.

But solutions that wrangle your tech stack into one comprehensive dashboard risk adding unnecessary levels of complexity in the name of convenience and adding fees for functions you don’t need. That “single pane of glass” might have you reaching for the Windex come implementation day.

While it feels counterintuitive, pairing two different services that each do one thing and do it very well can offer an easier, low-touch solution in the long term. This post highlights how one managed service provider (MSP) configured a multi-pane solution to manage backups for 6,000+ endpoints on 500+ servers at more than 450 dental and doctor’s offices in the mid-Atlantic region.

The Trouble With a “Single Pane of Glass”

Nate Smith, Technical Project Manager, DTC.

Nate Smith, Technical Project Manager for DTC, formerly known as Dental Technology Center, had a data dilemma on his hands. From 2016 to 2020, DTC almost doubled their client base, and the expense of storing all their customers’ data was cutting into their budget for improvements.

“If we want to become more profitable, let’s cut down this $8,000 per month AWS S3 bill,” Nate reasoned.

In researching AWS alternatives, Nate thought he found the golden ticket—a provider offering both object and compute storage in that proverbial “single pane of glass.” At $0.01/GB, it was more expensive than standard object storage, but the anticipated time savings of managing resources with a single vendor was worth the extra cost for Nate—until it wasn’t.

DTC successfully tested the integrated service with a small number of endpoints, but the trouble started when they attempted migrating more than 75-80 endpoints. Then, the failures began rolling in every night—backups would time out, jobs would retry and fail. There were time sync issues, foreign key errors, remote socket errors, and not enough spindles—a whole host of problems.

How to Recover When the “Single Pane of Glass” Shatters

Nate worked with the provider’s support team, but after much back and forth, it turned out the solution he needed would take a year and a half of development. He gave the service one more shot with the same result. After spending 75 hours trying to make it work, he decided to start looking for another option.

Evaluate Your Cloud Landscape and Needs

Nate and the DTC team decided to keep the integrated provider for compute storage. “We’re happy to use them for infrastructure as a service over something like AWS or Azure. They’re very cost-effective in that regard,” he explained. He just needed object storage that would work with MSP360—their preferred backup software—and help them increase margins.

Knowing he might need an out should the integrated provider fail, he had two alternatives in his back pocket—Backblaze and Wasabi.

Do the Math to Compare Cloud Providers

At first glance, Wasabi looked more economical based on the pricing they highlight, but after some intense number crunching, Nate estimated that Wasabi’s 90-day minimum storage retention policy potentially added up to $0.015/GB given DTC’s 30-day retention policy.

Egress wasn’t the only scenario Nate tested. He also ran total loss scenarios for 10 clients comparing AWS, Backblaze B2 Cloud Storage, and Wasabi. He even doubled the biggest average data set size to 4TB just to overestimate. “Backblaze B2 won out every single time,” he said.

Fully loaded costs from AWS totalled nearly $100,000 per year. With Backblaze B2, their yearly spend looked more like $32,000. “I highly recommend anyone choosing a provider get detailed in the math,” he advised—sage words from someone who’s seen it all when it comes to finding reliable object storage.

Try Cloud Storage Before You Buy (to Your Best Ability)

Building the infrastructure for testing in a local environment can be costly and time-consuming. Nate noted that DTC tested 10 endpoints simultaneously back when they were trying out the integrated provider’s solution, and it worked well. The trouble started when they reached higher volumes.

Another option would have been running tests in a virtual environment. Testing in the cloud gives you the ability to scale up resources when needed without investing in the infrastructure to simulate thousands of users. If you have more than 10GB, we can work with you to test a proof of concept.

For Nate, because MSP360 easily integrates with Backblaze B2, he “didn’t have to change a thing” to get it up and running.

Phase Your Data Migration

Nate planned on phasing from the beginning. Working with Backblaze, he developed a region-by-region schedule, splitting any region with more than 250TB into smaller portions. The reason? “You’re going to hit a point where there’s so much data that incremental backups are going to take longer than a day, which is a problem for a 24/7 operation. I would parse it out around 125TB per batch if anyone is doing a massive migration,” he explained.

DTC migrated all its 450 clients—nearly 575TB of data—over the course of four weeks using Backblaze’s high speed data transfer solution. According to Nate, it sped up the project tenfold.

An Easy Multi-Pane Approach to Cloud Storage

Using Backblaze B2 for object storage, MSP360 for backup management, and another provider for compute storage means Nate lost his “single pane” but killed a lot of pain in the process. He’s not just confident in Backblaze B2’s reliability, he can prove it with MSP360’s consistency checks. The results? Zero failures.

The benefits of an “out of the box” solution that requires little to no interfacing with the provider, is easy to deploy, and just plain works can outweigh the efficiencies a “single pane of glass” might offer:

No need to reconfigure infrastructure. As Nate attested, “If a provider can’t handle the volume, it’s a problem. My lesson learned is that I’m not going to spend 75 hours again trying to reconfigure our entire platform to meet the object storage needs.”
No lengthy issue resolution with support to configure systems.
No need to learn a complicated new interface. When comparing Backblaze’s interface to AWS, Nate noted that “Backblaze just tells you how many objects you have and how much data is there. Simplicity is a time saver, and time is money.”

Many MSPs and small to medium-sized IT teams are giving up on the idea of a “single pane of glass” altogether. Read more about how DTC saved $68,000 per year and sped up implementation time by 55% by prioritizing effective, simple, user-friendly solutions.

The post MSP360 and Backblaze: When Two Panes Are Greater Than One appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

iconik Media Stats: Top Takeaways From the Annual Report

2021-02-23 Molly Clancy

Post Syndicated from Molly Clancy original https://www.backblaze.com/blog/iconik-media-stats-top-takeaways-from-the-annual-report/

The world looks a lot different than it did when we published our last Media Stats Takeaways, which covered iconik’s business intelligence report from the beginning of last year. It’s likely no big surprise that the use of media management tech has changed right along with other industries that saw massive disruption since the arrival of COVID-19. But iconik’s 2021 Media Stats Report digs deeper into the story, and the detail here is interesting. Short story? The shift to remote work drove an increase in cloud-based solutions for businesses using iconik for smart media management.

Always game to geek out over the numbers, we’re again sharing our top takeaways and highlighting key lessons we drew from the data.

iconik is a cloud-based content management and collaboration app and Backblaze integration partner. Their Media Stats Report series gathers data on how customers store and use data in iconik and what that customer base looks like.

Takeaway 1: Remote Collaboration Is Here to Stay

In 2020, iconik added 12.1PB of data to cloud storage—up 490%. Interestingly, while there was an 11.6% increase in cloud data year-over-year (from 53% cloud/47% on-premises in 2019, to 65% cloud/35% on-premises in 2020), it was down from a peak of 70%/30% mid-year. Does this represent a subtle pendulum swing back towards the office for some businesses and industries?

Either way, the shift to remote work likely changed the way data is handled for the long term no matter where teams are working. Tools like iconik help companies bridge on-premises and cloud storage, putting the focus on workflows and allowing companies to reap the benefits of both kinds of storage based on their needs—whether they need fast access to local shared storage, affordable scalability and collaboration in the cloud, or both.

Takeaway 2: Smaller Teams Took the Lead in Cloud Adoption

Teams of six to 19 people were iconik’s fastest growing segment in 2020 in terms of size, increasing 171% year-over-year. Small teams of one to five came in at a close second, growing 167%.

Adjusting to remote collaboration likely disrupted the inertia of on-premises process and culture in teams of this size, removing any lingering fear around adopting new technologies like iconik. Whether it was the shift to remote work or just increased comfort and familiarity with cloud-based solutions, this data seems to suggest smaller teams are capitalizing on the benefits of scalable solutions in the cloud.

Takeaway 3: Collaboration Happens When Collaborating Is Easy

iconik noted that many small teams of one to five people added users organically in 2020, graduating to the next tier of six to 19 users.

This kind of organic growth indicates small teams are adding users they may have hesitated to include with previous solutions whether due to cost, licensing, or complicated onboarding. Because iconik is delivered via an internet portal, there’s no upfront investment in software or a server to run it—teams just pay for the users and storage they need. They can start small and add or remove users as the team evolves, and they don’t pay for inactive users or unused storage.

We also believe efficient workflows are fueling new business, and small teams are happily adding headcount. Bigger picture, it shows that when adding team members is easy, teams are more likely to collaborate and share content in the production process.

Takeaway 4: Public Sector and Nonprofit Entities Are Massive Content Producers

Last year, we surmised that “every company is a media company.” This year showed the same to be true. Public/nonprofit was the second largest customer segment behind media and entertainment, comprising 14.5% of iconik’s customer base. The segment includes organizations like houses of worship (6.4%), colleges and universities (4%), and social advocacy nonprofits (3.4%).

With organizations generating more content from video to graphics to hundreds of thousands of images, wrangling that content and making it accessible has become ever more important. Today, budget-constrained organizations need the same capabilities of an ad agency or small film production studio. Fortunately, they can deploy solutions like iconik with cloud storage tapping into sophisticated workflow collaboration without investing in expensive hardware or dealing with complicated software licensing.

Takeaway 5: Customers Have the Benefit of Choice for Pairing Cloud Storage With iconik

In 2020, we shared a number of stories of customers adopting iconik with Backblaze B2 Cloud Storage with notable success. Complex Networks, for example, reduced asset retrieval delays by 100%. It seems like these stories did reflect a trend, as iconik flagged that data stored by Backblaze B2 grew by 933%, right behind AWS at 1009% and well ahead of Google Cloud Platform at 429%.

We’re happy to be in good company when it comes to serving the storage needs of iconik users who are faced with an abundance of choice for where to store the assets managed by iconik. And even happier to be part of the customer wins in implementing robust cloud-based solutions to solve production workflow issues.

2020 Was a Year

This year brought changes in almost every aspect of business and…well, life. iconik’s Media Stats Report confirmed some trends we all experienced over the past year as well as the benefits many companies are realizing by adopting cloud-based solutions, including:

The prevalence of remote work and remote-friendly workflows.
The adoption of cloud-based solutions by smaller teams.
Growth among teams resulting from easy cloud collaboration.
The emergence of sophisticated media capabilities in more traditional industries.
The prevalence of choice among cloud storage providers.

As fellow data obsessives, we’re proud to call iconik a partner and curious to see what learnings we can gain from their continued reporting on media tech trends. Jump in the comments to let us know what conclusions you drew from the stats.

The post iconik Media Stats: Top Takeaways From the Annual Report appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze Mobile Update: iOS and Android Bucket Management

2021-02-18 Jeremy Milk

Post Syndicated from Jeremy Milk original https://www.backblaze.com/blog/backblaze-mobile-update-ios-mobile-uploads/

This post was originally published on February 18, 2021 and has been updated to reflect the newest functionality releases for Backblaze Mobile users on both iOS and Android.

Ready to update now? Go to Google Play or the App Store to run updates or download the Backblaze app.

December 20, 2022: Mobile 6.0 Is Available

Today, we’re announcing the arrival of Backblaze Mobile 6.0 featuring an enhanced visual experience, authentication improvements, bug fixes, and many design updates. Check out the specifics below.

What’s New in Backblaze Mobile 6.0?

Backblaze Mobile 6.0 features an overhauled visual experience (so fresh, so clean!).

The update also features authentication enhancements for both iOS and Android. We’ve made it easier to log in and opt to see your password in plain text as you enter it. We’ve also optimized the stability of our mobile login flow.

iOS Updates

Design updates: Redesigned login and settings screens, updated icons, and improved upload/download progress animations.
Login updates: Email and password now appear on the same screen when logging in, and you can choose to see your password in plain text as you enter it.
Viewing and previewing files: You can now view downloaded files in full-screen mode on iPhones as well as iPads.
SwiftUI is here: Much of the iOS code has been migrated to use SwiftUI and The Composable Architecture.
Bug fixes and performance improvements: A lot has been tightened up under the hood, including fixing a file download timeout issue and progress messaging display issues.

Android Updates

Design updates: A fresh UI and navigation experience comes courtesy of updated material libraries.
Navigation and controls: We’ve also advanced the Android navigation bar, scrollable header and footers, and updated gesture controls for a better Android experience. You can now also see the file path for any file uploaded to Computer Backup or B2 Cloud Storage files.
Edit mode and selection capabilities: Navigation and maneuvering inside of edit mode for files, buckets, folders, and downloads has also been improved. We’ve also added multiselection capabilities and swipe-to-delete functionality.

Backblaze Mobile 6.0 Available Now: Download Today

To get the latest and greatest Backblaze Mobile experience, update your apps or download them today on Google Play or the App Store.

March 28, 2022: Added Folder Creation

Backblaze Mobile users on iOS and Android devices can now create folders directly on their devices with our latest app update. The update is generally available the week of March 27, 2022 for both iOS and Android platforms.

The functionality expands on previous releases to allow users to more easily work from their mobile devices.

November 30, 2021: Added Bucket Creation and Bucket, Folder, and File Deletion

With this update, Backblaze Mobile users on iOS and Android devices can create buckets and delete buckets, folders, and files directly on their devices.

If you routinely work from your mobile device, this means you’ll be able to better manage your cloud storage while you’re away from your workstation. For media and entertainment pros who regularly shoot images and footage on powerful smart devices, for example, this functionality allows you to create buckets for new projects from the field. And if you need to delete a bucket, file, or folder, you can do that on the go, too. With this functionality at your fingertips, you can focus on shooting, producing, and doing more with ease rather than waiting until you’re back at your desktop or laptop to handle organizational tasks.

The update also included bug fixes and an upgrade to Android 11.

Older Releases

In case you missed the last few releases, Backblaze Mobile allows iOS and Android users to preview and download content through the app and upload files directly to Backblaze B2 Cloud Storage buckets.

The post Backblaze Mobile Update: iOS and Android Bucket Management appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Level Up and SIMMER.io Down: Scaling a Game-sharing Platform

2021-02-17 Molly Clancy

Post Syndicated from Molly Clancy original https://www.backblaze.com/blog/level-up-and-simmer-io-down-scaling-a-game-sharing-platform/

Much like gaming, starting a business means a lot of trial and error. In the beginning, you’re just trying to get your bearings and figure out which enemy to fend off first. After a few hours (or a few years on the market), it’s time to level up.

SIMMER.io, a community site that makes sharing Unity WebGL games easy for indie game developers, leveled up in a big way to make their business sustainable for the long haul.

When the site was founded in September 2017, the development team focused on getting the platform built and out the door, not on what egress costs would look like down the road. As it grew into a home for 80,000+ developers and 30,000+ games, though, those costs started to encroach on their ability to sustain and grow the business.

After rolling the dice in “A Hexagon’s Adventures” a few times (check it out below), we spoke with the SIMMER.io development team about their experience setting up a multi-cloud solution—including their use of the Bandwidth Alliance between Cloudflare and Backblaze B2 Cloud Storage to reduce egress to $0—to prepare the site for continued growth.

How to Employ a Multi-cloud Approach for Scaling a Web Application

In 2017, sharing games online with static hosting through a service like AWS S3 was possible but certainly not easy. As one SIMMER.io team member put it, “No developer in the world would want to go through that.” The team saw a clear market opportunity. If developers had a simple, drag-and-drop way to share games that worked for them, the site would get increased traffic that could be monetized through ad revenue. Further out, they envisioned a premium membership offering game developers unbranded sharing and higher bandwidth. They got to work building the infrastructure for the site.

Prioritizing Speed and Ease of Use

Starting a web application, your first priority is planning for speed and ease of use—both for whatever you’re developing but also from the apps and services you use to develop it.

The team at SIMMER.io first tried setting up their infrastructure in AWS. They found it to be powerful, but not very developer-friendly. After a week spent trying to figure out how to implement single sign-on using Amazon Cognito, they searched for something easier and found it in Firebase—Google’s all-in-one development environment. It had most of the tools a developer might need baked in, including single sign-on.

Firebase was already within the Google suite of products, so they used Google Cloud Platform (GCP) for their storage needs as well. It all came packaged together, and the team was moving fast. Opting into GCP made sense in the moment.

“The Impossible Glide,” E&T Studios. Trust us, it does feel a little impossible.

When Egress Costs Boil Over

Next, the team implemented Cloudflare, a content delivery network, to ensure availability and performance no matter where users access the site. When developers uploaded a game, it landed in GCP, which served as SIMMER.io’s origin store. When a user in Colombia wanted to play a game, for example, Cloudflare would call the game from GCP to a server node that’s geographically closer to the user. But each time that happened, GCP charged egress fees for data transfer out.

Even though popular content was cached on the Cloudflare nodes, egress costs from GCP still added up, comprising two-thirds of total egress. At one point, a “Cards Against Humanity”-style game caught on like wildfire in France, spiking egress costs to more than double their average. The popularity was great for attracting new SIMMER.io business but tough on the bottom line.

These costs increasingly ate into SIMMER.io’s margins until the development team learned of the Bandwidth Alliance, a group of cloud and networking companies that discount or waive data transfer fees for shared customers, of which Backblaze and Cloudflare are both members.

“Dragon Spirit Remake,” by Jin Seo, one of 30K+ games available on SIMMER.io.

Testing a Multi-cloud Approach

Before they could access Bandwidth Alliance savings, the team needed to make sure the data could be moved safely and easily and that the existing infrastructure would still function with the game data living in Backblaze B2.

The SIMMER.io team set up a test bucket for free, integrated it with Cloudflare, and tested one game—Connected Towers. The Backblaze B2 test bucket allows for free self-serve testing up to 10GB, and Backblaze offers a free proof of concept working with our solutions engineers for larger tests. When one game worked, the team decided to try it with all games uploaded to date. This would allow them to cash in on Bandwidth Alliance savings between Cloudflare and Backblaze B2 right away while giving them time to rewrite the code that governs uploads to GCP later.

Choose Your Own Adventure: Migrate Yourself or With Support

Getting 30,000+ games from one cloud provider to another seemed daunting, especially given that games are accessed constantly on the site. They wanted to ensure any downtime was minimal. So the team worked with Backblaze to plan out the process. Backblaze solution engineers recommended using rclone, an open-source command line program that manages files on cloud storage, and the SIMMER.io team took it from there.

With rclone running on a Google Cloud server, the team copied game data uploaded prior to January 1, 2021 to Backblaze B2 over the course of about a day and a half. Since the games were copied rather than moved, there was no downtime at all. The SIMMER.io team just pointed Cloudflare to Backblaze B2 once the copy job finished.

Wood Cutter Santa and Revolver Duels — Left: “Wood Cutter Santa,” Zathos; Right: “Revolver—Duels,” Zathos. “Wood Cutter Santa:” A Backblaze favorite.

Combining Microservices Translates to Ease and Affordability

Now, Cloudflare pulls games on-demand from Backblaze B2 rather than GCP, bringing egress costs to $0 thanks to the Bandwidth Alliance. SIMMER.io only pays for Backblaze B2 storage costs at $5/TB.

For the time being, developers still upload games to GCP, but Backblaze B2 functions as the origin store. The games are mirrored between GCP and Backblaze B2, and to ensure fidelity between the two copies, the SIMMER.io team periodically runs an rclone sync. It performs a hash check on each file to look for changes and only uploads files that have been changed so SIMMER.io avoids paying any more egress than they have to from GCP. For users, there’s no difference, and the redundancy gives SIMMER.io peace of mind while they finish the transition process.

Moving forward, SIMMER.io has the opportunity to rewrite code so game uploads go directly to Backblaze B2. Because Backblaze offers S3 Compatible APIs, the SIMMER.io team can use existing documentation to accomplish the code rework, which they’ve already started testing. Redirecting uploads would further reduce their costs by eliminating duplicate storage, but mirroring the data using rclone was the first step towards that end.

Managing everything in one platform might make sense starting out—everything lives in one place. But, like SIMMER.io, more and more developers are finding a combination of microservices to be better for their business, and not just based on affordability. With a vendor-agnostic environment, they achieve redundancy, capitalize on new functionality, and avoid vendor lock-in.

“AmongDots,” RETRO2029. For the retro game enthusiasts among us.

A Cloud to Cloud Migration Pays Off

For now, by reclaiming their margins through reducing egress costs to $0, SIMMER.io can grow their site without having to worry about increasing egress costs over time or usage spikes when games go viral. By minimizing that threat to their business, they can continue to offer a low-cost subscription and operate a sustainable site that gives developers an easy way to publish their creative work. Even better, they can use savings to invest in the SIMMER.io community, hiring more community managers to support developers. And they also realized a welcome payoff in the process—finally earning some profits after many years of operating on low margins.

Leveling up, indeed.

Check out our Cloud to Cloud Migration offer and other transfer partners—we’ll pay for your data transfer if you need to move more than 50TB.

Bonus Points: Roll the Dice for Yourself

The version of “A Hexagon’s Adventures” below is hosted on B2 Cloud Storage, served up to you via Cloudflare, and delivered easily by virtue of SIMMER.io’s functionality. See how it all works for yourself, and test your typing survival skills.

The post Level Up and SIMMER.io Down: Scaling a Game-sharing Platform appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

On-prem to Cloud, Faster: Meet Our Newest Fireball

2021-02-12 Jeremy Milk

Post Syndicated from Jeremy Milk original https://www.backblaze.com/blog/on-prem-to-cloud-faster-meet-our-newest-fireball/

We’re determined to make moving data into cloud storage as easy as possible for you, so today we are releasing the latest improvement to our data migration pathways: a bigger, faster Backblaze Fireball.

The new Fireball increases capacity for the rapid ingest service from 70TB to 96TB and connectivity speed from 1 Gb/s to 10 Gb/s so that businesses can move larger data sets and media libraries from on-premises to the Backblaze Storage Cloud faster than before.

What Hasn’t Changed

The service is still drop-dead simple. Data is secure and encrypted during the transfer process, and you gain the benefits of the cloud without having to navigate the constraints (and sluggishness) of internet bandwidth. We’re still happy to send you two, or three, or more Fireballs as needed—you can order whatever you need right from your Backblaze B2 Cloud Storage account. Easy.

How It Works

The customer favorite (of folks like Austin City Limits and Yoga International) service works like this: We ship you the Fireball, you copy on-premises data to it directly or through the transfer tool of your choice, you send the Fireball back to us, and we quickly upload your data into your B2 Cloud Storage account.

The Fireball is not right for everyone—organizations already storing to public clouds now frequently use our cloud to cloud migration solution, while those with small, local data sets often find internet transfer tools more than sufficient. For a refresher, definitely check out this “Pathways to the Cloud” guide.

Don’t Be Afraid to Ask

However you’d like to join us, we’re here to help. So—shameless plug alert—please don’t hesitate to contact our Sales team to talk about how to best start saving with B2 Cloud Storage.

The post On-prem to Cloud, Faster: Meet Our Newest Fireball appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Finding a 1Up When Free Cloud Credits Run Out

2021-02-10 Amrit Singh

Post Syndicated from Amrit Singh original https://www.backblaze.com/blog/finding-a-1up-when-free-cloud-credits-run-out/

For people in the early stages of development, a cloud storage provider that offers free credits might seem like a great deal. And diversified cloud providers do offer these kinds of promotions to help people get started with storing data: Google Cloud Free Tier and AWS Free Tier offer credits and services for a limited time, and both providers also have incentive funds for startups which can be unlocked through incubators that grant additional credits of up to tens of thousands of dollars.

Before you run off to give them a try though, it’s important to consider the long-term realities that await you on the far side of these promotions.

The reality is that once they’re used up, budget items that were zeros yesterday can become massive problems tomorrow. Twitter is littered with countless experiences of developers finding themselves surprised with an unexpected bill and the realization that they need to figure out how to navigate the complexities of their cloud provider—fast.

we made the unfortunate mistake (and I'm sure this is how they get you) of not watching our cloud costs so when the generous credits ran out we were hit with big bills until we did major refactoring. Lessons learned early on

— Yuan Gao (@mesetatron) December 5, 2020

What to Do When You Run Out of Free Cloud Storage Credits

So, what do you do once you’re out of credits? You could try signing up with different emails to game the system, or look into getting into a different incubator for more free credits. If you plan on your app being around for a few years and succeeding, the solution of finding more credits isn’t scalable, and the process of applying to another incubator would take too long. You can always switch from Google Cloud Platform to AWS to get free credits elsewhere, but transferring data between providers almost always incurs painful egress charges.

If you’re already sure about taking your data out of your current provider, read ahead to the section titled “Cloud to Cloud Migration” to learn how transferring your data can be easier and faster than you think.

Because chasing free credits won’t work forever, this post offers three paths for navigating your cloud bills after free tiers expire. It covers:

Staying with the same provider. Once you run out of free credits, you can optimize your storage instances and continue using (and paying) for the same provider.
Exploring multi-cloud options. You can port some of your data to another solution and take advantage of the freedom of a multi-cloud strategy.
Choosing another provider. You can transfer all of your data to a different cloud that better suits your needs.

Path 1: Stick With Your Current Cloud Provider

If you’re running out of promotional credits with your current provider, your first path is to just continue using their storage services. Many people see this as your only option because of the frighteningly high egress fees you’d face if you try to leave. If you choose to stay with the same provider, be sure to review and account for all of the instances you’ve spun up.

Here’s an example of a bill that one developer faced after their credits expired: This user found themselves locked into an unexpected $2,700 bill because of egress costs. Looking closer at their experience, the spike in charges was due to a data transfer of 30TB of data. The first 1GB of data transferred out is free, followed by egress costing $0.09 per gigabyte for the first 10TB and $0.085 per gigabyte for the next 40TB. Doing the math, that’s:

$0.085/GB x 20,414 GB = $1735, $0.090/GB x 10,239 GB = $921

Choosing to stay with your current cloud provider is a straightforward path, but it’s not necessarily the easiest or least expensive option, which is why it’s important to conduct a thorough audit of the current cloud services you have in use to optimize your cloud spend.

Optimizing Your Current Cloud Storage Solution

Over time, cloud infrastructure tends to become more complex and varied, and your cloud storage bills follow the same pattern. Cloud pricing transparency in general is an issue with most diversified providers—in short: It’s hard to understand what you’re paying for, and when. If you haven’t seen a comparison yet, a breakdown contrasting storage providers is shared in this post.

Many users find that AWS and Google Cloud are so complex that they turn to services that can help them monitor and optimize their cloud spend. These cost management services charge based on a percentage of your AWS spend. For a startup with limited resources, paying for these professional services can be challenging, but manually predicting cloud costs and optimizing spending is also difficult, as well as time consuming.

The takeaway for sticking with your current provider: Be a budget hawk for every fee you may be at risk of incurring, and ensure your development keeps you from unwittingly racking up heavy fees.

Path 2: Take a Multi-cloud Approach

For some developers, although you may want to switch to a different cloud after your free credits expire, your code can’t be easily separated from your cloud provider. In this case, a multi-cloud approach can achieve the necessary price point while maintaining the required level of service.

Short term, you can mitigate your cloud bill by immediately beginning to port any data you generate going forward to a more affordable solution. Even if the process of migrating your existing data is challenging, this move will stop your current bill from ballooning.

Beyond mitigation, there are multiple benefits to using a multi-cloud solution. A multi-cloud strategy gives companies the freedom to use the best possible cloud service for each workload. There are other benefits to taking a multi-cloud approach:

Redundancy: Some major providers have faced outages recently. A multi-cloud strategy allows you to have a backup of your data to continue serving your customers even if your primary cloud provider goes down.
Functionality: With so many providers introducing new features and services, it’s unlikely that a single cloud provider will meet all of your needs. With a multi-cloud approach, you can pick and choose the best services from each provider. Multinational companies can also optimize for their particular geographical regions.
Flexibility: Avoid vendor lock-in if you outgrow a single cloud provider with a diverse cloud infrastructure.
Cost: You may find that one cloud provider offers a lower price for compute and another for storage. A multi-cloud strategy allows you to pick and choose which works best for your budget.

The takeaway for pursuing multi-cloud: It might not solve your existing bill, but it will mitigate your exposure to additional fees going forward. And it offers the side benefit of providing a best-of-breed approach to your development tech stack.

Path 3: Find a New Cloud Provider

Finally, you can choose to move all of your data to a different cloud storage provider. We recommend taking a long-term approach: Look for cloud storage that allows you to scale with the least amount of friction while continuing to support everything you need for a good customer experience in your app. You’ll want to consider cost, usability, and solutions when looking for a new provider.

Cost

Many cloud providers use a multi-tier approach, which can become complex as your business starts to scale its cloud infrastructure. Switching to a provider that has single-tier pricing helps businesses planning for growth predict their cloud storage cost and optimize its spend, saving time and money for use on future opportunities. You can use this pricing calculator to check storage costs of Backblaze B2 Cloud Storage against AWS, Azure, and Google Cloud.

One example of a startup that saved money and was able to grow their business by switching to another storage provider is CloudSpot, a SaaS photography platform. They had initially gotten their business off the ground with the help of a startup incubator. Then in 2019, their AWS storage costs skyrocketed, but their team felt locked in to using Amazon.

When they looked at other cloud providers and eventually transferred their data out of AWS, they were able to save on storage costs that allowed them to reintroduce services they had previously been forced to shut down due to their AWS bill. Reviving these services made an immediate impact on customer acquisition and recurring revenue.

Usability

Time spent trying to navigate a complicated platform is a significant cost to business. Aiden Korotkin of AK Productions, a full-service video production company based in Washington, D.C., experienced this first hand. Korotkin initially stored his client data in Google Cloud because the platform had offered him a promotional credit. When the credits ran out in about a year, he found himself frustrated with the inefficiency, privacy concerns, and overall complexity of Google Cloud.

Korotkin chose to switch to Backblaze B2 Cloud Storage with the help of solution engineers that helped him figure out the best storage solution for his business. After quickly and seamlessly transferring his first 12TB in less than a day, he noticed a significant difference from using Google Cloud. “If I had to estimate, I was spending between 30 minutes to an hour trying to figure out simple tasks on Google (e.g. setting up a new application key, or syncing to a third-party source). On Backblaze it literally takes me five minutes,” he emphasized.

Integrations

Workflow integrations can make cloud storage easier to use and provide additional features. By selecting multiple best-of-breed providers, you can achieve better functionality with significantly reduced price and complexity.

Content delivery network (CDN) partnerships with Cloudflare and Fastly allow developers using services like Backblaze B2 to take advantage of free egress between the two services. Game developers can serve their games to users without paying egress between their origin source and their CDN, and media management solutions that can integrate directly with cloud storage to make media assets easy to find, sort, and pull into a new project or editing tool. Take a look at other solutions integrated with cloud storage that can support your workflows.

Cloud to Cloud Migration

After choosing a new cloud provider, you can plan your data migration. Your data may be spread out across multiple buckets, service providers, or different storage tiers—so your first task is discovering where your data is and what can and can’t move. Once you’re ready, there is a range of solutions for moving your data, but when it comes to moving between cloud services, a data migration tool like Flexify.IO can help make things a lot easier and faster.

Instead of manually offloading static and production data from your current cloud storage provider and reuploading it into your new provider, Flexify.IO reads the data from the source storage and writes it to the destination storage via inter-cloud bandwidth. Flexify.IO achieves fast and secure data migration at cloud-native speeds because the data transfer happens within the cloud environment.

Supercharged Data Migration with Flexify.IO

For developers with customer-facing applications, it’s especially important that customers still retain access to data during the migration from one cloud provider to another. When CloudSpot moved about 700TB of data from AWS to Backblaze B2 in just six days with help from Flexify.IO, customers were actually still uploading images to their Amazon S3 buckets. The migration process was able to support both environments and allowed them to ensure everything worked properly. It was also necessary because downtime was out of the question—customers access their data so frequently that one of CloudSpot’s galleries is accessed every one or two seconds.

Still deciding whether moving your data to a new cloud provider is right for you? Backblaze will cover data transfer and egress fees for customers migrating over 50TB of data from U.S., Canada, and Europe regions and store it for at least 12 months.

What’s Next?

If you’re interested in exploring a different cloud storage service for your solution, you can easily sign up today, or contact us for more information on how to run a free POC or just to begin transferring your data out of your current cloud provider.

The post Finding a 1Up When Free Cloud Credits Run Out appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze Hard Drive Stats for 2020

2021-01-26

Post Syndicated from original https://www.backblaze.com/blog/backblaze-hard-drive-stats-for-2020/

In 2020, Backblaze added 39,792 hard drives and as of December 31, 2020 we had 165,530 drives under management. Of that number, there were 3,000 boot drives and 162,530 data drives. We will discuss the boot drives later in this report, but first we’ll focus on the hard drive failure rates for the data drive models in operation in our data centers as of the end of December. In addition, we’ll welcome back Western Digital to the farm and get a look at our nascent 16TB and 18TB drives. Along the way, we’ll share observations and insights on the data presented and as always, we look forward to you doing the same in the comments.

2020 Hard Drive Failure Rates

At the end of 2020, Backblaze was monitoring 162,530 hard drives used to store data. For our evaluation, we remove from consideration 231 drives which were used for testing purposes and those drive models for which we did not have at least 60 drives. This leaves us with 162,299 hard drives in 2020, as listed below.

Observations

The 231 drives not included in the list above were either used for testing or did not have at least 60 drives of the same model at any time during the year. The data for all drives, data drives, boot drives, etc., is available for download on the Hard Drive Test Data webpage.

For drives which have less than 250,000 drive days, any conclusions about drive failure rates are not justified. There is not enough data over the year-long period to reach any conclusions. We present the models with less than 250,000 drive days for completeness only.

For drive models with over 250,000 drive days over the course of 2020, the Seagate 6TB drive (model: ST6000DX000) leads the way with a 0.23% annualized failure rate (AFR). This model was also the oldest, in average age, of all the drives listed. The 6TB Seagate model was followed closely by the perennial contenders from HGST: the 4TB drive (model: HMS5C4040ALE640) at 0.27%, the 4TB drive (model: HMS5C4040BLE640), at 0.27%, the 8TB drive (model: HUH728080ALE600) at 0.29%, and the 12TB drive (model: HUH721212ALE600) at 0.31%.

The AFR for 2020 for all drive models was 0.93%, which was less than half the AFR for 2019. We’ll discuss that later in this report.

What’s New for 2020

We had a goal at the beginning of 2020 to diversify the number of drive models we qualified for use in our data centers. To that end, we qualified nine new drives models during the year, as shown below.

Actually, there were two additional hard drive models which were new to our farm in 2020: the 16TB Seagate drive (model: ST16000NM005G) with 26 drives, and the 16TB Toshiba drive (model: MG08ACA16TA) with 40 drives. Each fell below our 60-drive threshold and were not listed.

Drive Diversity

The goal of qualifying additional drive models proved to be prophetic in 2020, as the effects of Covid-19 began to creep into the world economy in March 2020. By that time we were well on our way towards our goal and while being less of a creative solution than drive farming, drive model diversification was one of the tactics we used to manage our supply chain through the manufacturing and shipping delays prevalent in the first several months of the pandemic.

Western Digital Returns

The last time a Western Digital (WDC) drive model was listed in our report was Q2 2019. There are still three 6TB WDC drives in service and 261 WDC boot drives, but neither are listed in our reports, so no WDC drives—until now. In Q4 a total of 6,002 of these 14TB drives (model: WUH721414ALE6L4) were installed and were operational as of December 31^st.

These drives obviously share their lineage with the HGST drives, but they report their manufacturer as WDC versus HGST. The model numbers are similar with the first three characters changing from HUH to WUH and the last three characters changing from 604, for example, to 6L4. We don’t know the significance of that change, perhaps it is the factory location, a firmware version, or some other designation. If you know, let everyone know in the comments. As with all of the major drive manufacturers, the model number carries patterned information relating to each drive model and is not randomly generated, so the 6L4 string would appear to mean something useful.

WDC is back with a splash, as the AFR for this drive model is just 0.16%—that’s with 6,002 drives installed, but only for 1.7 months on average. Still, with only one failure during that time, they are off to a great start. We are looking forward to seeing how they perform over the coming months.

New Models From Seagate

There are six Seagate drive models that were new to our farm in 2020. Five of these models are listed in the table above and one model had only 26 drives, so it was not listed. These drives ranged in size from 12TB to 18TB and were used for both migration replacements as well as new storage. As a group, they totaled 13,596 drives and amassed 1,783,166 drive days with just 46 failures for an AFR of 0.94%.

Toshiba Delivers More Zeros

The new Toshiba 14TB drive (model: MG07ACA14TA) and the new Toshiba 16TB (model: MG08ACA16TEY) were introduced to our data centers in 2020 and they are putting up zeros, as in zero failures. While each drive model has only been installed for about two months, they are off to a great start.

Comparing Hard Drive Stats for 2018, 2019, and 2020

The chart below compares the AFR for each of the last three years. The data for each year is inclusive of that year only and for the drive models present at the end of each year.

The Annualized Failure Rate for 2020 Is Way Down

The AFR for 2020 dropped below 1% down to 0.93%. In 2019, it stood at 1.89%. That’s over a 50% drop year over year. So why was the 2020 AFR so low? The answer: It was a group effort. To start, the older drives: 4TB, 6TB, 8TB, and 10TB drives as a group were significantly better in 2020, decreasing from a 1.35% AFR in 2019 to a 0.96% AFR in 2020. At the other end of the size spectrum, we added over 30,000 larger drives: 14TB, 16TB, and 18TB, which as a group recorded an AFR of 0.89% for 2020. Finally, the 12TB drives as a group had a 2020 AFR of 0.98%. In other words, whether a drive was old or new, or big or small, they performed well in our environment in 2020.

Lifetime Hard Drive Stats

The chart below shows the lifetime annualized failure rates of all of the drives models in production as of December 31, 2020.

AFR and Confidence Intervals

Confidence intervals give you a sense of the usefulness of the corresponding AFR value. A narrow confidence interval range is better than a wider range, with a very wide range meaning the corresponding AFR value is not statistically useful. For example, the confidence interval for the 18TB Seagate drives (model: ST18000NM000J) ranges from 1.5% to 45.8%. This is very wide and one should conclude that the corresponding 12.54% AFR is not a true measure of the failure rate of this drive model. More data is needed. On the other hand, when we look at the 14TB Toshiba drive (model: MG07ACA14TA), the range is from 0.7% to 1.1% which is fairly narrow, and our confidence in the 0.9% AFR is much more reasonable.

3,000 Boot Drives

We always exclude boot drives from our reports as their function is very different from a data drive. While it may not seem obvious, having 3,000 boot drives is a bit of a milestone. It means we have 3,000 Backblaze Storage Pods in operation as of December 31st. All of these Storage Pods are organized into Backblaze Vaults of 20 Storage Pods each or 150 Backblaze Vaults.

Over the last year or so, we moved from using hard drives to SSDs as boot drives. We have a little over 1,200 SSDs acting as boot drives today. We are validating the SMART and failure data we are collecting on these SSD boot drives. We’ll keep you posted if we have anything worth publishing.

Are you interested in learning more about the trends in the 2020 drive stats? Join our upcoming webinar: “Backblaze Hard Drive Report: 2020 Year in Review Q&A” with drive stats author, Andy Klein, on February 3.

The Hard Drive Stats Data

The complete data set used to create the information used in this review is available on our Hard Drive Test Data page. You can download and use this data for free for your own purpose. All we ask are three things: 1) you cite Backblaze as the source if you use the data, 2) you accept that you are solely responsible for how you use the data, and 3) you do not sell this data to anyone; it is free.

If you just want the summarized data used to create the tables and charts in this blog post you can download the ZIP file containing the CSV files for each chart.

Good luck and let us know if you find anything interesting.

The post Backblaze Hard Drive Stats for 2020 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Q&A: Developing for the Data Transfer Project at Facebook

2021-01-21 Jeremy Milk

Post Syndicated from Jeremy Milk original https://www.backblaze.com/blog/qa-developing-for-the-data-transfer-project-at-facebook/

Facebook pointing at Backblaze Cloud

In October of 2020, we announced that Facebook integrated Backblaze B2 Cloud Storage as a data transfer destination for their users’ photos and videos. This secure, encrypted service, based on code that Facebook developed with the open-source Data Transfer Project, allows users choices for how and where they manage or archive their media.

We spoke with Umar Mustafa, the Facebook Partner Engineer who led the project, about his team’s role in the Data Transfer Project (DTP) and the development process in configuring the data portability feature for Backblaze B2 Cloud Storage using open-source code. Read on to learn about the challenges of developing data portability including security and privacy practices, coding with APIs, and the technical design of the project.

Q: Can you tell us about the origin of Facebook’s data portability project?

A: Over a decade ago, Facebook launched a portability tool that allowed people to download their information. Since then, we have been adding functionality for people to have more control over their data.

In 2018, we joined the Data Transfer Project (DTP), which is an open-source effort by various companies, like Google, Microsoft, Twitter, and Apple, that aims to build products to allow people to easily transfer a copy of their data between services. The DTP tackles common problems like security, bandwidth limitations, and just the sheer inconvenience when it comes to moving large amounts of data.

And so in connection with this project, we launched a tool in 2019 that lets people port their photos and videos. Google was the first destination and we have partnered with more companies since then, with Backblaze being the most recent one.

Q: As you worked on this tool, did you have a sense for the type of Facebook customer that chooses to copy or transfer their photos and videos over to cloud storage?

A: Yes, we thought of various ways that people could use the tool. Someone might want to try out a new app that manages photos or they might want to archive all the photos and videos they’ve posted over the years in a private cloud storage service.

Q: Would you walk us through the choice to develop it using the open-source DTP code?

A: In order to transfer data between two services, you’d typically use the API from the first service to read data, then transform it if necessary for the second service, and finally use the API from the second service to upload it. While this approach works, you can imagine that it requires a lot of effort every time you need to add a new source or destination. And an API change by any one service would force all its collaborators to make updates.

The DTP solves these problems by offering an open-source data portability platform. It consists of standard data models and a set of service adapters. Companies can create their import and export adapters, or for services with a public API, anyone can contribute the adapters to the project. As long as two services have adapters available for a specific data type (e.g. photos), that data can be transferred between them.

Being open-source also means anyone can try it out. It can be run locally using Docker, and can also be deployed easily in enterprise or cloud-based environments. At Facebook, we have a team that contributes to the project, and we encourage more people from the open-source community to join the effort. More details can be found about the project on GitHub.

Integrating a new service as a destination or a source for an existing data type normally requires adding two types of extensions, an auth extension and a transfer extension. The open-source code is well organized, so you can find all available auth extensions under the extensions/auth module and all transfer extensions under the extensions/data-transfer module, which you can refer to for guidance.

The auth extension only needs to be written once for a service and can be reused for each different data type that the service supports. Some common auth extensions, like OAuth, are already available in the project’s libraries folder and can be extended with very minimal code (mostly config). Alternatively, you can add your own auth extension as long as it implements the AuthServiceExtension interface.

A transfer extension consists of import adapters and export adapters for a service, and each of them is for a single data type. You’ll find them organized by service and data type in the extensions/data-transfer module. In order to add one, you’ll have to add a similar package structure, and write your adapter by implementing the Importer<a extends AuthData, T extends DataModel> interface using the respective AuthData and DataModel classes for the adapter.

For example, in Backblaze we created two import adapters, one for photos and one for videos. Each of them uses the TokenSecretAuthData containing the application key and secret. The photos importer uses the PhotosContainerResource as the DataModel and the videos importer uses the VideosContainerResource. Once you have the boilerplate code in place for the importer or exporter, you have to implement the required methods from the interface to get it working, using any relevant SDKs as you need. As Backblaze offers the Backblaze S3 Compatible APIs, we were able to use the AWS S3 SDK to implement the Backblaze adapters.

There’s a well written integration guide for the project on GitHub that you can follow for further details about integrating with a new service or data type.

Q: Why did you choose Backblaze as a storage endpoint?

A: We want people to be able to choose where they want to take their data. Backblaze B2 is a cloud storage of choice for many people and offers Backblaze S3 Compatible APIs for easy integration. We’re happy to see people using Backblaze to save a copy of their photos and videos.

Q: Can you tell us about the comprehensive security and compliance review you conducted before locking in on Backblaze?

A: Privacy and security is of utmost importance for us at Facebook. When engaging with any partner, we check that they comply with certain standards. Some of the things that help us evaluate a partner include:

Information security policies.
Privacy policies.
Third-party security certifications, as available.

We followed a similar approach to review the security and privacy practices that Backblaze follows, which are also demonstrated by various industry standard certifications.

Q: Describe the process of coding to Backblaze, anything you particularly enjoyed? Anything you found different or challenging? Anything surprising?

A: The integration for the data itself was easy to build. The Backblaze S3 Compatible APIs make coding the adapters pretty straightforward, and Backblaze has good documentation around that.

The only difference between Backblaze and our other existing destinations was with authentication. Most adapters in the DTP use OAuth for authentication, where users log in to each service before initiating a transfer. Backblaze is different as it uses API keys-based authentication. This meant that we had to extend the UI in our tool to allow users to enter their application key details and wire that up as TokenSecretAuthData to the import adapters to transfer jobs securely.

Q: What interested you in data portability?

A: The concept of data portability sparked my interest once I began working at Facebook. Coincidentally, I had recently wondered if it would be possible to move my photos from one cloud backup service to another, and I was glad to discover a project at Facebook addressing the issue. More importantly, I felt that the problem it solves is important.

Facebook is always looking for new ways to innovate, so it comes with an opportunity to potentially influence how data portability will be commonly used and perceived in the future.

Q: What are the biggest challenges for DTP? It seems to be a pretty active project three years after launch. Given all the focus on it, what is it that keeps the challenge alive? What areas are particularly vexing for the project overall?

One major challenge we’ve faced is around technical design—currently the tool has to be deployed and run independently as a single instance to be able to make transfers. This has its advantages and disadvantages. On one hand, any entity or individual can run the project completely and enable transfers to any of the available services as long as the respective credentials are available. On the other hand, in order to integrate a new service, you need to redeploy all the instances where you need that service.

At the moment, Google has their own instance of the project deployed on their infrastructure, and at Facebook we have done the same, as well. This means that a well-working partnership model is required between services to offer the service to their respective users. As one of the maintainers of the project, we try to make this process as swift and hassle-free as possible for new partners.

With more companies investing time in data portability, we’ve started to see increased improvements over the past few months. I’m sure we’ll see more destinations and data types offered soon.

The post Q&A: Developing for the Data Transfer Project at Facebook appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Development Roadmap: Power Up Apps With Go Programming Language and Cloud Storage

2020-12-15 Skip Levens

Post Syndicated from Skip Levens original https://www.backblaze.com/blog/development-roadmap-power-up-apps-with-go-programming-language-and-cloud-storage/

If you build apps, you’ve probably considered working in Go. After all, the open-source language has become more popular with developers every year since its introduction. With a reputation for simplicity in meeting modern programming needs, it’s no surprise that GitHub lists it as the 10th most popular coding language out there. Docker, Kubernetes, rclone—all developed in Go.

If you’re not using Go, this post will suggest a few reasons you might give it a shot in your next application, with a specific focus on another reason for its popularity: its ease of use in connecting to cloud storage—an increasingly important requirement as data storage and delivery becomes central to wide swaths of app development. With this in mind, the following content will also outline some basic and relatively straightforward steps to follow for building an app in Go and connecting it to cloud storage.

But first, if you’re not at all familiar with this programming language, here’s a little more background to get you started.

What Is Go?

Go (sometimes referred to as Golang) is a modern coding language that can perform as well as low-level languages like C, yet is simpler to program and takes full advantage of modern processors. Similar to Python, it can meet many common programming needs and is extensible with a growing number of libraries. However, these advantages don’t mean it’s necessarily slower—in fact, applications written in Go compile to a binary that runs nearly as fast as programs written in C. It’s also designed to take advantage of multiple cores and concurrency routines, compiles to machine code, and is generally regarded as being faster than Java.

Why Use Go With Cloud Storage?

No matter how fast or efficient your app is, how it interacts with storage is crucial. Every app needs to store content on some level. And even if you keep some of the data your app needs closer to your CPU operations, or on other storage temporarily, it still benefits you to use economical, active storage.

Here are a few of the primary reasons why:

Massive amounts of user data. If your application allows users to upload data or documents, your eventual success will mean that storage requirements for the app will grow exponentially.
Application data. If your app generates data as a part of its operation, such as log files, or needs to store both large data sets and the results of compute runs on that data, connecting directly to cloud storage helps you to manage that flow over the long run.
Large data sets. Any app that needs to make sense of giant pools of unstructured data, like an app utilizing machine learning, will operate faster if the storage for those data sets is close to the application and readily available for retrieval.

Generally speaking, active cloud storage is a key part of delivering ideal OpEx as your app scales. You’re able to ensure that as you grow, and your user or app data grows along with you, your need to invest in storage capacity won’t hamper your scale. You pay for exactly what you use as you use it.

Whether you buy the argument here, or you’re just curious, it’s easy and free to test out adding this power and performance to your next project. Follow along below for a simple approach to get you started, then tell us what you think.

How to Connect an App Written in Go With Cloud Storage

Once you have your Go environment set up, you’re ready to start building code in your main Gopath’s directory ($GOPATH). This example builds a Go app that connects to Backblaze B2 Cloud Storage using the AWS S3 SDK.

Next, create a bucket to store content in. You can create buckets programmatically in your app later, but for now, create a bucket in the Backblaze B2 web interface, and make note of the associated server endpoint.

Now, generate an application key for the tool, scope bucket access to the the new bucket only, and make sure that “Allow listing all bucket names” is selected:

Make note of the bucket server connection and app key details. Use a Go module—for instance, this popular one, called godotenv—to make the configuration available to the app that will look in the app root for a .env (hidden) file.

Create the .env file in the app root with your credentials:

With configuration complete, build a package that connects to Backblaze B2 using the S3 API and S3 Go packages.

First, import the needed modules:

Then create a new client and session that uses those credentials:

And then write functions to upload, download, and delete files:

Now, put it all to work to make sure everything performs.

In the main test app, first import the modules, including godotenv and the functions you wrote:

Read in and reference your configuration:

And now, time to exercise those functions and see files upload and download.

For example, this extraordinarily compact chunk of code is all you need to list, upload, download, and delete objects to and from local folders:

If you haven’t already, run go mod init to initialize the module dependencies, and run the app itself with go run backblaze_example_app.go.

Here, a listResult has been thrown in after each step with comments so that you can follow the progress as the app lists the number of objects in the bucket (in this case, zero), upload your specified file from the dir_upload folder, then download it back down again to dir_download:

Use another tool like rclone to list the bucket contents independently and verify the file was uploaded:

Or, of course, look in the Backblaze B2 web admin:

And finally, looking in the local system’s dir_download folder, see the file you downloaded:

With that—and code at https://github.com/GiantRavens/backblazeS3—you have enough to explore further, connect to Backblaze B2 buckets with the S3 API, list objects, pass in file names to upload, and more.

Get Started With Go and Cloud Storage

With your app written in Go and connected to cloud storage, you’re able to grow at hyperscale. Happy hunting!

If you’ve already built an app with Go and have some feedback for us, we’d love to hear from you in the comments. And if it’s your first time writing in Go, let us know what you’d like to learn more about!

The post Development Roadmap: Power Up Apps With Go Programming Language and Cloud Storage appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Code and Culture: What Happens When They Clash

2020-11-24 Lora Maslenitsyna

Post Syndicated from Lora Maslenitsyna original https://www.backblaze.com/blog/code-and-culture-what-happens-when-they-clash/

Every industry uses its own terminology. Originally, most jargon emerges out of the culture the industry was founded in, but then evolves over time as culture and technology change and grow. This is certainly true in the software industry. From its inception, tech has adopted terms—like hash, cloud, bug, ether, etc.—regardless of their original meanings and used them to describe processes, hardware issues, and even relationships between data architectures. Oftentimes, the cultural associations these terms carry with them are quickly forgotten, but sometimes they remain problematically attached.

In the software industry, the terms “master” and “slave” have been commonly used as a pair to identify a primary database (the “master”) where changes are written, and a replica (the “slave”) that serves as a duplicate to which the changes are propagated. The industry also commonly uses other terms, such as “blacklist” and whitelist,” whose definitions reflect or at least suggest identity-based categorizations, like the social concept of race.

Recently, the Backblaze Engineering team discussed some examples of language in the Backblaze code that carried negative cultural biases that the team, and the broader company, definitely didn’t endorse. Their conversation centered around the idea of changing the terms used to describe branches in our repositories, and we thought it would be interesting for the developers in our audience to hear about that discussion, and the work that came out of it.

Getting Started: An Open Conversation About Software Industry Standard Terms

The Backblaze Engineering team strives to cultivate a collaborative environment, an effort which is reflected in the structure of their weekly team meetings. After announcements, any member of the team is welcome to bring up any topics they want to discuss. As a result, these meetings work as a kind of forum where team members encourage each other to share their thoughts, especially about anything they might want to change related to internal processes or more generally about current events that may be affecting their thinking about their work.

Earlier this year, the team discussed the events that lead to protests in many U.S. cities as well as to new prominence for the Black Lives Matter movement. The conversation brought up a topic that had been discussed briefly before these events, but now had renewed relevance: mindfulness around terms used as a software industry standard that could reflect biases against certain people’s identities.

These conversations among the team did not start with the intention to create specific procedures, but focused on emphasizing awareness of words used within the greater software industry and what they might mean to different members of the community. Eventually, however, the team’s thinking progressed to include different words and concepts the Backblaze Engineering team resolved to adopt moving forward.

working on code on a laptop during an interview

Why Change the Branch Names?

The words “master” and “slave” have long held harmful connotations, and have been used to distance people from each other and to exclude groups of people from access to different areas of society and community. Their accepted use today as synonyms for database dependencies could be seen as an example of systemic racism: racist concepts, words, or practices embedded as “normal” uses within a society or an organization.

The engineers discussed whether the use of “master” and “slave” terminologies reflected an unconscious practice on the team’s part that could be seen as supporting systemic racism. In this case, the question alone forced them to acknowledge that their usage of these terms could be perceived as an endorsement of their historic meanings. Whether intentionally or not, this is something the engineers did not want to do.

The team decided that, beyond being the right thing to do, revising the use of these terms would allow them to reinforce Backblaze’s reputation as an inclusive place to work. Just as they didn’t want to reiterate any historically harmful ideas, they also didn’t want to keep using terms that someone on the team might feel uncomfortable using, or accidentally make potential new hires feel unwelcome on the team. Everything seemed to point them back to a core part of Backblaze’s values: the idea that we “refuse to take history or habit to mean something is ‘right.’” Oftentimes this means challenging stale approaches to engineering issues, but here it meant accepting terminology that is potentially harmful just because it’s “what everyone does.”

Overall, it was one of those choices that made more sense the longer they looked at it. Not only were the uses of “master” and “slave” problematic, they were also harder and less logical to use. The very effort to replace the words revealed that the dependency they described in the context of data architectures could be more accurately characterized using more neutral terms and shorter terms.

The Engineering team discussed a proposal to update the terms at a team meeting. In unanimous agreement, the term “main” was selected to replace “master” because it is a more descriptive title, it requires fewer keystrokes to type, and since it starts with the same letter as “master,” it would be easier to remember after the change. The terms “whitelist” and “blacklist” are also commonly used terms in tech, but the team decided to opt for “allowlist” and “denylist” because they’re more accurate and don’t associate color with value.

Rolling Out the Changes and Challenges in the Process

The practical procedure of changing the names of branches was fairly straightforward: Engineers wrote scripts that automated the process of replacing the terms. The main challenge that the Engineering team experienced was in coordinating the work alongside team members’ other responsibilities. Short of stopping all other projects to focus on renaming the branches, the engineers had to look for a way to work within the constraints of Gitea, the constraints of the technical process of renaming, and also avoid causing any interruptions or inconveniences for the developers.

First, the engineers prepared each repository for renaming by verifying that each one didn’t contain any files that referenced “master” or by updating files that referenced the “master” branch. For example, one script was going to be used for a repository that would update multiple branches at the same time. These changes were merged to a special branch called “master-to-main” instead of the “master” branch itself. That way, when that repository’s “master” branch was renamed, the “master-to-main” branch was merged into “main” as a final step. Since Backblaze has a lot of repositories, and some take longer than others to complete the change, people divided the jobs to help spread out the work.

While the actual procedure did not come with many challenges, writing the scripts required thoughtfulness about each database. For example, in the process of merging changes to the updated “main” branch in Git, it was important to be sure that any open pull requests, where the engineers review and approve changes to the code, were saved. Otherwise, developers would have to recreate them, and could lose history of their work, changes, and other important comments from projects unrelated to the renaming effort. While writing the script to automate the name change, engineers were careful to preserve any existing or new pull requests that might have been created at the same time.

Once they finished prepping the repositories, the team agreed on a period of downtime—evenings after work—to go through each repository and rename its “master” branch using the script they had previously written. Afterwards, each person had to run another short script to pick up the change and remove dangling references to the “master” branch.

Managers also encouraged members of the Engineering team to set aside some time throughout the week to prep the repositories and finish the naming changes. Team members also divided and shared the work, and helped each other by pointing out any areas of additional consideration.

Moving Forward: Open Communication and Collaboration

In September, the Engineering team completed renaming the source control branch from “master” to “main.” It was truly a team effort that required unanimous support and time outside of regular work responsibilities to complete the change. Members of the Engineering team reflected that the project highlighted the value of having a diverse team where each person brings a different perspective to solving problems and new ideas.

Earlier this year, some of the people on the Engineering team also became members of the employee-led Diversity, Equity, and Inclusion Committee. Along with Engineering, other teams are having open discussions about diversity and how to keep cultivating inclusionary practices throughout the organization. The full team at Backblaze understands that these changes might be small in the grand scheme of things, but we’re hopeful our intentional approach to those issues we can address will encourage other business and individuals to look into what’s possible for them.

The post Code and Culture: What Happens When They Clash appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Development Simplified: CORS Support for Backblaze S3 Compatible APIs

2020-11-19 Amrit Singh

Post Syndicated from Amrit Singh original https://www.backblaze.com/blog/development-simplified-cors-support-for-backblaze-s3-compatible-apis/

Since its inception in 2009, Cross-Origin Resource Sharing (CORS) has offered developers a convenient way of bypassing an inherently secure default setting—namely the same-origin policy (SOP). Allowing selective cross-origin requests via CORS has saved developers countless hours and money by reducing maintenance costs and code complexity. And now with CORS support for Backblaze’s recently launched S3 Compatible APIs, developers can continue to scale their experience without needing a complete code overhaul.

If you haven’t been able to adopt Backblaze B2 Cloud Storage in your development environment because of issues related to CORS, we hope this latest release gives you an excuse to try it out. Whether you are using our B2 Native APIs or S3 Compatible APIs, CORS support allows you to build rich client-side web applications with Backblaze B2. With the simplicity and affordability this service offers, you can put your time and money back to work on what’s really important: serving end users.

Top Three Reasons to Enable CORS

B2 Cloud Storage is popular among agile teams and developers who want to take advantage of easy to use and affordable cloud storage while continuing to seamlessly support their applications and workflows with minimal to no code changes. With Backblaze S3 Compatible APIs, pointing to Backblaze B2 for storage is dead simple. But if CORS is key to your workflow, there are three additional compelling reasons for you to test it out today:

Compatible storage with no re-coding. By enabling CORS rules for your custom web application or SaaS service that uses our S3 Compatible APIs, your development team can serve and upload data via B2 Cloud Storage without any additional coding or reconfiguring required. This will save you valuable development time as you continue to deliver a robust experience for your end users.
Seamless integration with your plugins. Even if you don’t choose B2 Cloud Storage as the primary backend for your business but you do use it for discreet plugins or content serving sites, enabling CORS rules for those applications will come in handy. Developers who configure PHP, NodeJS, and WordPress plugins via the S3 Compatible APIs to upload or download files from web applications can do so easily by enabling CORS rules in their Backblaze B2 Buckets. With CORS support enabled, these plugins work seamlessly.
Serving your web assets with ease. Consider an even simpler scenario in which you want to serve a custom web font from your B2 Cloud Storage Bucket. Most modern browsers will require a preflight check for loading the font. By configuring the CORS rules in that bucket to allow the font to be served in the origin(s) of your choice, you will be able to use your custom font seamlessly across your domains from a single source.

Whether you are relying on B2 Cloud Storage as your primary cloud infrastructure for your web application or simply using it to serve cross-origin assets such as fonts or images, enabling CORS rules in your buckets will allow for proper and secure resource sharing.

Enabling CORS Made Simple and Fast

If your web page or application is hosted in a different origin from images, fonts, videos, or stylesheets stored in B2 Cloud Storage, you need to add CORS rules to your bucket to achieve proper functionality. Thankfully, enabling CORS rules is easy and can be found in your B2 Cloud Storage settings:

You will have the option of sharing everything in your bucket with every origin, select origins, or defining custom rules with the Backblaze B2 CLI.

Learning More and Getting Started

If you’re dying to learn more about the fundamentals of CORS as well as additional specifics about how it works with B2 Cloud Storage, you can dig into this informative Knowledge Base article. If you’re just pumped that CORS is now easily available in our S3 Compatible APIs suite, well then, you’re probably already on your way to a smoother, more reasonably priced development experience. If you’ve got a question or a response, we always love to hear from you in the comments or you can contact us for assistance.

The post Development Simplified: CORS Support for Backblaze S3 Compatible APIs appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Solution Roadmap: Cross-site Collaboration With the Synology NAS Toolset

2020-11-17 Janet Lafleur

Post Syndicated from Janet Lafleur original https://www.backblaze.com/blog/solution-roadmap-cross-site-collaboration-with-the-synology-nas-toolset/

Most teams use their Synology NAS device primarily as a common space to store active data. It’s helpful for collaboration and cuts down on the amount of storage you need to buy for each employee in a media workflow. But if your teams are geographically dispersed, a NAS device at each location will also allow you to sync specific folders across offices and protect the data in them with more reliable and non-duplicative workflows. By setting up an integrated cloud storage tier and using Synology Drive ShareSync, Cloud Sync, and Hyper Backup—all free tools that come with the purchase of your NAS device—you can improve your collaboration capabilities further, and simplify and strengthen data protection for your NAS.

Drive ShareSync: Synchronizes folders and files across linked NAS devices.
Cloud Sync: Copies files to cloud storage automatically as they’re created or changed.
Hyper Backup: Backs up file and systems data to local or cloud storage.

Taken together, these tools, paired with a reasonable and reliable cloud storage, will grow your remote collaboration capacity while better protecting your data. Properly architected, they can make sharing and protecting large files easy, efficient, and secure for internal production, while also making it all look effortless for external clients’ approval and final delivery.

We’ll break out how it all works in the sections below. If you have questions, please reach out in the comments, or contact us.

If you’re more of a visual learner, our Cloud University series also offers an on-demand webinar featuring a demo laboratory showing how to set up cross-office collaboration on a Synology NAS. Otherwise, read on.

In a multi-site file exchange configuration, Synology NAS devices are synced between offices, while cloud storage provides an archive and backup storage target for Synology **Cloud Sync** and **Hyper Backup**.

Synchronizing Two or More NAS Devices With Synology Drive ShareSync

Moving media files to a NAS is a great first step towards easier sharing and ensuring that everyone on the team is working on the correct version of any given project. But taking an additional step to also sync folders across multiple NAS devices guarantees that each file is only transferred between sites once, instead of every time a team member accesses the file. This is also a way to reduce network traffic and share large media files that would otherwise require more time and resources.

With Synology Drive ShareSync, you can also choose which specific folders to sync, like folders with corporate brand images or folders for projects which team members across different offices are working on. You also have the option between a one-way and two-way sync, and Synology Drive ShareSync automatically filters out temporary files so that they’re not replicated from primary to secondary.

With Synology **Drive ShareSync**, specific folders on NAS devices can be synced in a two-way or one-way fashion.

Backing Up and Archiving Media Files With Synology Cloud Sync and Cloud Storage

With Cloud Sync, another tool included with your Synology NAS, you can make a copy of your media files to a cloud storage bucket as soon as they are ingested into the NAS. For creative agencies and corporate video groups that work with high volumes of video and images, syncing data to the cloud on ingest protects the data while it’s active and sets up an easy way to archive it once the project is complete. Here’s how it works:

1. After a multiple day video or photo shoot, upload the source media files to your Synology NAS. When new media files are found on the NAS, Synology Cloud Sync makes a copy of them to cloud storage.

2. While the team works on the project, the copies of the media files in the cloud storage bucket serve as a backup in case a file is accidentally deleted or corrupted on the NAS.

3. Once the team completes the project, you can switch off Synology Cloud Sync for just that folder, then delete the raw footage files from the NAS. This allows you to free up storage space for a new project.

4. The video and photo files remain in the bucket for the long term, serving as archive copies for future use or when a client returns for another project.

You can configure Synology **Cloud Sync** to watch folders for new files in specific time periods and control the upload speed to prevent saturating your internet connection.

Using Cloud Sync for Content Review With External Clients

Cloud Sync can also be used to simplify and speed up the editorial review process with clients. Emailing media files like videos and high-res images to external approvers is generally not feasible due to size, and setting up and maintaining FTP servers can be time consuming for you and complicated or confusing for your clients. It’s not an elegant way to put your best creative work in front of them. To simplify the process, create content review folders for each client, generate a link to a ZIP file in a bucket, and share the link with them via email.

Protecting Your NAS Data With Synology Hyper Backup and Backblaze B2

Last, but not least, Synology Hyper Backup can also be configured to do weekly full backups and daily incremental backups of all your NAS data to your cloud storage bucket. Disks can crash and valuable files can be deleted or corrupted, so ensuring you have complete data protection is an essential step in your storage infrastructure.

Hyper Backup will allow you to back up files, folders, and other settings to another destination (like cloud storage) according to a schedule. It also offers flexible retention settings, which allow you to restore an entire shared folder from different points in time. You can learn about how to set it up using this Knowledge Base article.

With Hyper Backup, you gain more control over setting up and managing weekly and daily backups to cloud storage. You can:

Encrypt files before transferring them, so that your data will be stored as encrypted files.
Choose to only encrypt files during the transfer process.
Enable an integrity check to confirm that files were backed up correctly and can be successfully restored.
Set integrity checks to run at specific frequencies and times.

Human error is often the inspiration to reach for a backup, but ransomware attacks are on the rise, and a strategy of recycle and rotation practices alongside file encryption helps backups remain unreachable by a ransomware infection. Hyper Backup allows for targeted backup approaches, like saving hourly versions from the previous 24 hours of work, daily versions from the previous month of work, and weekly versions from older than one month. You choose what makes the most sense for your work. You can also set a maximum number of versions if there’s a certain cap you don’t want to exceed. Not only do these smart recycle and rotation practices manage your backups to help protect your organization against ransomware, but they can also reduce storage costs.

**Hyper Backup** allows you to precisely configure which folders to back up. In this example, raw video footage is excluded because a copy was made by **Cloud Sync** on upload with the archive-on-ingest strategy.

Set Up Multi-site File Exchange With Synology NAS and Cloud Storage

To learn more about how you can set up your Synology NAS with cloud storage to implement a collaboration and data protection solution like this, one of our solutions engineers recently crafted a guide outlining how to do so with our cloud storage solution.

At the end of the day, collaboration is the soul of much creative work, and orienting your system to make the nuts and bolts of collaboration invisible to the creatives themselves, while ensuring all their content is fully protected, will set your team up for the greatest success. Synology NAS, its impressive built-in software suite, and cloud storage can help you get there.

The post Solution Roadmap: Cross-site Collaboration With the Synology NAS Toolset appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Vanguard Perspectives: Microsoft 365 to Veeam Backup to Backblaze B2 Cloud Storage

2020-11-12 Natasha Rabinov

Post Syndicated from Natasha Rabinov original https://www.backblaze.com/blog/vanguard-perspectives-microsoft-365-to-veeam-backup-to-backblaze-b2-cloud-storage/

Ben Young works for vBridge, a cloud service provider in New Zealand. He specializes in the automation and integration of a broad range of cloud & virtualization technologies. Ben is also a member of the Veeam® Vanguard program, Veeam’s top-level influencer community. (He is not an employee of Veeam). Because Backblaze’s new S3 Compatible APIs enable Backblaze B2 Cloud Storage as an endpoint in the Veeam ecosystem, we reached out to Ben, in his role as a Veeam Vanguard, to break down some common use cases for us. If you’re working with Veeam and Microsoft 365, this post from Ben could help save you some time and headaches.

—Natasha Rabinov, Backblaze

Backing Up Microsoft Office 365 via Veeam in Backblaze B2 Cloud Storage

Veeam Backup for Microsoft Office 365 v4 included a number of enhancements, one of which was the support for object-based repositories. This is a common trend for new Veeam product releases. The flagship Veeam Backup & Replication product now supports a growing number of object enabled capabilities.

So, why object storage over block-based repositories? There are a number of reasons but scalability is, I believe, the biggest. These platforms are designed to handle petabytes of data with very good durability, and object storage is better suited to that task.

With the data scalability sorted, you only need to worry about monitoring and scaling out the compute workload of the proxy servers (worker nodes). Did I mention you no longer need to juggle data moves between repositories?! These enhancements create a number of opportunities to simplify your workflows.

So naturally, with the recent announcement from Backblaze saying they now have S3 Compatible API support, I wanted to try it out with Veeam Backup for Microsoft Office 365.
Let’s get started. You will need:

A Backblaze B2 account: You can create one here for free. The first 10GB are complimentary so you can give this a go without even entering a credit card.
A Veeam Backup for Microsoft Office 365 environment setup: You can also get this for free (up to 10 users) with their Community Edition.
An organization connected to the Veeam Backup for Microsoft Office 365 environment: View the options and how-to guide here.

Configuring Your B2 Cloud Storage Bucket

In the Backblaze B2 console, you need to create a bucket. If you already have one, you may notice that there is a blank entry next to “endpoint.” This is because buckets created before May 4, 2020 cannot be used with the Backblaze S3 Compatible APIs.

So, let’s create a new bucket. I used “VeeamBackupO365.”

This bucket will now appear with an S3 endpoint, which we will need for use in Veeam Backup for Microsoft Office 365.

Before you can use the new bucket, you’ll need to create some application keys/credentials. Head into the App Keys settings in Backblaze and select “create new.” Fill out your desired settings and, as good practice, make sure you only give access to this bucket, or the buckets you want to be accessible.

Your application key(s) will now appear. Make sure to save these keys somewhere secure, such as a password manager, as they only will appear once. You should also keep them accessible now as you are going to need them shortly.

The Backblaze setup is now done.

Configuring Your Veeam Backup

Now you’ll need to head over to your Veeam Backup for Microsoft Office 365 Console.

Note: You could also achieve all of this via PowerShell or the RESTful API included with this product if you wanted to automate.

It is time to create a new backup repository in Veeam. Click into your Backup Infrastructure panel and add a new backup repository and give it a name…

…Then select the “S3 Compatible” option:

Enter the S3 endpoint you generated earlier in the Backblaze console into the Service endpoint on the Veeam wizard. This will be something along the lines of: s3.*.backblazeb2.com.
Now select “Add Credential,” and enter the App Key ID and Secret that you generated as part of the Backblaze setup.

With your new credentials selected, hit “Next.” Your bucket(s) will now show up. Select your desired backup bucket—in this case I’m selecting the one I created earlier: “VeeamBackupO365.” Now you need to browse for a folder which Veeam will use as its root folder to base the backups from. If this is a new bucket, you will need to create one via the Veeam console like I did below, called “Data.”

If you are curious, you can take a quick look back in your Backblaze account, after hitting “Next,” to confirm that Veeam has created the folder you entered, plus some additional parent folders, as you can see in the example below:

Now you can select your desired retention. Remember, all jobs targeting this repository will use this retention setting, so if you need a different retention for, say, Exchange and OneDrive, you will need two different repositories and you will need to target each job appropriately.

Once you’ve selected your retention, the repository is ready for use and can be used for backup jobs.

Now you can create a new backup job. For this demo, I am going to only back up my user account. The target will be our new repository backed by Backblaze S3 Compatible storage. The wizard walks users through this process.

Select your entire organization or desired users/groups and what to process (Exchange, OneDrive, and/or Sharepoint).

Select the object-backed backblazeb2-s3 backup repository you created.

That is it! Right click and run the job—you can see it starting to process your organization.
As this is the first job you’ve run, it may take some time and you might notice it slowing down. This slow down is a result of the Microsoft data being pulled out of O365. But Veeam is smart enough to have added in some clever user-hopping, so as it detects throttling it will jump across and start a new user, and then loop back to the others to ensure your jobs finish as quickly as possible.

While this is running, if you open up Backblaze again you will see the usage starting to show.

Done and Done

And there it is—a fully functional backup of your Microsoft Office 365 tenancy using Veeam Backup for Microsoft Office 365 and Backblaze B2 Cloud Storage.

We really appreciate Ben’s guide and hope it helps you try out Backblaze as a repository for your Veeam data. If you do—or if you’ve already set us as a storage target—we’d love to hear how it goes in the comments.

You can reach out to Ben at @benyoungnz on Twitter, or his blog, https://benyoung.blog.

The post Vanguard Perspectives: Microsoft 365 to Veeam Backup to Backblaze B2 Cloud Storage appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Gladstone Institutes Builds a Backblaze Fireball, XXXL Edition

2020-11-10

Post Syndicated from original https://www.backblaze.com/blog/gladstone-institutes-builds-a-backblaze-fireball-xxxl-edition/

Here at Backblaze, we’ve been known to do things a bit differently. From Storage Pods and Backblaze Vaults to drive farming and hard drive stats, we often take a different path. So, it’s no surprise we love stories about people who think outside of the box when presented with a challenge. This is especially true when that story involves building a mongo storage server, a venerable Toyota 4Runner, and a couple of IT engineers hell-bent on getting 1.2 petabytes of their organization’s data off-site. Let’s meet Alex Acosta and Andrew Davis of Gladstone Institutes.

Data on the Run

The security guard at the front desk nodded knowingly as Alex and Andrew rolled the three large Turtle cases through the lobby and out the front door of Gladstone Institutes. Well known and widely respected, the two IT engineers comprised two-thirds of the IT Operations staff at the time and had 25 years of Gladstone experience between them. So as odd as it might seem to have IT personnel leaving a secure facility after-hours with three large cases, everything was on the up-and-up.

It was dusk in mid-February. Alex and Andrew braced for the cold as they stepped out into the nearly empty parking lot toting the precious cargo within those three cases. Andrew’s 4Runner was close, having arrived early that day—the big day, moving day. They gingerly lugged the heavy cases into the 4Runner. Most of the weight was the cases themselves, the rest of one was a 4U storage server, and in the other two, 36 hard drives. An insignificant part of the weight, if any at all, was the reason they were doing all of this—200 terabytes of Gladstone Institutes research data.

They secured the cases, slammed the tailgate shut, climbed into the 4Runner, and put the wheels in motion for the next part of their plan. They eased onto Highway 101 and headed south. Traffic was terrible, even the carpool lane; dinner would be late, like so many dinners before.

Back to the Beginning

There had been many other late nights since they started on this project six months before. The Fireball XXXL project, as Alex and Andrew eventually named it, was driven by their mission to safeguard Gladstone’s biomedical research data from imminent disaster. On an unknown day in mid-summer, Alex and Andrew were in the server room at Gladstone surrounded by over 900 tapes that were posing as a backup system.

Andrew mused, “It could be ransomware, the building catches on fire, somebody accidentally deletes the datasets because of a command-line entry, any number of things could happen that would destroy all this.” Alex, as he waved his hand across the ever expanding tape library, added, “We can’t rely on this anymore. Tapes are cumbersome, messy and they go bad even when you do everything right. We waste so much time just troubleshooting things that in 2020 we shouldn’t be troubleshooting anymore.” They resolved to find a better way to get their data off-site.

Reality Check

Alex and Andrew listed the goals for their project: get the 1.2 petabytes of data currently stored on-site and in their tape library safely off-site, be able to add 10–20 terabytes of new data each day, and be able to delete files as they needed along the way. The fact that practically every byte of data in question represented biomedical disease research—including data with direct applicability to fighting a global pandemic—meant that they needed to accomplish all of the above with minimal downtime and maximum reliability. Oh, and they had to do all of this without increasing their budget. Optimists.

With cloud storage as the most promising option, they first considered building their own private cloud in the distant data center in the desert. They quickly dismissed the idea as the upfront costs were staggering, never mind the ongoing personnel and maintenance costs of managing their distant systems.

They decided the best option was using a cloud storage service and they compared the leading vendors. Alex was familiar with Backblaze, having followed the blog for years, especially the posts on drive stats and Storage Pods. Even better, the Backblaze B2 Cloud Storage service was straight-forward and affordable. Something he couldn’t say about the other leading cloud storage vendors.

The next challenge was bandwidth. You might think having a 5 Gb/s connection would be enough, but they had a research-heavy, data-hungry organization using that connection. They sharpened their bandwidth pencils and, taking into account institutional usage, they calculated they could easily support the 10–20 terabytes per day uploads. Trouble was, getting the existing 1.2 petabytes of data uploaded would be another matter entirely. They contacted their bandwidth provider and were told they could double their current bandwidth to 10 Gb/s for a multi-year agreement at nearly twice the cost and, by the way, it would be several months to a year before they could start work. Ouch.

They turned to Backblaze, who offered their Backblaze Fireball data transfer service which could upload about 70 terabytes per trip. “Even with the Fireball, it will take us 15, maybe 20, round trips,” lamented Andrew during another late night session of watching backup tapes. “I wish they had a bigger box,” said Alex, to which Andrew replied, “Maybe we could build one.”

The plan was born: build a mongo storage server, load it with data, take it to Backblaze.

Photo Credit: Gladstone Institutes.

Andrew Davis in Gladstone’s server room.

The Ask

Before they showed up at a Backblaze data center with their creation, they figured they should ask Backblaze first. Alex noted, “With most companies if you say, ‘Hey, I want to build a massive file server, shuttle it into your data center, and plug it in. Don’t you trust me?’ They would say, ‘No,’ and hang up, but Backblaze didn’t, they listened.”

After much consideration, Backblaze agreed to enable Gladstone personnel to enter a nearby data center that was a peering point for the Backblaze network. Thrilled to find kindred spirits, Alex and Andrew now had a partner in the Fireball XXXL project. While this collaboration was a unique opportunity for both parties, for Andrew and Alex it would also mean more late nights and microwaved burritos. That didn’t matter now, they felt like they had a great chance to make their project work.

The Build

Alex and Andrew had squirreled away some budget for a seemingly unrelated project: to build an in-house storage server to serve as a warm backup system for currently active lab projects. That way if anything went wrong in a lab, they could retrieve the last saved version of the data as needed. Using those funds, they realized they could build something to be used as their supersized Fireball XXXL, and then once the data transfer cycles were finished, they could repurpose the system to be the backup server they had budgeted.

Inspired by Backblaze’s open-source Storage Pod, they worked with Backblaze on the specifications for their Fireball XXXL. They went the custom build route starting with a 4U chassis and big drives, and then they added some beefy components of their own.

Fireball XXXL

Chassis: 4U Supermicro 36-bay, 3.5 in disc chassis, built by iXsystems.
Processor: Dual CPU Intel Xeon Gold 5217.
RAM: 4 x 32GB (128GB).
Data Drives: 36 14TB HE14 from Western Digital.
ZIL: 120GB NVMe SSD.
L2ARC: 512GB SSD.

They basically built a 36-bay, 200 terabyte RAID 1+0 system to do the data replication using rclone. Andrew noted, “Rclone is resource-heavy, both on RAM and CPU cycles. When we spec’d the system we needed to make sure we had enough muscle so rclone could push data at 10 Gb/s. It’s not just reading off the drives; it’s the processing required to do that.”

Loading Up

Gladstone runs TrueNAS on their on-premise production systems so it made sense to use it on their newly built data transfer server. “We were able to do a ZFS send from our in-house servers to what looked like a gigantic external hard drive, for lack of a better description,” Andrew said. “It allowed us to replicate at the block level, compressed, so it was much higher performance in copying data over to that system.”

Andrew and Alex had previously determined that they would start with the four datasets that were larger than 40 terabytes each. Each dataset represented years of research from their respective labs, placing them at the top of the off-site backup queue. Over the course of 10 days, they loaded the Fireball XXXL with the data. Once finished, they shut the system down and removed the drives. Opening the foam lined Turtle cases they had previously purchased, they gingerly placed the chassis into one case and the 36 drives in the other two. They secured the covers and headed towards the Gladstone lobby.

At the Data Center

Alex and Andrew eventually arrived at the data center where they’d find the needed Backblaze network peering point. Upon entry, inspections ensued and even though Backblaze had vouched for the Gladstone chaps, the process to enter was arduous. As it should be. Once in their assigned room, they connected a few cables, typed in a few terminal commands and data started uploading to their Backblaze B2 account. The Fireball XXXL performed as expected, with a sustained transfer rate of between eight and 10 Gb/s. It took a little over three days to upload all the data.

They would make another trip a few weeks later and have planned two more. With each trip, more Gladstone data is safely stored off-site.

Gladstone Institutes, with over 40 years of history behind them and more than 450 staff, is a world leader in the biomedical research fields of cardiovascular and neurological diseases, genomic immunology, and virology, with some labs recently shifting their focus to SARS-CoV-2, the virus that causes COVID-19. The researchers at Gladstone rely on their IT team to protect and secure their life-saving research.

Photo Credit: Gladstone Institutes.

When data is literally life-saving, backing up is that much more important.

Epilogue

Before you load up your 200 terabyte media server into the back of your SUV or pickup and head for a Backblaze data center—stop. While we admire the resourcefulness of Andrew and Alex, on our side the process was tough. The security procedures, associated paperwork, and time needed to get our Gladstone heroes access to the data center and our network with their Fireball XXXL were “substantial.” Still, we are glad we did it. We learned a tremendous amount during the process, and maybe we’ll offer our own Fireball XXXL at some point. If we do, we know where to find a couple of guys who know how to design one kick-butt system. Thanks for the ride, gents.

The post Gladstone Institutes Builds a Backblaze Fireball, XXXL Edition appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Supporting Efficient Cloud Workflows at THEMA

2020-11-05 Steve Ferris

Post Syndicated from Steve Ferris original https://www.backblaze.com/blog/supporting-efficient-cloud-workflows-at-thema/

Editor’s Note: As demand for localized entertainment grows rapidly around the globe, the amount of media that production companies handle has skyrocketed at the same time as the production process has become endlessly diverse. In a recent blog post, iconik highlighted one business that uses Backblaze B2 Cloud Storage and the iconik asset management platform together to develop a cloud-based, resource-efficient workflow perfectly suited to their unique needs. Read on for some key learnings from their case study, which we’ve adapted for our blog.

Celebrating Culture With Content

THEMA is a Canal+ Group company that has more than 180 television channels in its portfolio. It helps with the development of these channels and has built strong partnerships with major
pay-TV platforms worldwide.

THEMA started with a focus on ethnic, localized entertainment, and has grown that niche into the foundation of a large, expansive service. Today, THEMA has a presence in nearly every region of the world, where viewers can enjoy programming that celebrates their heritage and offers a taste of home wherever they are.

Cédric Pierre-Louis, Director of Programming for the African Fiction Channels at THEMA, and Gareth Howells, Director of Out Point Media—which was created to assist THEMA quality control and content operations, mainly for its African channels—faced a problem shared by many media organizations: As demand for their content rose, so did the amount of media they were handling. To the extent that their systems were not able to scale with their growth.

A Familiar Challenge

Early on, most media asset management solutions that the African Fiction Channels at THEMA considered for dealing with their expanding content needs had a high barrier to entry, requiring large upfront investments. To stay cost-efficient, THEMA used more manual solutions, but this would eventually prove to be an unsustainable path.

As THEMA moved into creating and managing channels, the increase of content and the added complexity of their workflows brought the need for media management front and center.

Charting a Course for Better Workflows

When Cédric took on leadership of his department at THEMA, he and Gareth both shared a strong desire to make their workflows more agile and efficient. They began by evaluating solutions using a few key points.

Cloud-Based
To start, THEMA needed a solution that could improve how they work across all their global teams. The operation needed to work from anywhere, supporting team members working in Paris and London, as well as content production teams in Nigeria, Ghana, and Ivory Coast.

Minimal Cloud Resources
There was also a unique challenge to overcome with connectivity and bandwidth restrictions facing the distributed teams. They needed a light solution requiring minimal cloud resources. Teams with limited internet access would also need immediate access to the content when they were online.

Proxy Workflows
They also couldn’t afford to continue working with large files. Previously, teams had to upload hi-res, full master versions of media, which then had to be downloaded by every editor who worked on the project. They needed proxy workflows to allow creation to happen faster with much smaller files.

Adobe Integration
The team needed to be able to find content fast and have the ability to simply drag it into their timelines from a panel within their Adobe programs. This ability to self serve and find media without any bottlenecks would have a great impact on production speed.

Affordable Startup Costs
They also needed to stay within a budget. There could not be any costly installation of new infrastructure.

Landing at iconik

While Cédric was searching for the right solution, he took a trip to Stockholm, where he met with iconik’s CEO, Parham Azimi. After a short talk and demo, it was clear that iconik satisfied all of the evaluation points they were looking for in one solution. Soon after that meeting, Cédric and Gareth began to implement iconik with the help of IVORY, who represents iconik in France.

A note on storage: As a storage option within iconik, Backblaze B2 offers teams storage that is both oriented to their use case and economically priced. THEMA needed simple, cloud-based storage with a price tag that was both right-sized and predictable, and in selecting Backblaze B2, they got it.

Today, THEMA uses iconik as a full content management system that offers nearly end-to-end control for their media workflows.

This is how they utilize iconik for their broadcast work:

1. Film and audio is created at the studios in Nigeria and Ghana.

2. The media is uploaded to Backblaze B2.

3. Backblaze B2 assets are then represented in iconik as proxies.

4. Quality control and compliance teams use the iconik Adobe panel with proxy versions for quality control, checking compliance, and editing.

5. Master files are downloaded to create the master copy.

6. The master copy is distributed for playout.

While all this is happening, the creative teams at THEMA can also access content in iconik to edit promotional media.

Visions to Expand iconik’s Use

With the experience THEMA has had so far, the team is excited to implement iconik for even more of their workflows. In the future, they plan to integrate iconik with their broadcast management system to share metadata and files with their playout system. This would save a lot of time and work, as much of the data in iconik is also relevant for the media playout system.

Further into the future, THEMA hopes to achieve a total end to end workflow with iconik. The vision is to use iconik as soon as a movie comes in, so their team can put it through all the steps in a workflow such as quality control, compliance, transcoding, and sending media to third parties for playout or VOD platforms.

For this global team that needed their media managed in a way that would be light and resource efficient, iconik—with the storage provided by Backblaze B2—delivered in a big way.

Looking for a similar solution? Get started with Backblaze B2 and learn more about our integration with iconik today.

The post Supporting Efficient Cloud Workflows at THEMA appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Rclone Power Moves for Backblaze B2 Cloud Storage

2020-11-03 Skip Levens

Post Syndicated from Skip Levens original https://www.backblaze.com/blog/rclone-power-moves-for-backblaze-b2-cloud-storage/

Rclone is described as the “Swiss Army chainsaw” of storage movement tools. While it may seem, at first, to be a simple tool with two main commands to copy and sync data between two storage locations, deeper study reveals a hell of a lot more. True to the image of a “Swiss Army chainsaw,” rclone contains an extremely deep and powerful feature set that empowers smart storage admins and workflow scripters everywhere to meet almost any storage task with ease and efficiency.

Rclone—rsync for cloud storage—is a powerful command line tool to copy and sync files to and from local disk, SFTP servers, and many cloud storage providers. Rclone’s Backblaze B2 Cloud Storage page has many examples of configuration and options with Backblaze B2.

Continued Steps on the Path to rclone Mastery

In our in-depth webinar with Nick Craig-Wood, developer and principal maintainer of rclone, we discussed a number of power moves you can use with rclone and Backblaze B2. This post takes it a number of steps further with five more advanced techniques to add to your rclone mastery toolkit.
Have you tried these and have a different take? Just trying them out for the first time? We hope to hear more and learn more from you in the comments.

Use `--track-renames` to Save Bandwidth and Increase Data Movement Speed

If you’re moving files constantly from disk to the cloud, you know that your users frequently re-organize and rename folders and files on local storage. Which means that when it’s time to back up those renamed folders and files again, your object storage will see the files as new objects and will want you to re-upload them all over again.

Rclone is smart enough to take advantage of Backblaze B2 Native APIs for remote copy functionality, which saves you from re-uploading files that are simply renamed and not otherwise changed.

By specifying the --track-renames flag, rclone will keep track of file size and hashes during operations. When source and destination files match, but the names are different, rclone will simply copy them over on the server side with the new name, saving you having to upload the object again. Use the --progress or --verbose flags to see these remote copy messages in the log.

rclone sync /Volumes/LocalAssets b2:cloud-backup-bucket \
–track-renames –progress –verbose

2020-10-22 17:03:26 INFO : customer artwork/145.jpg: Copied (server side copy)
2020-10-22 17:03:26 INFO : customer artwork//159.jpg: Copied (server side copy)
2020-10-22 17:03:26 INFO : customer artwork/163.jpg: Copied (server side copy)
2020-10-22 17:03:26 INFO : customer artwork/172.jpg: Copied (server side copy)
2020-10-22 17:03:26 INFO : customer artwork/151.jpg: Copied (server side copy)

With the --track-renames flag, you’ll see messages like these when the renamed files are simply copied over directly to the server instead of having to re-upload them.

Easily Generate Formatted Storage Migration Reports

When migrating data to Backblaze B2, it’s good practice to inventory the data about to be moved, then get reporting that confirms every byte made it over properly, afterwards.
For example, you could use the rclone lsf -R command to recursively list the contents of your source and destination storage buckets, compare the results, then save the reports in a simple comma-separated-values (CSV) list. This list is then easily parsable and processed by your reporting tool of choice.

rclone lsf –csv –format ps amzns3:/customer-archive-source
159.jpg,41034
163.jpg,29291
172.jpg,54658
173.jpg,47175
176.jpg,70937
177.jpg,42570
179.jpg,64588
180.jpg,71729
181.jpg,63601
184.jpg,56060
185.jpg,49899
186.jpg,60051
187.jpg,51743
189.jpg,60050

rclone lsf –csv –format ps b2:/customer-archive-destination
159.jpg,41034
163.jpg,29291
172.jpg,54658
173.jpg,47175
176.jpg,70937
177.jpg,42570
179.jpg,64588
180.jpg,71729
181.jpg,63601
184.jpg,56060
185.jpg,49899
186.jpg,60051
187.jpg,51743
189.jpg,60050

Example CSV output of file names and file hashes in source and target folders.

You can even feed the results of regular storage operations into a system dashboard or reporting tool by specifying JSON output with the --use-json-log flag.

In the following example, we want to build a report listing missing files in either the source or the destination location:

The resulting log messages make it clear that the comparison failed. The JSON format lets me easily select log warning levels, timestamps, and file names for further action.

{“level”:”error”,”msg”:”File not in parent bucket path customer_archive_destination”,”object”:”216.jpg”,”objectType”:”*b2.Object”,”source”:”operations
/check.go:100″,”time”:”2020-10-23T16:07:35.005055-05:00″}
{“level”:”error”,”msg”:”File not in parent bucket path customer_archive_destination”,”object”:”219.jpg”,”objectType”:”*b2.Object”,”source”:”operations
/check.go:100″,”time”:”2020-10-23T16:07:35.005151-05:00″}
{“level”:”error”,”msg”:”File not in parent bucket path travel_posters_source”,”object”:”.DS_Store”,”objectType”:”*b2.Object”,”source”:”operations
/check.go:78″,”time”:”2020-10-23T16:07:35.005192-05:00″}
{“level”:”warning”,”msg”:”12 files missing”,”object”:”parent bucket path customer_archive_destination”,”objectType”:”*b2.Fs”,”source”:”operations
/check.go:225″,”time”:”2020-10-23T16:07:35.005643-05:00″}
{“level”:”warning”,”msg”:”1 files missing”,”object”:”parent bucket path travel_posters_source”,”objectType”:”*b2.Fs”,”source”:”operations
/check.go:228″,”time”:”2020-10-23T16:07:35.005714-05:00″}
{“level”:”warning”,”msg”:”13 differences found”,”object”:”parent bucket path customer_archive_destination”,”objectType”:”*b2.Fs”,”source”:”operations
/check.go:231″,”time”:”2020-10-23T16:07:35.005746-05:00″}
{“level”:”warning”,”msg”:”13 errors while checking”,”object”:”parent bucket path customer_archive_destination”,”objectType”:”*b2.Fs”,”source”:”operations
/check.go:233″,”time”:”2020-10-23T16:07:35.005779-05:00″}
{“level”:”warning”,”msg”:”28 matching files”,”object”:”parent bucket path customer_archive_destination”,”objectType”:”*b2.Fs”,”source”:”operations
/check.go:239″,”time”:”2020-10-23T16:07:35.005805-05:00″}
2020/10/23 16:07:35 Failed to check with 14 errors: last error was: 13 differences found

Example: JSON output from rclone check command comparing two data locations.

Use a Static Exclude File to Ban File System Lint

While rclone has a host of flags you can specify on the fly to match or exclude files for a data copy or sync task, it’s hard to remember all the operating system or transient files that can clutter up your cloud storage. Who hasn’t had to laboriously delete macOS’s hidden folder view settings (.DS_Store), or Window’s ubiquitous thumbnails database from your pristine cloud storage?

By building your own customized exclude file of all the files you never want to copy, you can effortlessly exclude all such files in a single flag to consistently keep your storage buckets lint free.
In the following example, I saved a text file under my user directory’s rclone folder and call it with --exclude-from rather than using --exclude (as I would if filtering on the fly):

rclone sync /Volumes/LocalAssets b2:cloud-backup-bucket \
–exclude-from ~/.rclone/exclude.conf

.DS_Store
.thumbnails/**
.vagrant/**
.gitignore
.git/**
.Trashes/**
.apdisk
.com.apple.timemachine.*
.fseventsd/**
.DocumentRevisions-V100/**
.TemporaryItems/**
.Spotlight-V100/**
.localization/**
TheVolumeSettingsFolder/**
$RECYCLE.BIN/**
System Volume Information/**

Example of exclude.conf that lists all of the files you explicitly don’t want to ever sync or copy, including Apple storage system tags, Trash files, git files, and more.

Mount a Cloud Storage Bucket or Folder as a Local Disk

Rclone takes your cloud-fu to a truly new level with these last two moves.

Since Backblaze B2 is active storage (all contents are immediately available) and extremely cost-effective compared to other media archive solutions, it’s become a very popular archive destination for media.

If you mount extremely large archives as if they were massive, external disks on your server or workstation, you can make visual searching through object storage, as well as a whole host of other possibilities, a reality.

For example, suppose you are tasked with keeping a large network of digital signage kiosks up-to-date. Rather than trying to push from your source location to each and every kiosk, let the kiosks pull from your single, always up-to-date archive in Backblaze!

With FUSE installed on your system, rclone can mount your cloud storage to a mount point on your system or server’s OS. It will appear instantly, and your OS will start building thumbnails and let you preview the files normally.

rclone mount b2:art-assets/video ~/Documents/rclone_mnt/

Almost immediately after mounting this cloud storage bucket of HD and 4K video, macOS has built thumbnails, and even lets me preview these high-resolution video files.

Behind the scenes, rclone’s clever use of VFS and caching makes this magic happen. You can tweak settings to more aggressively cache the object structure for your use case.

Serve Content Directly From Cloud Storage With a Pop-up Web or SFTP Server

Many times, you’re called on to give users temporary access to certain cloud files quickly. Whether it’s for an approval, a file hand off, or whatever, this requires thinking about how to get the file to a place where the user can have access to it with tools they know how to use. Trying to email a 100GB file is no fun, and spending the time to download and move it to another system that the user can access can take up a lot of time.

Or perhaps you’d like to set up a simple, uncomplicated way to let users browse a large PDF library of product documents. Instead of moving files to a dedicated SFTP or web server, simply serve them directly from your cloud storage archive with rclone using a single command.

Rclone’s serve command can present your content stored with Backblaze via a range of protocols as easy for users to access as a web browser—including FTP, SFTP, WebDAV, HTTP, HTTPS, and more.

In the following example, I export the contents of the same folder of high-resolution video used above and present it using the WebDAV protocol. With zero HTML or complicated server setups, my users instantly get web access to this content, and even a searchable interface:

rclone serve b2:art_assets/video
2020/10/23 17:13:59 NOTICE: B2 bucket art_assets/video: WebDav Server started on http://127.0.0.1:8080/

Immediately after exporting my cloud storage folder via WebDAV, users can browse to my system and search for all “ProRes” files and download exactly what they need.

For more advanced needs, you can choose the HTTP or HTTPS option and specify custom data flags that populate web page templates automatically.

Continuing Your Study

Combined with our rclone webinar, these five moves will place you well on your path to rclone storage admin mastery, letting you confidently take on complicated data migration tasks with an ease and efficiency that will amaze your peers.

We look forward to hearing of the moves and new use cases you develop with these tools.

The post Rclone Power Moves for Backblaze B2 Cloud Storage appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Set Your Content Free With Fastly and Backblaze B2

2020-10-27 Elton Carneiro

Post Syndicated from Elton Carneiro original https://www.backblaze.com/blog/set-your-content-free-with-fastly-and-backblaze-b2/

Whether you need to deliver fast-changing application updates to users around the world, manage an asset-heavy website, or deliver a full-blown video streaming service—there are two critical parts of your solution you need to solve for: your origin store and your CDN.

You need an origin store that is a reliable place to store the content your app will use. And you need a content delivery network (CDN) to cache and deliver that content closer to every location your users happen to be so that your application delivers an optimized user experience.

These table stakes are simple, but platforms that try to serve both functions together generally end up layering on excessive complexity and fees to keep your content locked on their platform. When you can’t choose the right components for your solution, your content service can’t scale as fast as it needs to today and the premium you pay for unnecessary features inhibits your growth in the future.

That’s why we’re excited to announce our collaboration with Fastly in our campaign to bring choice, affordability, and simplicity to businesses with diverse content delivery needs.

Fastly: The Newest Edge Cloud Platform Partner for Backblaze B2 Cloud Storage

Our new collaboration with Fastly, a global edge cloud platform and CDN, offers an integrated solution that will let you store and serve rich media files seamlessly, free from the lock-in fees and functionality of closed “goliath” cloud storage platforms, and all with free egress from Backblaze B2 Cloud Storage to Fastly.

Fastly’s edge cloud platform enables users to create great digital experiences quickly, securely, and reliably by processing, serving, and securing customers’ applications as close to end-users as possible. Fastly’s edge cloud platform takes advantage of the modern internet, and is designed both for programmability and to support agile software development.

Get Ready to Go Global

The Fastly edge cloud platform is for any business that wants to serve data and content efficiently with the best user experience. Getting started only takes minutes: Fastly’s documentation will help you spin up your account and then help you explore how to use their features like image optimization, video and streaming acceleration, real-time logs, analytic services, and more.

If you’d like to learn more, join us for a webinar with Simon Wistow, Co-Founder & VP of Strategic Initiatives for Fastly, on November 19^th at 10 a.m. PST.

Backblaze Covers Migration Egress Fees

To pair this functionality with best in class storage and pricing, you simply need a Backblaze B2 Cloud Storage account to set as your origin store. If you’re already using Fastly but have a different origin store, you might be paying a lot of money for data egress. Maybe even enough that the concept of migrating to another store seems impossible.

Backblaze has the solution: Migrate 50TB (or more), store it with us for at least 12 months, and we’ll pay the data transfer fees.

Or, if you have data on-premise, we have a number of solutions for you. And if the content you want to move is less than 50TB, we still have a way to cut your egress charges from your old provider by over 50%. Contact our team for details.

Have data in an Amazon region in North America or Europe?
Migrate 50TB (or more), store it with us for at least 12 months,
and we’ll pay the data transfer fees.

Freedom to Build and Operate Your Ideal Solution

With Backblaze as your origin store and Fastly as your CDN and edge cloud platform, you can reduce your applications storage and network costs by up to 80% based on joint solution pricing vs. closed platform alternatives. Contact the Backblaze team if you have any questions.

The post Set Your Content Free With Fastly and Backblaze B2 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.