
5 Compelling Reasons You Should Go IPO

Post Syndicated from original https://www.backblaze.com/blog/5-compelling-reasons-you-should-go-ipo/

We took Backblaze public one year ago tomorrow. Our IPO was a great day and the realization of 14 years of hard work by our team. Since then, we’ve executed on our plans, hit our targets, and continued to grow our team and our revenue. And yet, the markets have been tough sledding. For newly-public tech companies like us, as well as many of our peers, stock values have decreased by ~70% from their peak values last year. It’s hard for shareholders, employees, and the market.

Obviously, I wish the last 10 months had gone differently in the markets. Who doesn’t? But when people ask me, “Do you still think the IPO was a good idea?” (which happens a lot), there’s no question in my mind that it was one of the best business decisions we’ve made at Backblaze.

In fact, the more that I think about our experience of taking the company public, the more I believe that the IPO should be part of every entrepreneur and business leader’s consideration set. A perception has developed that there are magical financial benchmarks that forbid some companies from listing, but we went public at a point in the evolution of our business when a lot of experts told us we couldn’t. We may have faced some headwinds others didn’t, but I’m convinced that the IPO isn’t just for folks with over $300m in revenue who’ve raised hundreds of millions of dollars in venture capital.

So, in keeping with our commitment to transparency about our business and some of the interesting, tough, and exciting stuff we’ve been through—long-time readers will remember my blog about almost getting acquired—I’ve decided to write about our IPO journey: What sucked, what didn’t, what shocked us, and what we learned. Along the way, I’ll share everything I can—metrics, worksheets, planning decks, and more. Not because I think we deserve a pat on the back or to celebrate what we did, but for two bigger purposes:

  1. I can remember what it feels like to be an early stage entrepreneur thinking that the only path to making the company you built successful was to seek out restrictive venture funding or an acquisition. I want to show folks, whether you’re considering starting a business or have already built one with tens of millions in revenue, that there is another path to consider. While doing an IPO isn’t right for everyone, I think considering an IPO, and positioning your business to go that way if the opportunity arises, is sound strategy.
  2. I believe that democratizing the IPO process will be healthier for businesses, markets, and investors. And I’m not the first: Bill Hambrecht is well known for his efforts to open IPOs to broader audiences as he did with companies like Google and Overstock.com. Tech is all about disrupting unnecessary complexity, and going public is more complex than an AWS invoice. In the mid-nineties, there were more than 8,000 publicly traded companies. By this September there were nearly 2,000 fewer companies listed, even after the boom we saw in 2020 and 2021. I don’t think that’s a good thing.

This blog series will be for everyone from those of you dreaming up your first idea, to startups still in stealth mode, to the thousands of companies with revenue in the tens of millions.

And if there’s anything I talk about here that’s confusing or that you want to hear more about, please ask in the comments. I’ll try to cover it in a future post.

Why Listen to Us?

Hot takes on building startups and raising funding are a dime a dozen—so if you’re skeptical, I get it. What we’ll share here is partially based on the experience we had building two prior technology companies, raising multiple rounds of venture capital, and successfully selling them through acquisition. However, more uniquely: We founded and essentially bootstrapped Backblaze all the way up to our IPO (before 2021 we had only taken $3M in outside funding). Even CNBC noted that we took a unique path to market, and yet with $65 million in recurring revenue in 2020, we made a successful public offering and raised over $100M in funding to continue growing our business. We’ve made this journey ourselves, we did it recently, and—in the spirit of transparency—we’re going to share the stories behind it.

Why an IPO Should Be in Your Business Consideration Set

Why should IPO readiness (the process of setting up your business to go public) and actually going public be in your playbook? I’m going to explore this concept deeply over the course of this series, but I’ll pause here to tell you the five most compelling reasons to be IPO ready, along with a few proof points from our own experience.

  • Build to Last: Starting and growing a company is hard. If you’re doing it, it’s probably because you’re passionate about solving some problems in the world. To be successful, you had to care about your vision, your product, your customers, and your team. If your company ends up acquired, the unique entity you created will vaporize. Taking your company public provides a path to building and running the company for the long-term, possibly outliving you.
  • Funding With the Right Strings Attached: Raising funding in an IPO requires selling a portion of your company, just as in any venture funding. The difference, however, is that in an IPO the equity you sell is common shares—everyone gets the same shares on the same terms. In private fund raises, the company sells “preferred shares” to investors which typically come with a variety of special rights giving investors the ability to have extra control over the company, get extra equity in the company, prevent the company from raising money from other investors, and more. Raising funding in an IPO is the ultimate “clean” fundraise.
  • Building a Real Business: If you’re building with an aim to be acquired, it’s nearly impossible not to establish a culture where everyone is focused on “dumping” the business. Aiming for an IPO drives the mindset to build for sustainability. You’re more likely to create a business that can achieve profitability, scale, and growth, and deliver value over the long haul. Also, going through the actual process of IPO readiness, along with feeding your financials through a meat grinder of ROI modeling and outcome-driven planning (both during and after the IPO), means you will position your business for even greater resilience going forward.
  • Credibility: When the five Backblaze founders talked about IPOs back in the day in a tiny apartment in Palo Alto, it felt like we were trying on our dad’s pants. Sure, we knew some companies went public—but it didn’t feel like something that was really accessible (even for a room of people that scaled and sold multiple companies). But we’re not the only people who feel this way: “Public” signals a level of accomplishment and evolution that’s hard to achieve as a private company. Being able to achieve an IPO proves a business’s capacity to operate and excel under intense pressure and scrutiny. And if anyone is uncertain about how we’re doing, they can just go grab the last 10-K to see our results.
  • Liquidity: This one is simple. If you’re not public, you can’t sell your stock on the open market. Once the company is public, you and your employees (and existing shareholders) can sell their shares if they so choose. It also provides the freedom and flexibility for each individual to make that decision on their own. Rather than having to sell the company (wherein usually everyone is forced to sell all their shares), this allows one person to decide to stay “all-in” and keep all their shares, another one to sell theirs, and a third to sell just a few shares.

Photo: The team in Times Square.

What’s Next?

If you’re intrigued, this is really only the tip of the iceberg. In future posts, I will dig into everything from the nitty-gritty tactics, like how to build a board, how to build a banking syndicate (twice), and how to write an S-1, to the bigger stories, like how years of planning can hinge on a few hours of work, or why “testing the waters” might be better named “getting thrown to the sharks”.

Rest assured: If you think you’re not interested in going public, everything I share will have as much to do with how you build a better business that you can grow over time as it will with the guts of the IPO process. I hope it’s useful, and if there’s anything you hope I’ll address or anything specific that you’d like to learn more about, let me know in the comments.

The post 5 Compelling Reasons You Should Go IPO appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze Launches Comprehensive Partner Program

Post Syndicated from Elton Carneiro original https://www.backblaze.com/blog/backblaze-launches-comprehensive-partner-program/

Support from our partners is part of what makes Backblaze so easy to use for so many folks, and today we’re continuing our efforts to make working with us even easier with the launch of our new Partner Program.

For businesses, it cuts through the complexity and cost that may have stopped them from adopting cloud storage and backup. For partners—including resellers, integrators, managed service providers, and more—it boosts their array of cloud solutions and brings even more value to their clients. The program builds on our long commitment to develop new solutions for partners and help them grow their businesses.

Partner Program Offerings

The program provides new opportunities for four key partner groups: Channel Partners, Technology Partners, Managed Service Providers (MSPs), and Affiliates.

As part of this new program, Channel Partners can take advantage of special capacity-based pricing with B2 Reserve, as well as a self-service resource providing discounts, deal registration, and in-house support. It also offers training and education resources.

Technology Partners can enjoy complimentary solution expertise and joint go-to-market and co-branding opportunities. MSPs will notice the ease of the new admin console and the utility of in-house support, digital assets, training materials, and data sheets, not to mention the recurring 10% commissions on computer backup. And of course, Affiliates, too, can enjoy recurring 10% commissions.

With the launch of the program, Backblaze is doubling down on its commitment to its partners, proving why Backblaze has built its reputation on easy-to-use, affordable cloud storage.

“Ease of use and accessibility can have a significant impact for our partners and their business. We are continuously looking for ways to innovate and develop for our partners. Offering this easy, accessible, and efficient resource will strengthen our relationship with our customers.”
—Nilay Patel, Vice President of Sales, Backblaze

Visit our Partner page to learn more about the Partner Program, visit the Partner Portal, or get started as a new partner.


Querying a Decade of Drive Stats Data

Post Syndicated from Pat Patterson original https://www.backblaze.com/blog/querying-a-decade-of-drive-stats-data/

Last week, we published Backblaze Drive Stats for Q3 2022, sharing the metrics we’ve gathered on our fleet of over 230,000 hard drives. In this blog post, I’ll explain how we’re now using the Trino open source SQL query engine to help ensure the integrity of Drive Stats data, and how we plan to use Trino in the future to generate the Drive Stats result set for publication.

Converting Zipped CSV Files into Parquet

In his blog post Storing and Querying Analytical Data in Backblaze B2, my colleague Greg Hamer explained how we started using Trino to analyze Drive Stats data earlier this year. We quickly discovered that formatting the data set as Apache Parquet minimized the amount of data that Trino needed to download from Backblaze B2 Cloud Storage to process queries, resulting in a dramatic improvement in query performance over the original CSV-formatted data.

As Greg mentioned in the earlier post, Drive Stats data is published quarterly to Backblaze B2 as a single .zip file containing a CSV file for each day of the quarter. Each CSV file contains a record for each drive that was operational on that day (see this list of the fields in each record).

When Greg and I started working with the Parquet-formatted Drive Stats data, we took a simple, but somewhat inefficient, approach to converting the data from zipped CSV to Parquet:

  • Download the existing zip files to local storage.
  • Unzip them.
  • Run a Python script to read the CSV files and write Parquet-formatted data back to local storage.
  • Upload the Parquet files to Backblaze B2.

We were keen to automate this process, so we reworked the script to use the Python ZipFile module to read the zipped CSV data directly from its Backblaze B2 Bucket and write Parquet back to another bucket. We’ve shared the script in this GitHub gist.
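The reading half of that approach can be sketched with the Python standard library alone. This is a minimal illustration, not the script from the gist: it assumes the zip has already been fetched into memory as bytes, that each CSV has a header row with a date column in YYYY-MM-DD form, and it leaves out the Parquet-writing step, which would need a third-party library. The function names and sample rows are made up for the example.

```python
import csv
import io
import zipfile


def iter_drive_records(zip_bytes):
    """Yield one dict per CSV row from every .csv member of a zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if not name.lower().endswith(".csv"):
                continue  # skip directories and any non-CSV members
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                for row in csv.DictReader(text):
                    # Split the date into the year/month/day columns
                    # that the queries in this post filter on.
                    year, month, day = (int(p) for p in row["date"].split("-"))
                    row.update(year=year, month=month, day=day)
                    yield row


def build_sample_zip():
    """Build a tiny in-memory archive with hypothetical rows, one CSV per day."""
    header = "date,serial_number,model,capacity_bytes,failure\n"
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "2022-09-29.csv",
            header + "2022-09-29,SERIAL-A,MODEL-X,4000787030016,0\n",
        )
        archive.writestr(
            "2022-09-30.csv",
            header
            + "2022-09-30,SERIAL-A,MODEL-X,4000787030016,0\n"
            + "2022-09-30,SERIAL-B,MODEL-X,4000787030016,1\n",
        )
    return buffer.getvalue()


records = list(iter_drive_records(build_sample_zip()))
print(len(records), records[-1]["serial_number"], records[-1]["day"])
```

Because the archive is read from a bytes buffer, nothing is unzipped to local disk, which is the point of reworking the original four-step process.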

After running the script, the drivestats table now contains data up until the end of Q3 2022:

trino:ds> SELECT DISTINCT year, month, day 
FROM drivestats ORDER BY year DESC, month DESC, day DESC LIMIT 1;
year | month | day 
------+-------+-----
 2022 |     9 |  30 
(1 row)

In the last article, we were working with data running until the end of Q1 2022. On March 31, 2022, the Drive Stats dataset comprised 296 million records, and there were 211,732 drives in operation. Let’s see what the current situation is:

trino:ds> SELECT COUNT(*) FROM drivestats;
   _col0 
-----------
 346006813 
(1 row) 

trino:ds> SELECT COUNT(*) FROM drivestats 
    WHERE year = 2022 AND month = 9 AND day = 30;
   _col0 
--------
 230897 
(1 row)

So, since the end of March, we’ve added 50 million rows to the dataset, and Backblaze is now spinning nearly 231,000 drives—over 19,000 more than at the end of March 2022. Put another way, we’ve added more than 100 drives per day to the Backblaze Cloud Storage Platform in the past six months. Finally, how many exabytes of raw data storage does Backblaze now manage?

trino:ds> SELECT ROUND(SUM(CAST(capacity_bytes AS bigint))/1e+18, 2)
FROM drivestats WHERE year = 2022 AND month = 9 AND day = 30;
 _col0 
-------
  2.62 
(1 row)

Will we cross the three exabyte mark this year? Stay tuned to find out.
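The growth figures quoted above can be double-checked with a few lines of Python. This is only a sanity check on the arithmetic: the counts are copied from the text and query output, and the Q1 record count is the rounded "296 million" rather than an exact value.

```python
from datetime import date

# Figures quoted in the text; the Q1 record count is a rounded value.
RECORDS_Q1, RECORDS_Q3 = 296_000_000, 346_006_813
DRIVES_Q1, DRIVES_Q3 = 211_732, 230_897


def growth_summary():
    """Summarize dataset and fleet growth between the two quarter ends."""
    days = (date(2022, 9, 30) - date(2022, 3, 31)).days
    added_drives = DRIVES_Q3 - DRIVES_Q1
    return {
        "days": days,
        "added_records": RECORDS_Q3 - RECORDS_Q1,
        "added_drives": added_drives,
        "drives_per_day": added_drives / days,
    }


summary = growth_summary()
print(summary["added_drives"], round(summary["drives_per_day"], 1))
```

The result is consistent with the claims of roughly 50 million new rows, over 19,000 additional drives, and more than 100 drives added per day.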

Ensuring the Integrity of Drive Stats Data

As Andy Klein, the Drive Stats supremo, collates each quarter’s data, he looks for instances of healthy drives being removed and then returned to service. This can happen for a variety of operational reasons, but it shows up in the data as the drive having failed, then later revived. This subset of data shows the phenomenon:

trino:ds> SELECT year, month, day, failure FROM drivestats
WHERE serial_number = 'ZHZ4VLNV' AND year >= 2021
ORDER BY year, month, day;
 year | month | day | failure 
------+-------+-----+---------
...
 2021 |    12 |  26 |       0 
 2021 |    12 |  27 |       0 
 2021 |    12 |  28 |       0 
 2021 |    12 |  29 |       1 
 2022 |     1 |   3 |       0 
 2022 |     1 |   4 |       0 
 2022 |     1 |   5 |       0 
...

This drive appears to have failed on Dec 29, 2021, but was returned to service on Jan 3, 2022.

Since these spurious “failures” would skew the reliability statistics, Andy searches for and removes them from each quarter’s data. However, even Andy can’t see into the future, so, when a drive is taken offline at the end of one quarter and then returned to service in the next quarter, as in the above case, there is a bit of a manual process to find anomalies and clean up past data.

With the entire dataset in a single location, we can now write a SQL query to find drives that were removed, then returned to service, no matter when it occurred. Let’s build that query up in stages.

We start by finding the serial numbers and failure dates for each drive failure:

trino:ds> SELECT serial_number,
    DATE(FORMAT('%04d-%02d-%02d', year, month, day)) AS date
FROM drivestats
WHERE failure = 1;
  serial_number  |    date    
-----------------+------------
 ZHZ3KMX4        | 2021-04-01 
 ZA12RBBM        | 2021-04-01 
 S300Z52X        | 2017-03-01 
 Z3051FWK        | 2017-03-01 
 Z304JQAE        | 2017-03-02 
...
(17092 rows)

Now we find the most recent record for each drive:

trino:ds> SELECT serial_number,
    MAX(DATE(FORMAT('%04d-%02d-%02d', year, month, day))) AS date
FROM drivestats
GROUP BY serial_number;
  serial_number   |    date    
------------------+------------
 ZHZ65F2W         | 2022-09-30 
 ZLW0GQ82         | 2022-09-30 
 ZLW0GQ86         | 2022-09-30 
 Z8A0A057F97G     | 2022-09-30 
 ZHZ62XAR         | 2022-09-30 
...
(329908 rows)

We then join the two result sets to find spurious failures; that is, failures where the drive was later returned to service. Note the join condition—we select records whose serial numbers match and where the most recent record is later than the failure:

trino:ds> SELECT f.serial_number, f.failure_date
FROM (
    SELECT serial_number,
        DATE(FORMAT('%04d-%02d-%02d', year, month, day)) AS failure_date
    FROM drivestats
    WHERE failure = 1
) AS f
INNER JOIN (
    SELECT serial_number,
        MAX(DATE(FORMAT('%04d-%02d-%02d', year, month, day))) AS last_date
    FROM drivestats
    GROUP BY serial_number
) AS l
ON f.serial_number = l.serial_number AND l.last_date > f.failure_date;
  serial_number  | failure_date 
-----------------+--------------
 2003261ED34D    | 2022-06-09 
 W300STQ5        | 2022-06-11 
 ZHZ61JMQ        | 2022-06-17 
 ZHZ4VL2P        | 2022-06-21 
 WD-WX31A2464044 | 2015-06-23 
...
(864 rows)

As you can see, the current schema makes date comparisons a little awkward, pointing the way to optimizing the schema by adding a DATE-typed column alongside the existing year, month, and day columns. This kind of denormalization is common in analytical data.
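The logic of that join can also be illustrated outside SQL. The sketch below is not how the cleanup is actually performed; it simply applies the same rule (a failure is spurious if the drive has a later record) to a handful of in-memory rows. The ZHZ4VLNV dates come from the sample output earlier in this post, while HYPOTHETICAL-1 and the function name are invented for the example.

```python
from datetime import date


def find_spurious_failures(records):
    """Return (serial, failure_date) pairs where the drive was seen again later.

    records is an iterable of (serial_number, date, failure_flag) tuples.
    """
    last_seen = {}
    failures = []
    for serial, day, failed in records:
        # Track the most recent record per drive (the MAX(date) subquery).
        if serial not in last_seen or day > last_seen[serial]:
            last_seen[serial] = day
        # Collect every failure record (the failure = 1 subquery).
        if failed:
            failures.append((serial, day))
    # Keep failures that precede the drive's last record (the join condition).
    return sorted((s, d) for s, d in failures if last_seen[s] > d)


sample = [
    ("ZHZ4VLNV", date(2021, 12, 28), 0),
    ("ZHZ4VLNV", date(2021, 12, 29), 1),
    ("ZHZ4VLNV", date(2022, 1, 3), 0),
    ("HYPOTHETICAL-1", date(2022, 6, 1), 0),
    ("HYPOTHETICAL-1", date(2022, 6, 2), 1),  # never seen again: a real failure
]
spurious = find_spurious_failures(sample)
print(spurious)
```

Only the drive that reappears after its failure record is reported; the drive whose last record is the failure itself is left alone.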

Calculating the Quarterly Failure Rates

In calculating failure rates per drive model for each quarter, Andy loads the quarter’s data into MySQL and defines a set of views. We additionally define the current_quarter view to restrict the failure rate calculation to data in July, August, and September 2022:

CREATE VIEW current_quarter AS 
    SELECT * FROM drivestats
    WHERE year = 2022 AND month in (7, 8, 9);

CREATE VIEW drive_days AS 
    SELECT model, COUNT(*) AS drive_days 
    FROM current_quarter
    GROUP BY model;

CREATE VIEW failures AS
    SELECT model, COUNT(*) AS failures
    FROM current_quarter
    WHERE failure = 1
    GROUP BY model
UNION
    SELECT DISTINCT(model), 0 AS failures
    FROM current_quarter
    WHERE model NOT IN
    (
        SELECT model
        FROM current_quarter
        WHERE failure = 1
        GROUP BY model
    );

CREATE VIEW failure_rates AS
    SELECT drive_days.model AS model,
           drive_days.drive_days AS drive_days,
           failures.failures AS failures, 
           100.0 * (1.0 * failures) / (drive_days / 365.0) AS annual_failure_rate
    FROM drive_days, failures
    WHERE drive_days.model = failures.model;

Running the above statements in Trino, then querying the failure_rates view, yields a superset of the data that we published in the Q3 2022 Drive Stats report. The difference is that this result set includes drives that Andy excludes from the Drive Stats report: SSD boot drives, drives that were used for testing purposes, and drive models which did not have at least 60 drives in service:

trino:ds> SELECT * FROM failure_rates ORDER BY model;
        model         | drive_days | failures | annual_failure_rate 
----------------------+------------+----------+---------------------
 CT250MX500SSD1       |      32171 |        2 |                2.27 
 DELLBOSS VD          |      33706 |        0 |                0.00 
 HGST HDS5C4040ALE630 |       2389 |        0 |                0.00 
 HGST HDS724040ALE640 |         92 |        0 |                0.00 
 HGST HMS5C4040ALE640 |     341509 |        3 |                0.32 
 ...
 WDC WD60EFRX         |        276 |        0 |                0.00 
 WDC WDS250G2B0A      |       3867 |        0 |                0.00 
 WDC WUH721414ALE6L4  |     765990 |        5 |                0.24 
 WDC WUH721816ALE6L0  |     242954 |        0 |                0.00 
 WDC WUH721816ALE6L4  |     308630 |        6 |                0.71 
(74 rows)

Query 20221102_010612_00022_qscbi, FINISHED, 1 node
Splits: 139 total, 139 done (100.00%)
8.63 [82.4M rows, 5.29MB] [9.54M rows/s, 628KB/s]

Optimizing the Drive Stats Production Process

Now that we have shown that we can derive the required statistics by querying the Parquet-formatted data with Trino, we can streamline the Drive Stats process. Starting with the Q4 2022 report, rather than wrangling each quarter’s data with a mixture of tools on his laptop, Andy will use Trino to both clean up the raw data and produce the Drive Stats result set for publication.

Accessing the Drive Stats Parquet Dataset

When Greg and I started experimenting with Trino, our starting point was Brian Olsen’s Trino Getting Started GitHub repository, in particular, the Hive connector over MinIO file storage tutorial. Since MinIO and Backblaze B2 both have S3-compatible APIs, it was easy to adapt the tutorial’s configuration to target the Drive Stats data in Backblaze B2, and Brian was kind enough to accept my contribution of a new tutorial showing how to use the Hive connector over Backblaze B2 Cloud Storage. This tutorial will get you started using Trino with data stored in Backblaze B2 Buckets, and includes a section on accessing the Drive Stats dataset.

You might be interested to know that Backblaze is sponsoring this year’s Trino Summit, taking place virtually and in person in San Francisco, on November 10. Registration is free; if you do attend, come say hi to Greg and me at the Backblaze booth and see Trino in action, querying data stored in Backblaze B2.


How to Download and Back Up Your Twitter Account

Post Syndicated from Barry Kaufman original https://www.backblaze.com/blog/how-to-download-and-back-up-your-twitter-account/

If you’ve been following the news lately, you might be thinking now is a good time to start downloading and backing up your Twitter history.

It’s officially the Elon Age of Twitter, and subsequently, there have been a few people rumbling about leaving the platform following Musk’s firing of top executives and an alarming rise in hate speech. Needless to say, we’re sticking around—you might have stumbled upon this article from Twitter itself. We just can’t quit the little blue bird quite yet. But there is one thing we can do, and that’s help you download and back up your Twitter archive—most likely for free.

Whether you’re anti-Elon or you’re just worried that the folks who are good at building electric cars or spaceships might not know how to manage a social media algorithm, you can take a few easy steps to protect your treasured Twitter memories. Here’s how.

Downloading Your Twitter Data

The first step is to log in to your Twitter account on a web browser. Once logged in, click on the “More” section in the navigation bar. From there, a new navigation bar will appear. You should select the “Settings and Support” dropdown, followed by the “Settings and Privacy” tab to progress.

Under the “Your Account” section, you will find an area labeled “Download an archive of your data.” The function of this is pretty self-explanatory, but it leads to a further menu that allows you to request an archive of your Twitter data or Periscope data.

After requesting your archive, you will receive a notification with a link when it is ready for download. This archive will consist of a ZIP file with data that Twitter has deemed most relevant or useful to you, including DMs, moments, profile media, and any media you may have used in your Tweets, such as GIFs, photos, and videos.

Archive Your Twitter Data for Free

Once you download your Twitter data, you can then save a full archive copy in the cloud on Backblaze B2—for free if it’s under 10GB.

Click here to get started with Backblaze B2 Cloud Storage today.

Back Up Your Twitter Data (Not Free, But Super Easy)

In addition to an archive copy, it’s important to use a secure backup strategy so all of those Tweets and memories will be preserved and kept safe from accidental deletion, equipment failure, or disasters (whether they’re natural or Musk-made). This is where a 3-2-1 backup strategy comes in handy. Using a 3-2-1 approach means keeping three copies of your data: one locally, one on a different type of media like an external hard drive, and one off-site (the cloud is a great place to keep it!).

You’ll need to manually download your Twitter data periodically, but once you have it on your machine, you can ensure it’s backed up with Backblaze Computer Backup—it automatically backs up all of your files, including documents, photos, music, movies, and, yes, all of that Twitter data you downloaded.

Click here to sign up for a 15-day trial of Backblaze Computer Backup, and save those Tweets.

While You’re At It…

We’ve gathered a handful of guides to help you protect social content across many different platforms. We’re working on developing this list—please comment below if you’d like to see another platform covered.


Stocks & Storage: Demystifying Finance Jargon, One Acronym at a Time

Post Syndicated from James Kisner original https://www.backblaze.com/blog/stocks-storage-demystifying-finance-jargon-one-acronym-at-a-time/

Now that Backblaze [BLZE] is a public company, our team has been wrestling with lots of new Wall Street jargon: IPO, EBITDA, EPS—yes, these all mean things. And if you’ve ever thought to yourself, “I’d like to learn more about publicly traded companies,” or “I wonder what these tech earnings reports really mean,” only to be hit by a wall of acronyms like these when you try to do a little research: You’re not alone.

We figured, if these are going to be part of our lives now, why not share our growing understanding with you?

Enter: Stocks & Storage

This new video blog is for average folks interested in investing in tech stocks. In this series, Yev Pusin, resident storage expert, and James Kisner, certified finance geek, will demystify the lingo that Wall Street and cloud storage businesses use to report on how and what they’re doing.

  • Their main goal: Simplify what should be simple.
  • Their secondary goal: Don’t be boring!

If you want to stay up to date on the latest videos, subscribe to our YouTube here, or scroll to the bottom of this page and enter your email address to get blog updates straight to your inbox.


Backblaze Drive Stats for Q3 2022

Post Syndicated from original https://www.backblaze.com/blog/backblaze-drive-stats-for-q3-2022/

As of the end of Q3 2022, Backblaze was monitoring 230,897 hard drives and SSDs in our data centers around the world. Of that number, 4,200 are boot drives, with 2,778 SSDs and 1,422 HDDs. The SSDs were previously covered in our recently published Midyear SSD Report. Today, we’ll focus on the 226,697 data drives under management as we review their quarterly and lifetime failure rates as of the end of Q3 2022.

We’ll also take a look at the relationship between hard drive failure rates and hard drive cost. Along the way, we’ll share our observations and insights on the data presented, and, as always, we look forward to you doing the same in the comments section at the end of the post.

Q3 2022 Hard Drive Failure Rates

Let’s start by reviewing our data for the Q3 2022 period. In that quarter, we tracked 226,697 hard drives used to store data. For our evaluation, we removed 388 drives from consideration because they were either used for testing purposes or belonged to drive models which did not have at least 60 drives. This leaves us with 226,309 hard drives grouped into 29 different models to analyze.

Notes and Observations on the Q3 2022 Stats

Zero failures for Q3: Three drive models had zero failures this quarter: the 8TB HGST (model: HUH728080ALE604), the 8TB Seagate (model: ST8000NM000A), and the 16TB WDC (model: WUH721816ALE6L0). For the 8TB HGST, that was the second quarter in a row with zero failures. Of the three, only the WDC model has enough lifetime data (drive days) for us to be comfortable with the calculated annualized failure rate (AFR). As we will see later in this review, this 16TB WDC model has a lifetime AFR of 0.11% with a confidence interval range of just 0.30 at a 95% confidence level.

The new disks in town: There are two new models in this quarter’s data: the 8TB Seagate (model: ST8000NM000A) and the 16TB Seagate (model: ST16000NM002J). Neither has enough data to be interesting yet, but as noted above, the 8TB Seagate had zero failures in its first quarter in operation. These additions give us 29 different models we are tracking, up from 27 in the previous quarter.

The 29 models break down by manufacturer as:

  • HGST: 7 models
  • Seagate: 13 models
  • Toshiba: 6 models
  • WDC: 3 models

The chart below shows, by manufacturer, how our drive fleet has changed over the past six years.

The old guard is feeling old: All three of the oldest drives we currently use are showing signs of their age as each experienced an increase in AFR from Q2 to Q3 2022 as shown below.

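 MFG     | Model       | Size | Q3 2022 Avg Age (months) | Q2 AFR | Q3 AFR 
---------+-------------+------+--------------------------+--------+--------
 Seagate | ST4000DM000 | 4TB  |                     83.1 |  3.42% |  4.38% 
 Seagate | ST6000DX000 | 6TB  |                     89.6 |  0.91% |  1.34% 
 TOSHIBA | MD04ABA400V | 4TB  |                     88.3 |  0.00% |  8.25% 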

Note that the 4TB Toshiba only had two failures in Q3 2022. The high AFR (8.25%) is due to the limited number of drive days in the quarter (8,849) from only 95 drives. For all three, it seems their spindles, actuators, and media are starting to wear out after seven years or so of constant spinning.
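To see why so few failures produce such a high rate, here is a small worked example. It assumes the usual definition of AFR as failures per drive-year of operation, expressed as a percentage, which matches the formula in the failure_rates SQL view shown earlier in this archive; the function name is illustrative.

```python
def annualized_failure_rate(failures, drive_days):
    """AFR as a percentage: failures per drive-year of operation, times 100."""
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    return 100.0 * failures / (drive_days / 365.0)


# The 4TB Toshiba in Q3 2022: two failures over 8,849 drive days.
toshiba_afr = annualized_failure_rate(failures=2, drive_days=8_849)
print(f"{toshiba_afr:.2f}%")  # 8.25%

# A hypothetical third failure in the same quarter, to show the sensitivity.
print(f"{annualized_failure_rate(3, 8_849):.2f}%")  # 12.37%
```

With only about 24 drive-years of exposure in the quarter, each additional failure moves the AFR by roughly four percentage points, which is why small populations are treated with caution.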

The Quarterly AFR continues to rise: The AFR for Q3 2022 was 1.64%, increasing from 1.46% in Q2 2022 and from 1.10% a year ago. As noted previously, this is related to the aging of the entire drive fleet, and we would expect this number to go down as older drives are retired and replaced over the next year. A possible harbinger of what is to come can be seen in the 16TB models, which as a group had a 0.80% AFR in Q3 2022. As these drives are used to replace the aging 4TB drives, the quarterly AFR should decrease.

Hard Drive Failure Versus Hard Drive Cost

One question that comes up is why we would continue to buy a drive model that has a higher annualized failure rate versus a comparably sized, but more expensive, model. There are two primary reasons. First, we are able to do so because our cloud storage Backblaze Vault architecture is designed for drive failure. Second, by studying data like Drive Stats, we work hard to understand our environment from the inside out, and the relationship between cost and drive failure is one of the things we have learned. Here’s a simple example using three fictitious models of 14TB drives: Model 1, Model 2, and Model 3.

Let’s take a look at the different sections (i.e. blue rows) of this table.

Drive Cost: Each model has a different price: low ($225), medium ($250), and high ($275). Assuming we buy the same number of drives (5,000) of each model, multiplying price by quantity gives the purchase cost of each model.

Annual Drive Failures: This is the AFR of each drive model. For this example, we assigned the lowest price model to the highest failure rate, the highest price model to the lowest failure rate, and so on. In practice, we would use our own AFR numbers for a given model that we are considering purchasing. Regardless, we get the annual number of failed drives for each model.

Annual Replacement Cost: Labor cost covers the human cost involved from identifying the failure to returning and replacing the drive. Drive cost is zero here as the assumption is that all drives are returned for credit or replacement to the manufacturer or their agent. A zero value here may not always be the case; hence the line item. In either case, the annual cost to replace the failed drives for each model is computed.

Lifetime Replacement Cost: Take the number of years you expect the drive model to be in service times the annual cost to replace the failed drives. All of this gets us the total cost of each drive model—the peach section. In our example, the most expensive model (Model 3) is the most expensive drive over the five-year life expectancy and the lowest cost drive model (Model 1) is the least expensive over the same period, even with a higher annualized failure rate.

But we’re not done. The next question is: What would the annualized failure rate for the least expensive choice, Model 1, need to be such that the total cost after five years would be the same as Model 2 and then Model 3? In other words, how much failure can we tolerate before our original purchase decision is wrong? When we crunch the numbers we come out with the following:

  • Model 1 and Model 2 have the same total drive cost ($1,325,000) when the annualized failure rate for Model 1 is 2.67%.
  • Model 1 and Model 3 have the same total drive cost ($1,412,500) when the annualized failure rate for Model 1 is 3.83%.
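The break-even figures above can be reproduced with a short sketch of the cost model. The prices, drive count, five-year service life, and zero returned-drive cost come from the text. The $300 labor cost per replacement and the AFRs for Model 2 (1.0%) and Model 3 (0.5%) are our assumptions, back-solved so that the totals match the figures quoted above; they are not taken from the original table.

```python
DRIVE_COUNT = 5_000            # drives purchased per model (from the text)
SERVICE_YEARS = 5              # expected years in service (from the text)
LABOR_PER_REPLACEMENT = 300.0  # ASSUMPTION: back-solved, not from the table
RETURNED_DRIVE_COST = 0.0      # failed drives are returned for credit


def total_cost(unit_price: float, afr: float) -> float:
    """Purchase cost plus lifetime replacement cost; afr is a fraction."""
    purchase = unit_price * DRIVE_COUNT
    annual_failures = afr * DRIVE_COUNT
    annual_replacement = annual_failures * (
        LABOR_PER_REPLACEMENT + RETURNED_DRIVE_COST
    )
    return purchase + annual_replacement * SERVICE_YEARS


def break_even_afr(unit_price: float, target_total: float) -> float:
    """AFR (as a fraction) at which a model's total cost equals target_total."""
    purchase = unit_price * DRIVE_COUNT
    per_failure = LABOR_PER_REPLACEMENT + RETURNED_DRIVE_COST
    return (target_total - purchase) / (SERVICE_YEARS * DRIVE_COUNT * per_failure)


# ASSUMPTION: AFRs of 1.0% and 0.5% for Models 2 and 3.
model_2_total = total_cost(unit_price=250, afr=0.010)
model_3_total = total_cost(unit_price=275, afr=0.005)

print(f"Model 2 total: ${model_2_total:,.0f}")
print(f"Model 3 total: ${model_3_total:,.0f}")
print(f"Model 1 break-even vs. Model 2: {break_even_afr(225, model_2_total):.2%}")
print(f"Model 1 break-even vs. Model 3: {break_even_afr(225, model_3_total):.2%}")
```

Changing the labor cost or the service life shifts the break-even rates, so substitute your own figures before drawing any conclusions.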

The model presented is a simplified version of how we think about drive purchase decisions using annualized drive failure rates as part of the equation. You can make this model more accurate, and complicated, by adding in the drive failure rate changes over time (the bathtub curve) and prorating the cost of returning failed drives over the years. Whether that is needed is up to you.

A model like this matters in our business, and it matters to anyone interested in optimizing the efficiency of a cloud storage platform. Otherwise, robotically buying the most expensive, or least expensive, drives means turning a blind eye to the expense side of the ledger.

On an individual or small office/home office level, your drive purchasing decision requires a lot less math, and often comes down to what drive you can afford. Even so, you should still try to do some research. Our drive stats can help, but in all cases you should have a solid backup plan in place as no drive you can buy is failure proof.

Lifetime Hard Drive Failure Rates

As of September 30, 2022, Backblaze was monitoring 226,697 hard drives used to store data. For our evaluation, we removed 388 drives from consideration as they were used for testing purposes or drive models which did not have at least 60 drives. This leaves us with 226,309 hard drives grouped into 29 different models to analyze for the lifetime report.

Notes and Observations About the Lifetime Stats

The lifetime annualized failure rate for all the drives listed above is 1.41%. That is a slight increase from the previous quarter of 1.39%, but lower than one year ago (Q3 2021) which was 1.45%.

The usual caution should be applied to those drive models that have wide confidence intervals, one percent or greater. Such a gap indicates there is not enough data or that the data we do have is not readily predictable.

That said, we do have plenty of drive models for which we have solid data. Below we’ve extracted the 12TB, 14TB, and 16TB models from the lifetime table above that have a Lifetime AFR of less than 1% and have a confidence interval of 0.5% or less. These are hard drives which, up to this point, have shown solid reliability in our environment.

The Hard Drive Stats Data

The complete data set used to create the information in this review is available on our Hard Drive Test Data page. You can download and use this data for free for your own purpose. All we ask are three things: 1) you cite Backblaze as the source if you use the data, 2) you accept that you are solely responsible for how you use the data, and 3) you do not sell this data to anyone; it is free.
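If you download the data, a per-model AFR can be computed in a few lines. This is a minimal sketch, assuming the daily CSV layout of one row per drive per day with `model` and `failure` columns (where `failure` is 1 on the day a drive fails). The sample rows, serial numbers, and model names below are made up for illustration and are not real Drive Stats data.

```python
import csv
import io
from collections import defaultdict


def afr_by_model(rows):
    """Map each model to (drive_days, failures, AFR percent)."""
    drive_days = defaultdict(int)
    failures = defaultdict(int)
    for row in rows:
        model = row["model"]
        drive_days[model] += 1          # one row equals one drive day
        failures[model] += int(row["failure"])
    return {
        model: (days, failures[model], failures[model] / (days / 365) * 100)
        for model, days in drive_days.items()
    }


# HYPOTHETICAL sample rows standing in for a downloaded daily file.
SAMPLE_CSV = """date,serial_number,model,capacity_bytes,failure
2022-09-28,SN001,EXAMPLE-MODEL-A,16000900608000,0
2022-09-29,SN001,EXAMPLE-MODEL-A,16000900608000,0
2022-09-30,SN001,EXAMPLE-MODEL-A,16000900608000,1
2022-09-28,SN002,EXAMPLE-MODEL-B,4000787030016,0
2022-09-29,SN002,EXAMPLE-MODEL-B,4000787030016,0
"""

stats = afr_by_model(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for model, (days, failed, afr) in sorted(stats.items()):
    print(f"{model}: {days} drive days, {failed} failures, AFR {afr:.2f}%")
```

To run it against real files, replace the `io.StringIO` sample with an open file handle for each downloaded CSV. The tiny sample produces an absurdly high AFR for the first model, which is the small-sample effect to watch for in any model with few drive days.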

If you want the tables and charts used in this report, you can download the .zip file from Backblaze B2 Cloud Storage which contains the .jpg and/or .xlsx files as applicable.

Good luck, and let us know if you find anything interesting.

The post Backblaze Drive Stats for Q3 2022 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Ransomware Takeaways From Q3 2022

Post Syndicated from Jeremy Milk original https://www.backblaze.com/blog/ransomware-takeaways-from-q3-2022/

No matter which way war, the global economy, or superstorms are headed, one thing remains constant: ransomware threats continue to persist and evolve. That’s not new information, of course, but understanding the sophistication of emerging attacks is useful for anyone responsible for defending vulnerable infrastructure. Cybercriminals continue to target more industries such as healthcare and education that might not be as well-equipped to defend themselves. New strategies have allowed them to do more damage.

The landscape continues to change, but staying informed is one of the best ways to protect your organization against the ever-present threat of ransomware. It’s no substitute for comprehensive training for your team and a safely object-locked backup solution, but it never hurts to know too much. Here are a few of the biggest stories in ransomware from Q3.

This post is a part of our ongoing series on ransomware. Take a look at our other posts for more information on how businesses can defend themselves against a ransomware attack, and more.

➔ Download The Complete Guide to Ransomware

1. Threats to “Soft Targets” Are Growing

With businesses ramping up their ransomware protection, cybercriminals have begun shifting toward more so-called “soft targets” including hospitals and small municipal governments. This has proven dangerous, as not only do these targets have fewer resources to devote to cybersecurity, but a compromise of their systems can lead to real-world disaster.

Three different hospitals around the country—CHI Memorial Hospital in Tennessee, hospitals in the St. Luke’s system within Texas, and Virginia Mason Franciscan Health in Seattle—were all recently hit with ransomware attacks, causing widespread delays in patient care. This has become a sadly common story, as attacks continue on healthcare targets.

Ransomware groups have increasingly been targeting school systems as well. One such group, The Vice Society, was recently the subject of an FBI warning that identified its activity as “disproportionately targeting the education sector” and cautioned that attacks against school districts “may increase as the 2022/2023 school year begins and criminal ransomware groups perceive opportunities for successful attacks.”

Key Takeaway: No vertical is safe from the threat of ransomware, but the rise of these threats has led to greater protections specifically for soft target sectors. The Cybersecurity and Infrastructure Security Agency (CISA) has provided a wealth of tools for education, and companies have begun pivoting to create budget-friendly options for cash-strapped public sector CIOs.

2. Ransomware Gangs May Now Be Deploying “Triple Extortion”

This past quarter saw several high-profile attacks against larger businesses, including Cisco, Uber, and Rockstar Games, but it also saw signs that the ongoing war between black hat and white hat hackers may be entering a new realm.

In June, LockBit Ransomware was able to infect systems at Entrust, giving the ransomware gang access to nearly 300GB of data which they threatened to publish if their demands were not met. Entrust did not pay the ransom, and while the company did not claim credit for it, someone shortly after launched a DDoS attack against the site that LockBit was going to use to publish the data.

In retaliation, the LockBit ransomware gang began actively recruiting DDoSers to execute a “triple extortion” tactic, layering the possibility of a DDoS attack on top of attacks via ransomware. In a post to a popular forum for black hat hackers, LockBit’s public face LockBitSupp wrote, “have felt the power of dudos [DDoS] and how it invigorates and makes life more interesting.”

Key Takeaway: Time and time again we see hackers creating new tactics, and simple non-negotiation doesn’t protect your business or solve for operational downtime. We’ve seen that paying ransoms doesn’t stop attacks, and engaging in counterattacks rarely has the desired outcome. Strong defensive strategies, like object lock capability, can’t block cybercriminals from accessing and publishing information, but they do ensure that you have everything you need to bring your business back online as quickly as possible.

3. The Geopolitical Landscape Is Impacting Cybercrime

The Council on Foreign Relations recently released a bombshell report titled, “Confronting Reality in Cyberspace: Foreign Policy for a Fragmented Internet” that outlined the extent to which state-sponsored hackers have begun undermining American sovereignty through attacks. This dovetails with recent reports of the information wars between Russia and Ukraine spilling out beyond the battlefield. A report from Wired showed how pro-Russia group Killnet has launched cyberattacks against 10 different countries for supporting Ukraine.

This isn’t necessarily new information: the 2020 Homeland Security Threat Assessment calls out several nations, including Russia, China, North Korea, and Iran, as likely to employ cybersecurity attacks against the U.S. What is new is that the Senate voted $45 million in support of cybertools that are specifically earmarked to protect the U.S. power grid. Some groups—including the U.S. Government Accountability Office—don’t think that we’re doing enough. The impact here is that we’re not just talking about ransomware attacks exposing private data; we’ve evaluated as likely, and have started protecting ourselves against, attacks that will functionally shut down basic utilities.

Key Takeaway: As the lines blur between malicious hacking and state-sponsored attacks, the sophistication of the threats faced by most businesses and individuals will only grow. New laws and policies may eventually emerge to combat this trend, but until then it will be on you to ensure your infrastructure is safe.

The Bottom Line

The threat of cybercrime will only continue to expand in coming years. No matter what industry you’re in or what size organization’s infrastructure you have been tasked with protecting, continuous vigilance is crucial.

The post Ransomware Takeaways From Q3 2022 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Backblaze at Educause ’22: Fueling Innovation in Higher Education

Post Syndicated from Jennifer Newman original https://www.backblaze.com/blog/backblaze-at-educause-22-fueling-innovation-in-higher-education/


Like many industries, higher education has spent the last decade discovering the transformative power of the cloud and moving into the next century. The cost savings of running a more efficient tech stack and easier access to the vital data it contains have allowed those institutions to pursue the practical and academic discoveries they were built for.

Graduating to Cloud Storage

Across the board, colleges and universities are pushing the boundaries of what cloud storage can do—and their creativity is paying huge dividends for their efficiency and security. We’ve included a few examples below that show just how these institutions have been able to maximize their cloud storage capabilities to reduce costs, modernize outdated operations, protect sensitive student and research data, and extend their ability to provide knowledge to a wider audience.

Citing Our Work

Pittsburg State: Located in Kansas, this university found themselves with nearly five decades of data in harm’s way due to the constant threat of tornadoes. Adding off-premises storage with Backblaze B2 not only gave them the geographical separation they needed, but the addition of a virtual air gap through Object Lock quadrupled their protection against ransomware.

Coast Community College District: CCCD was aiming to update its data management system and eliminate costly delays from tape backups. Their existing tapes needed to be physically chauffeured between the three colleges in the district (Coastline Community College, Golden West College, and Orange Coast College) in friendly L.A. traffic. Backblaze B2’s S3 Compatible APIs made for a seamless integration with Cohesity backup.

UCSC–Silicon Valley: A 22-person video production team at the university’s online learning program, UC–Scout, had quickly reached their storage capacity after archiving thousands of videos. By leveraging Backblaze B2, their IT team was able to streamline the entire production process, saving money and unleashing the team’s full creative potential.

Kanopy: The “Netflix for libraries” overhauled its tech stack in order to share its massive selection of more than 25,000 videos with thousands of schools and public libraries. After migrating to Backblaze B2, Kanopy was able to scale efficiently and rapidly accelerate content onboarding.

Gladstone Institutes: Gladstone Institutes needed an affordable, reliable backup system that would allow their researchers to focus on the life-saving developments they were pursuing in the lab. Cloud storage’s increased reliability allowed them to move away from LTO, and off-premises storage shielded their findings from natural disasters.

Office Hours—No Lectures

If you’re planning to attend Educause ’22, you can learn more about the many possibilities Backblaze opens up in higher education. Through a new partnership between Backblaze and Carahsoft, public sector customers can now leverage their existing state, local, and federal buying programs to access Backblaze B2 Cloud Storage.

In addition to a live demo of Instant Recovery in the Veeam booth, we’re proud to sponsor the Carahsoft Happy Hour Reception. With special cocktails you won’t find anywhere else (try the “Backblaze Special;” you’ll love it), this is a great opportunity to network with fellow educators and learn more about how Backblaze can help you leverage your tech stack.

The post Backblaze at Educause ’22: Fueling Innovation in Higher Education appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

How to Download and Back Up Dropbox Data

Post Syndicated from Lora Maslenitsyna original https://www.backblaze.com/blog/how-to-download-and-back-up-dropbox-data/

If you’ve ever told an IT professional that you’re using Dropbox to back up files and were greeted with a side eye and a stifled “well, actually…” it’s because Dropbox isn’t actually a backup. It’s for syncing data. The distinction is subtle, but critical.

If you’re reading this post, you probably already know that data is always at risk of loss from accidental deletion, system updates, or even a forgotten password that locks you out of your account. The difference between backing up and syncing is that syncing your data will not protect it from these risks.

It’s easy to accidentally lose access to a sync service where you might be keeping files or images that no longer live on your computer. Many colleges and universities now even offer file hosting service subscriptions to students for free—until they graduate. After students earn their diplomas and leave the dorms, these services graduate, too, and students either get locked out of their accounts or have to choose between switching to a free tier and compromising on storage space or paying the fees to keep their existing subscription tier.

To make sure your data stays safe and secure, you’ll want to make sure you have a copy of it on your local device as well as a copy backed up to the cloud. A 3-2-1 backup strategy is always your best bet for securely storing your data. In this post, we’ll walk you through downloading your data from Dropbox and some strategies for backing up your downloaded files.

Back Up Everything But the Kitchen Sync

As we mentioned earlier, saving your data to a sync service is not the same as backing it up. Sync and backup services are complementary, but only a backup will save a copy of your data and keep it safe against accidental deletion, updates, a ransomware attack, and more.

To help you save your synced computer data, we’re developing a series of guides to downloading and backing up your data across different sync services, like OneDrive. Comment below to let us know what other sync services you’d like to see us cover.

How to Download Files From Dropbox

Note: If you are using the Dropbox client to sync the files that are on your computer, the option to download your files may be replaced by an option to open them instead. Clicking on “Open” will open the files directly from the folder on your computer where they are saved.

To download a file or folder from Dropbox, follow these steps:

    1. Sign in to your Dropbox account. (We know, this is pretty self-evident. We’re just trying to be thorough here).
    2. Find the file or folder you’d like to download and hover your cursor over it.
    3. Click on the three dots.
    4. Select Download. Your files will appear in the Downloads folder on your computer, and folders will be downloaded as .zip files.

It’s also important to note that Dropbox only supports downloads of folders that are less than 20GB and contain fewer than 10,000 total files.
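If you keep a locally synced copy of a folder, you can estimate ahead of time whether it falls within those limits. The sketch below is a rough local check only: it walks a folder on your own disk and does not talk to Dropbox. Treating 20GB as 20 × 10^9 bytes is our assumption, and the function names are ours.

```python
import os

MAX_BYTES = 20 * 10**9  # ASSUMPTION: 20GB interpreted as decimal gigabytes
MAX_FILES = 10_000      # folder must contain fewer files than this


def folder_stats(path: str) -> tuple[int, int]:
    """Return (file_count, total_bytes) for every file under path."""
    file_count = 0
    total_bytes = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full_path = os.path.join(root, name)
            if os.path.isfile(full_path):  # skip broken symlinks
                file_count += 1
                total_bytes += os.path.getsize(full_path)
    return file_count, total_bytes


def within_download_limits(path: str) -> bool:
    """True if the folder is under both the size and file-count limits."""
    file_count, total_bytes = folder_stats(path)
    return total_bytes < MAX_BYTES and file_count < MAX_FILES
```

Call `within_download_limits` with the path to your synced folder. If it returns `False`, consider downloading subfolders one at a time.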

How to Back Up Your Dropbox Data

Now that you have all of your Dropbox files downloaded to your computer, you’ll want to follow through with the next steps of the 3-2-1 backup strategy. By saving a copy of your data on an external or secondary device (like a hard drive) and a third copy in an off-site location (like the cloud), your data will be protected from any number of possible risks. Backblaze Personal Backup automatically and continuously backs up a copy of all of the data on your computer to the cloud, making it that much easier to fulfill the 3-2-1 backup strategy.

Bonus: How to Export a File From Dropbox to an App on Your Phone or Mobile Device

If you want to send a portion of your files elsewhere for safekeeping, or to share with another app, you can follow the set of instructions below. Just remember that downloading your files to your phone or emailing them to yourself isn’t the same as keeping a full copy of your data on an external device. Your data is still susceptible to damage or loss.

First, you’ll need to download the Dropbox mobile app to access your synced files on your mobile device.

    1. Open the app and select the three dots next to the file or folder you’d like to export. On an iPhone or iPad, the dots will appear horizontal, and on an Android device they’ll be vertical.
    2. Select Share.
    3. Select Export file, which will show a list of apps that can open the file. Choose the app you’d like to open the file. Note: once you export the file, if you make any changes to that file in the other app, those changes may not be saved back to your Dropbox account unless the app integrates with Dropbox.

Back Up Your Dropbox Before It’s Too Late

Have a lot of Dropbox data you don’t want to take up space on your computer? Upload and store your data in Backblaze B2 Cloud Storage as a part of your 3-2-1 approach. Also, let us know in the comments if you’d like to see more guides to downloading and backing up the data saved to other sync services.

The post How to Download and Back Up Dropbox Data appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Media Workflowing in The Big Apple: NAB Show New York Preview

Post Syndicated from Jeremy Milk original https://www.backblaze.com/blog/media-workflowing-in-the-big-apple-nab-show-new-york-preview/


You can send media in milliseconds to just about every corner of the earth with an origin store at your favorite cloud storage company and a snappy CDN. Sadly, delivering people across continents is a touch more complicated and time intensive. Nevertheless, the Backblaze team is saddling up planes, trains, and automobiles to bring the latest on media workflows to the attendees of NAB Show New York. Whether you’re there in person or virtually, we’ll be discussing and demo-ing all the newest Backblaze B2 Cloud Storage solutions that will ensure your data can travel with ease—no mass transit needed—everywhere you need it to be.

Learn More LIVE in NYC

If you’re attending the NAB Show New York, join us in booth 1239 to learn about integrating B2 Cloud Storage into your workflow. Stop by anytime or you can schedule a meeting here. We’d love to see you.

NAB Show New York Preview: What’s New for Backblaze B2 Media Workflow Solutions

Our booth will have all the goodness you’d expect of us: partners, friendly faces, spots to take a load off and talk about making your data work harder, and, of course, some next-level SWAG. Let’s get into what you can expect.

New Pricing Models and Migration Tools

Our team is on hand to talk you through two new offerings that have been generating a lot of excitement among teams across media organizations:

  • Backblaze B2 Reserve: You can now purchase the Backblaze service many know and love in capacity-based bundles through resellers. If your team seeks 100% budget predictability with transaction fees and premium support included, you should check out this new offering. Check it out here.
  • Universal Data Migration: Recently an International Broadcasting Convention (IBC) 2022 Best of Show nominee, the service makes it easy and FREE to move data into Backblaze from legacy cloud, on-premises, and LTO/tape origins. If your current data storage is holding your team or your budget back, we’ll pay to free your media and move it to B2 Cloud Storage. Learn more here.

Six Flavors of Media Workflow Deep Dives

We’ve gathered materials and expertise to discuss or demo our six most asked about workflow improvements. We’re happy to talk about many other tools and improvements, but here are the six areas we expect to talk about the most:

  1. Moving more (or all) media production to the cloud. Ensuring everyone—clients, collaborators, employers, everyone—has easy real-time access to content is essential for the inevitable geographical distribution of modern media workflows.
  2. Reducing costs. Cloud workflows don’t need to come with costly gotchas, minimum retention penalties, and/or high costs when you actually want to use your content. We’ll explain how the right partners will unlock your budget so you can save on cloud services and spend more on creative projects.
  3. Streamlining delivery. Pairing cloud storage with the right CDN is essential to making sure your media is consumable and monetizable at the edge. From streaming services to ecommerce outlets to legacy media outlets, we’ve helped every type of media organization do more with their content.
  4. Freeing storage. Empty your expensive on-prem storage and stop adding HDs and tapes to the pile by moving finished projects to always-hot cloud storage. This doesn’t just free up space and money: Instantly accessible archives mean you can work with and monetize older content with little friction in your creative process.
  5. Safeguarding content. All those tapes or HDs on a shelf, in the closet, or wherever you keep them are hard to manage and harder to access and use. Parking everything safely and securely in the cloud means all that data is centrally accessible, protected, and available for more use.
  6. Backing up (better!). Yes, we’ve got roots in backup going back >15 years—so when it comes to making sure your precious media is protected with easy access for speedy recovery, we’ve got a few thoughts (and solutions).

Partners, Partners, and More Partners…

“The more we get together, the happier we’ll be,” might as well be the theme lyric of cloud workflows. Combining best of breed platforms unlocks better value and functionality, and offers you the ability to build your cloud stack exactly how you need it for your business. We’ve got a large ecosystem of Alliance Partners, and we’re happy to get deep into your needs and demo how you can combine Backblaze B2 Cloud Storage with one or more partners including iconik, LucidLink, Synology (who will also be right next to us in the Javits Center!), and Fastly to best achieve your objectives.

Hoping to visit NAB Show New York but not yet registered? All good. You can register free on the NAB site with promo code NY4429.

Hoping We Can Help You Soon

Whether it’s in person at NAB Show New York or virtually when it works for you, we’d love to walk you through any of the solutions we can serve for hardworking media teams. If you will be in Manhattan, schedule a meeting to ensure you’ll get the right expert on our team, then stick around for the swag and good times. This invitation applies to you too, Channel Partners and Resellers—whether you have active projects or just want to learn more, let’s meet up and chat about ways to deliver more value together. If you’re not making the trip, not a problem. Just contact us here so we can arrange to help virtually.

The post Media Workflowing in The Big Apple: NAB Show New York Preview appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Welcoming Chief Human Resources Officer Robert Fitt to Backblaze

Post Syndicated from Patrick Thomas original https://www.backblaze.com/blog/welcoming-chief-human-resources-officer-robert-fitt-to-backblaze/

Backblaze is happy to announce that Robert Fitt has joined our team as our first Chief Human Resources Officer (CHRO). Robert will lead the company’s strategic advancement for all aspects of human resources (HR), including hiring, people management and development, engagement, health and wellness initiatives, and outreach to the community.

We’re Growing—But We’re Still The Same Backblaze

Backblaze is recognized for talent retention and company culture—in the past few years we’ve received numerous awards for culture, diversity, and leadership from places like Comparably, Inc., Great Place to Work, and others. The addition of a seasoned CHRO will help us to continue this excellent trend as well as enabling our next phase of growth initiatives following our recent IPO in November 2021.

“Culture is critical in times of rapid growth and we want to continue scaling our world-class organization and great team alongside our growth as the leading independent storage cloud,” Gleb Budman, our CEO and Chairperson, commented. “Robert is an experienced leader with the skills to help us do that. We are excited to welcome him to Backblaze.”

The Skills Robert Brings to Backblaze

Robert has a long track record of success in helping organizations scale rapidly while also championing healthy company culture. His executive experience includes leading HR functions at Turntide Technologies, 360 Behavioral Health, Mobilite, Broadcom Corporation, and others. Additionally, Robert founded and managed Green Talent Co, an independent talent and HR advisory firm. Robert has scaled and led HR teams across the US, Canada, Asia, and Europe in the software, hardware, telecom, and healthcare industries. He brings a people-first philosophy to everything he does, and is fiercely passionate about the employee and candidate experience.

“I’m proud to be joining a company that is committed to developing talent at all levels of the organization,” Robert said in reference to joining Backblaze. “I’m looking forward to working with leaders who have been recognized for promoting diversity, culture, and inclusion as we continue to focus on people and culture as a strategic priority.”

Robert earned his bachelor’s degree in Human Resources Management from Staffordshire University, and his master’s degree in employment law from the University of East Anglia. He also volunteers as a pro bono HR consultant for Catchafire, a social good platform that matches professionals with nonprofits to volunteer their services.

Originally from the UK, Robert and his family hit the road 14 years ago and he is now based in Los Angeles, where he resides with his wife, three boys (ages 8, 17, and 20), and two dogs. Taking advantage of the wonderful Southern California weather, Robert enjoys keeping fit, cycling, and sharing his eclectic music taste.

Backblaze Is Hiring

From day one, Backblaze has worked hard to bring our values to life, creating a transparent, sustainable, innovative (and dare we say flat-out good) place to work. Want to join the team? Check out our open opportunities.

The post Welcoming Chief Human Resources Officer Robert Fitt to Backblaze appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Announcing Tech Day ’22: Live Tech Talks, Demos, and Dialogues

Post Syndicated from Amrit Singh original https://www.backblaze.com/blog/announcing-tech-day-22-live-tech-talks-demos-and-dialogues/

For those looking to build and grow blazing applications and do more with their data, we’d like to welcome you to this year’s Tech Day ’22. We have a great community that works with Backblaze B2 Cloud Storage, including our internal team, IT professionals, developers, tech decision makers, cloud partners, and more. We felt it was high time to bring you all together again to share ideas, discuss upcoming changes, win some swag, and network.

Join our Technical Evangelists in live interactive sessions, demos, and tech talks that help you unlock your cloud potential and put B2 Cloud Storage to work for you. Whatever your role in the tech world—or if you’re simply curious about leveraging the Backblaze B2 platform—we invite you to join us!

➔ Register Now

Here’s What to Expect at Tech Day ’22

Tech Day ’22 is happening October 31, 10 a.m. PT. Can’t make it? Sign up anyway and we’ll share the event recording straight to your inbox.

IaaS Unboxed

A live chat about leveraging the independent cloud ecosystem for storage, compute, delivery, and backup, along with a customer showcase.

Sneak Peek

An early look at the Q3 2022 Drive Stats data with Andy Klein as he walks through the latest learnings to inform your thinking and purchase decisions.

Hands-On Demos

Pat Patterson (Chief Technical Evangelist) and Greg Hamer (Senior Developer Evangelist) team up to facilitate an action-packed set of interactive sessions aimed at helping you do more in the cloud. If you don’t have an account already, you’ll definitely want to create a free Backblaze B2 account so you can follow along. All you need to do is sign up with your email and create a password. It’s really that easy.

  • Scaling a Social App with Appwrite: Appwrite is a self-hosted backend-as-a-service platform that provides developers with all the core APIs required to build any application. Appwrite’s storage abstraction allows developers to store project files in a range of devices, including Backblaze B2. In this session, you’ll learn how to get started with Appwrite, and quickly build a social app that stores user-generated content in a Backblaze B2 Bucket.
  • Go Serverless with Fastly Compute@Edge: Fastly has long been a Backblaze partner—mutual customers are able to serve assets stored in Backblaze B2 Buckets via Fastly’s global content delivery network with zero download charges from Backblaze B2. Compute@Edge leverages Fastly’s network to enable developers to create high-scale, globally-distributed applications, and execute code at the edge. Discover how to build a simple serverless application in JavaScript and deploy it globally with a single command.
  • Provisioning Resources with the Backblaze B2 Terraform Provider: HashiCorp Terraform is an open-source infrastructure-as-code software tool that enables you to safely and predictably create, change, and improve infrastructure. Learn how our Terraform Provider unlocks Backblaze B2’s capabilities for DevOps engineers, allowing you to create, list, and delete Buckets and application keys, as well as upload and download objects.
  • Storing and Querying Analytical Data with Trino: Trino is a SQL-compliant query engine that supports a wide range of business intelligence and analytical tools, allowing you to write queries against structured and semi-structured data in a variety of formats and storage locations. We’ll share how we optimized Backblaze’s Drive Stats data for queries and used Trino to gain new insights into nine years of real-world data.

And So Much More

Join the live Q&A and our user community of tech leaders, IT pros, and developers like you. Register for free to grab your spot (and swag) and we’ll see you on October 31.

➔ Register Now

The post Announcing Tech Day ‘22: Live Tech Talks, Demos, and Dialogues appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

The Storage Pod Story: Innovation to Commodity

Post Syndicated from original https://www.backblaze.com/blog/the-storage-pod-story-innovation-to-commodity/

It has been over six years since we released Storage Pod 6.0. Yes, we have improved that system since then, several times. We’ve added more memory, upgraded the CPU, and of course deployed larger disks. I suppose we could have written blog posts about those improvements, a Storage Pod 6.X post or two or three, but somehow that felt a bit hollow.

About 18 months ago, we talked about The Next Backblaze Storage Pod. We had started using Dell servers in our Amsterdam data center, although we were still building and deploying the version 6.X storage pods in our U.S. data centers. That changed about six months ago and we haven’t built or deployed a Backblaze Storage Pod since that time. Here’s what we’ve done instead.

A Backblaze-Worthy Storage Server

In September of 2019, we wrote a blog post to celebrate the 10 year anniversary of open sourcing our Storage Pod design. In that post we mused about the build/buy decision and stated the criteria we needed to consider if we were going to buy storage servers from someone else: cost, ease of maintenance, the use of commodity parts, ability to scale production, and so on. Also in that post, we compiled a list of storage servers on the market at the time which were similar to our Storage Pod design.

We then proceeded to test several different storage servers from the list and elsewhere. The testing was done over a period of about a year using the criteria noted earlier. The process progressed and one server, a 60-drive Supermicro server, was selected to move on to the next stage, production performance testing.

Here we would observe the server’s performance and test its compatibility with our operational architecture. We built a vault of 20 Supermicro servers and placed it into production, and at the same time we placed a standard Storage Pod vault into production. The two vaults ran the same software and we would track each vault’s performance throughout.

When a Backblaze Vault enters production, 60 tomes of storage come online at the same time, joining thousands of other tomes ready to receive data. Each vault has the same opportunity to load data, but the amount it actually loads depends on how quickly the vault processes the requests it receives. In general, the more performant the vault, the more data it can upload each day.

The comparison of how much data each vault uploaded each day is shown below. Vault 1084 is composed of 20 Supermicro servers and Vault 1085 is composed of 20 Backblaze Storage Pods.

The Supermicro vault (1084) started with a limit of 2,500 simultaneous connections allowed for the first seven days. Once that limit was lifted and both vaults were set to 5,000 simultaneous connections, the Supermicro vault generally outperformed the Backblaze vault over the remainder of the observation period.

What happened to the data once the test was over? It stayed in the Supermicro vault and that vault became a permanent part of our production environment. It is still in operation today, joined by over 1,100 additional Supermicro servers. Safe to say, we moved ahead with using the Supermicro servers in our environment in place of building new Storage Pods.

The Server Model We Use

The model we order from Supermicro is the PIO-5049P-E1CR60L (PIO-5049). That model is not sold via the Supermicro website. That said, model SSG-6049P-E1CR60L (SSG-6049) is similar and is widely available. Both models have 60 drives, but the chassis is slightly different and the motherboards differ: the PIO-5049 model has a single CPU slot, while the SSG-6049 model has two. Let’s compare the basics of the two models below.

In practice, the Supermicro SSG-6049 model supports newer components such as the latest CPUs and allows more memory versus the Supermicro PIO-5049 model, but the latter is more than capable of supporting our needs.

Can You Build It?

A little over 13 years ago, we wrote the Petabytes on a Budget blog post introducing Backblaze Storage Pods to the world and open sourcing the design. Since then, many individuals, organizations, and businesses have taken the various storage pod designs we published over the years and built their own storage servers. That’s awesome.

We know building a Storage Pod was not easy. Oh, the assembly was simple enough, but getting all the parts you needed was a challenge: searching endlessly for 5-port backplanes (minimum order quantity 1,000; ouch, sorry) or having to build your own power supply cables. While many of you enjoyed the challenge, many didn’t.

For the Supermicro system, let’s work with the Supermicro SSG-6049 model, as it is available to everyone, and see what it would take for you to acquire/assemble/build a single Supermicro storage server.

Option One: Go Standard

The easiest thing to do is to order a pre-configured SSG-6049 model from Supermicro or you can try one of their online reseller sites such as Canada Computers & Electronics or ComputerLink, which offer the same “complete system”. In these cases, the ability to customize the server is minimal and requires direct contact with the vendor for most changes. If that works for you, then you’re all set.

Option Two: Configure

If you want to design your own system you can try Supermicro resellers such as IT Creations (US) and Server Simply (EU) which have configurators that allow you to select your CPU, motherboard, network cards, memory, and various other components. This is a great option but given the number of different options and the possibility of incompatibilities between components, you need to be careful here. Don’t rely on the configurator to catch a component mismatch.

Option Three: Create

Here you might buy the most stripped-down server you can find and replace nearly everything inside: motherboard, CPU, fans, switches, cables, and so on. You’ll probably void any warranty you had on the system, but we suspect you knew that already. Regardless, you can take the base system and stuff it full of smoking-fast everything so that your copy of “Ferris Bueller’s Day Off” downloads in picoseconds. That’s the fun part of building your own storage server: when you are done, it is uniquely yours.

Which option you choose is, of course, your choice, and while ordering a standard system from Supermicro may not be as satisfying as soldering heat sinks to the motherboard or neatly tying off your SATA cable runs, it will give you more time to watch Ferris, so there’s that.

FYI, Supermicro has an extensive network of resellers around the world. While the options above fall neatly into three categories, each reseller has their own way of working with their clients. If you are going to buy or build your own Supermicro storage server or have already done so, share your experience with your colleagues in the comments below or on your favorite forum or community site.

What About Pricing

Supermicro does not publish prices and we are not going to out them here, but we wanted to see if we could determine the street price for the Supermicro SSG-6049 system by surveying reseller websites. It was not pretty. In our research, we saw prices for the Supermicro SSG-6049 model range from $6K to $40K on different reseller sites. On the website with the $6K price, they started with a fictitious base system that you could not order, and then listed the various components you were required to add, such as CPU, memory, hard drives, etc. At the $40K website, the reseller didn’t bother to list any of the components; it just had the model and the price, with no specs or technical information. Classic buyer beware scenarios in both cases.

The other variable that made the street price hard to determine was that resellers often bundled other services into the price of the system such as installation, annual maintenance, and even shipping. All are reasonable services for a reseller to offer, but they cloud the picture when trying to determine the actual cost of the product you are trying to buy. At best, we can say that the street price is somewhere between $20K and $30K, but we are not very confident with that range.

Storage Server Pricing Over Time

Since 2009 we have tracked the cost per GB of each Storage Pod version we have produced. We’ve updated the chart below to add both Storage Pod version 6.X, our most current Storage Pod configuration, and the Supermicro storage server we are buying, model PIO-5049.

The cost per GB is computed by taking the total hardware cost of a storage server, including the hard drives, and dividing by the total storage in the server at the time. When Storage Pod 1.0 was released in September 2009, the system cost was about $0.12/GB, and as you can see that has decreased over time to $0.02/GB in the Supermicro systems.

One point to note is that both the Storage Pod 6.X ($0.028/GB) and Supermicro ($0.020/GB) servers use the same 16TB hard drive models. We believe the difference between the cost per GB of the two cohorts ($0.008) is primarily based on the operational efficiency obtained by Supermicro in making and selling tens of thousands of units a month versus Backblaze assembling a hundred 6.X Storage Pods on our own. In other words, Supermicro’s scale of production has enabled us to get performant systems for less than if we continued to build them ourselves.
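
To make the cost-per-GB arithmetic concrete, here is a minimal Python sketch of the calculation described above. The dollar totals are hypothetical inputs, picked only so the results reproduce the rounded figures quoted in this post; they are not actual Backblaze or Supermicro prices. The sketch also assumes 60 drives per server for both systems and decimal units (1TB = 1,000GB).

```python
def cost_per_gb(total_hardware_cost_usd: float, drive_count: int, drive_size_tb: float) -> float:
    """Total hardware cost (drives included) divided by raw capacity in GB."""
    capacity_gb = drive_count * drive_size_tb * 1000  # decimal units: 1 TB = 1,000 GB
    return total_hardware_cost_usd / capacity_gb


# Hypothetical totals, chosen only to reproduce the rounded per-GB figures above.
supermicro = cost_per_gb(19_200, drive_count=60, drive_size_tb=16)
pod_6x = cost_per_gb(26_880, drive_count=60, drive_size_tb=16)

print(f"Supermicro:      ${supermicro:.3f}/GB")
print(f"Storage Pod 6.X: ${pod_6x:.3f}/GB")
print(f"Difference:      ${pod_6x - supermicro:.3f}/GB")
```

Because both cohorts use the same drives, the capacity term is identical and the whole per-GB gap comes from the difference in total system cost.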

What’s Next for Storage Pods

No one here at Backblaze is ready to write the obituary for our beloved red Backblaze Storage Pods. After all, the innovation that was the Storage Pod created the opportunity for Backblaze to change the dynamics of the storage market. Now that the Storage Pod hardware has been commoditized, our cloud storage software platform is what enables us to continue to deliver value to businesses and individuals alike.

All that means is that our next Storage Pod probably won’t be an incremental change, but instead something completely new, at least for us. It may not even be a Storage Pod—who knows? That said, we will continue to upgrade our existing Storage Pods with new CPUs, memory, and such, and they’ll be around for years to come. At which point we may give them away or crush them (again). In the meantime, we’ll probably do another blog post or two so we can post a few pictures and tell a few stories. Or maybe we’ll just move on. Hard to say right now.

Thanks to all our Storage Pod readers for your comments and suggestions over the years. You’ve made us better along the way and we look forward to continuing to hear from you as our journey continues.

The post The Storage Pod Story: Innovation to Commodity appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

“An Ideal Solution”: Daltix’s Automated Data Lake Archive Saves $100K

Post Syndicated from Amrit Singh original https://www.backblaze.com/blog/an-ideal-solution-daltixs-automated-data-lake-archive-saves-100k/

In the fast-moving consumer goods space, Daltix is a pioneer in providing complete, transparent, and high-quality retail data. With global industry leaders like GFK and Unilever depending on their pricing, product, promotion, and location data to build go-to market strategies and make critical decisions, maintaining a reliable data ecosystem is an imperative for Daltix.

As the company has grown since its founding in 2016, the amount of data Daltix is processing has increased exponentially. They’re currently managing around 250TB, but that amount is spread across billions of files, which soon created a massive drag on time and resources. With an infrastructure built almost entirely around AWS and billions of minuscule files to manage, Daltix started to outgrow AWS’ storage options in both scalability and cost efficiency.

The Daltix team in Belgium.

We got to chat with Charlie Orford, Principal Software Engineer for Daltix, about how Daltix switched to Backblaze B2 Cloud Storage and their takeaways from that process. Here are some highlights:

  • They used a custom engine to migrate billions of files from AWS S3 to Backblaze B2.
  • Monthly costs reduced by $2,500 while increasing data portability and reliability.
  • Daltix established the infrastructure to automatically back up 8.4 million data objects every day.

Read on to learn how they did it.

A Complex Data Pipeline Built Around AWS

Most of the S3-based infrastructure Daltix built in the company’s early days is still intact. Historically, the data pipeline started with web-scraped resources written directly to Amazon S3, which were then standardized by Lambda-based extractors before being sent back to S3. Then AWS Batch picked up the resources to be augmented and enriched using other data sources.

All those steps took place before the data was ready for Daltix’s team of analysts. In order to optimize the pipeline and increase efficiency, Orford started absorbing pieces of that process into Kubernetes. But there was still a data storage problem; Daltix generates about 300GB of compressed data per day, and that figure was growing rapidly. “As we’d scaled up our data collection, we’d had to sharpen our focus on cost control, data portability, and reliability,” said Orford. “They’re obvious, but at scale, they’re extremely important.”

Cost Concerns Inspire The Search For Warm Archival Storage

By 2020, Daltix had started to realize the limitations of building so much of their infrastructure in AWS. For example, heavy customization around S3 metadata made the ability to move objects entirely dependent on the target system’s compatibility with S3. Orford was also concerned about the costs of permanently storing such a huge data lake in S3. As he puts it, “It was clear that there was no need to have everything in S3 forever. If we didn’t do anything about it, our S3 costs were going to continue to rise and eventually dwarf virtually all of our other AWS costs.”

Side-by-side comparison of server costs.

Because Daltix works with billions of tiny files, using Glacier was out of the question as its pricing model is based around retrieval fees. Even using Glacier Instant Retrieval, the sheer number of files Daltix works with would have forced them to rack up an additional $200,000 in fees per year. So Daltix’s data collection team—which produces more than 85% of the company’s overall data—pushed for an alternative solution that could address a number of competing concerns:

  • The sheer size of the data lake.
  • The need to store raw resources as discrete files (which means that batching is not an option).
  • Limitations on the team’s ability to invest time and effort.
  • A desire for simplicity to guarantee the solution’s reliability.

Daltix settled on using Amazon S3 for hot storage and moving warm storage into a new archival solution, which would reduce costs while keeping priority data accessible—even if the intention is to keep files stored away. “It was important to find something that would be very easy to integrate, have a low development risk, and start meaningfully eating into our costs,” said Orford. “For us, Backblaze really ticked all the boxes.”

Initial Migration Unlocks Immediate Savings of $2,000 Per Month

Before launching into a full migration, Orford and his team tested a proof of concept (POC) to make sure the solution addressed his key priorities:

  • Making sure the huge volume of data was migrated successfully.
  • Avoiding data corruption and checking for errors with audit logs.
  • Preserving custom metadata on each individual object.
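
The checks in this list can be sketched as a simple audit that compares a manifest of the source bucket against a manifest of the target bucket. To be clear, this is not the migration engine described in this post. It is a minimal illustration, and the manifest layout (object key mapped to a `checksum` and a `metadata` dictionary) is an assumption made for the example.

```python
def audit_migration(source: dict, target: dict) -> dict:
    """Compare two manifests keyed by object name and list any problems found.

    Each manifest entry is assumed (hypothetically) to look like:
        {"checksum": "<hex digest>", "metadata": {"key": "value", ...}}
    """
    report = {"missing": [], "checksum_mismatch": [], "metadata_mismatch": []}
    for key, src in source.items():
        dst = target.get(key)
        if dst is None:
            # The object never arrived in the target bucket.
            report["missing"].append(key)
            continue
        if src["checksum"] != dst["checksum"]:
            # Content differs, which points to corruption in transit.
            report["checksum_mismatch"].append(key)
        if src["metadata"] != dst["metadata"]:
            # Custom object-level metadata was not preserved.
            report["metadata_mismatch"].append(key)
    return report


source_manifest = {
    "a.json": {"checksum": "aa11", "metadata": {"shop": "x"}},
    "b.json": {"checksum": "bb22", "metadata": {"shop": "y"}},
    "c.json": {"checksum": "cc33", "metadata": {"shop": "z"}},
}
target_manifest = {
    "a.json": {"checksum": "aa11", "metadata": {"shop": "x"}},
    "b.json": {"checksum": "bb22", "metadata": {}},
}
print(audit_migration(source_manifest, target_manifest))
```

A report with three empty lists is the signal that a proof of concept like this one has met all three priorities.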

“Early on, Backblaze worked with us hand-in-hand to come up with a custom migration tool that fit all our requirements,” said Orford. “That’s what gave us the confidence to proceed.” In partnership with Flexify, Backblaze delivered a tailor-made engine to ensure that the migration process would transfer the entire data lake reliably and with object-level metadata intact. After the initial POC bucket was migrated successfully, Daltix had everything they needed to start modeling and forecasting future costs. “As soon as we started interacting with Backblaze, we stopped looking at other options,” Orford said.

In August 2021, Daltix moved a 120TB bucket of 2.2 billion objects from standard storage in S3 to Backblaze B2 cloud storage. That initial migration alone unlocked an immediate cost savings of $2,000 per month, or $24,000 per year.

A peaceful data lake.

Quadruple the Data, Direct S3 Compatibility, and $100,000 Cumulative Savings

Today, Daltix is migrating about 3.2 million data objects (approximately 70GB of data) from Amazon S3 into Backblaze B2 every day. They keep 18 months of hot data in S3, and as soon as an object reaches 18 months and one day, it becomes eligible for archiving in B2. On the rare occasions that Daltix receives requests for data outside that 18-month window, they can pull data directly from Backblaze B2 into Amazon S3 thanks to Backblaze’s S3-compatible API and ever-available data.
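
The “18 months and one day” rule can be expressed as a small date check. The sketch below is only an illustration of that rule, not Daltix’s actual job. The function names are hypothetical, and it assumes calendar months, with the day clamped when the target month is shorter.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Return the date `months` calendar months after `d`, clamping the day if needed."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def is_eligible_for_archive(created: date, today: date, hot_months: int = 18) -> bool:
    """True once the object is older than the hot window (18 months and one day)."""
    return today > add_months(created, hot_months)


created = date(2021, 1, 15)
print(is_eligible_for_archive(created, date(2022, 7, 15)))  # exactly 18 months old
print(is_eligible_for_archive(created, date(2022, 7, 16)))  # 18 months and one day
```

A daily job could apply a check like this to each object’s creation date to build the list of objects to move that day.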

Daily audit logs summarize how much data has been transferred, and the entire migration process happens automatically every day. “It runs in the background, there’s nothing to manage, we have full visibility, and it’s cost effective,” Orford said. “Backblaze B2 is an ideal solution for us.”

As daily data collection increases and more data ages out of the hot storage window, Orford expects more cost reductions. Orford expects it will take about a year and a half for daily migrations to nearly triple their current levels: that means Daltix will be backing up 9 million objects (about 450GB of data) to Backblaze B2 every day. Taking that long-term view, we see incredible cost savings for Daltix by switching from Amazon S3 to Backblaze B2. “By 2023, we forecast we will have realized a cumulative saving in the region of $75,000-$100,000 on our storage spend thanks to leveraging Backblaze B2, with expected ongoing savings of at least $30,000 per year,” said Orford.

“It runs in the background, there’s nothing to manage, we have full visibility, and it’s cost effective. B2 is an ideal solution for us.” —Charlie Orford, Principal Software Engineer, Daltix

Crunch the Numbers and See for Yourself

Want to find out what your business could do with an extra $30,000 a year? Check out our Cloud Storage Pricing Calculator to see what you could save switching to Backblaze B2.

The post “An Ideal Solution”: Daltix’s Automated Data Lake Archive Saves $100K appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

How to Migrate From LTO to the Cloud

Post Syndicated from Jeremy Milk original https://www.backblaze.com/blog/how-to-migrate-from-lto-to-the-cloud/

Linear Tape-Open (LTO) backups have long been a solid strategy for companies with robust media libraries. The downside of LTO is, of course, the sheer volume of space dedicated to storing these vast piles of tapes, the laboriously slow process of accessing the data on them, and the fact that they can only be accessed where they’re stored, so if there’s a natural disaster or a break-in, your data is at risk. Anyone staring down a shelf sagging under the weight of years of data and picturing the extra editing bay you could put in its place is probably thinking about making a move to the cloud.

Once you have decided to migrate your data, you need a plan to move forward. This article will give you the basic tools for migrating from LTO to the cloud. Before we dive in, let’s talk about some of the vast benefits of migration (other than reclaiming your storage closet).

Benefits of Moving Your Data to the Cloud

Some pretty convincing benefits come with moving away from tape to cloud storage. First is the cost. Some people might think cloud storage is more expensive, but a closer crunching of the numbers proves that it actually saves you money. We’ve created a handy LTO to Cloud Storage calculator to figure out individual savings. If you’re concerned about migration/egress fees, utilizing a Universal Data Migration (UDM) service can help eliminate those costs. In addition, tape players and tapes need maintenance and eventually replacement, adding another budgetary benefit to migrating things to the cloud.

Another benefit is easy access to files. Rather than being hidden among the files on one particular tape in one particular area of one particular stack, files can be accessed, viewed and downloaded immediately from cloud storage. With many industries moving towards remote work, being able to access your files or archives from afar is increasingly important.

So much tape; so little time.

Cloud storage is also more secure than people think. Many cloud service providers offer products like Object Lock to keep files immutable (a huge concern for compliance-heavy industries like healthcare). In the case of a ransomware attack, having your data in off-site cloud storage means that you’re safe from the threat and can restore your data quickly and get back to normal.

With all those benefits, the only concern left is that anytime you make a change to your data infrastructure, you want it to be as easy as possible. Let’s walk through a typical LTO to cloud migration so you can explore how it aligns with your process.

Six Steps to Migrate from LTO to Cloud Storage (or a Hybrid Solution)

Migrating can feel like a daunting task, but breaking it down into bite-sized pieces will help a lot. Fears about data loss and team bandwidth will obviously play a factor in migration. Don’t worry: it’s much easier than you think, and the long-term benefits will outweigh the short-term migration considerations.

Follow the steps below for a seamless, successful migration.

Step One: Take Stock of Your Content

The first concern of migration: how do you ensure that all the data you need to move is there and will be there at the end of the process? Well, now is the time to take a complete content inventory. It may have been a long time since you reviewed what is stored on tape, where it is located, and if you even want to continue keeping it. You may have old, archived data that is safe to get rid of now.

In addition to an inventory, if there was ever a good time to clean out unused/unneeded files, now is the time. It’s also a good opportunity to eliminate any duplicates—that will ensure that you’re not wasting money on storage costs or time and confusion ensuring that you’re looking at the correct file.

Does data fold?

Instead of looking at it as a pain point or chore you dread, consider a content inventory as an opportunity to clean out old files, eliminate waste, and streamline your data to only what you need and want to keep. It’s like inviting Marie Kondo over to ask whether your files spark joy. It’s also a great time to reorganize your files. Consider renaming files and folders to make it easy to retrieve items once they are stored in the cloud. Bonus: this walk down memory lane might spark ideas for refreshing or repurposing old content.
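
If your tape contents are staged on disk while you take inventory, a short script can flag duplicate files for you. The following is a minimal sketch that groups files by their SHA-256 hash. The function names are hypothetical, and it assumes the files are reachable under a single local folder.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large media files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Return groups of files under `root` whose contents are byte-for-byte identical."""
    by_hash = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_hash[file_sha256(path)].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Calling `find_duplicates(Path("/path/to/staged/files"))` returns one list per set of identical files, so you can decide which copy to keep before you pay to store it.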

Step Two: Update Your Tracking System

LTO backups involve rotating many tapes on different days, sorting them by the type of data stored on them, and running them on varying schedules. You will need to update your tracking system to reflect how you will use tape going forward. You can also formulate a plan for tracking your cloud-based backup data. It may be as simple as cataloging where files are located, what type of data needs to be on tape, how often they will be backed up, when files move from hot storage to archive, and so on.

Step Three: Plan for Your Migration

To ensure a successful migration, spend some time planning exactly how to execute the move. Here are a few common questions that come up:

  • Are you moving the data in phases or all at once? If you’re moving data in phases, what needs to move first and why?
  • How many personnel are you dedicating to work on the project? And what kind of support will they need from other stakeholders?
  • Are you planning on keeping any information on tape long-term (a hybrid solution)? Some organizations, such as healthcare providers, government contractors, educational institutions, and accounting firms, are subject to data retention and storage laws, so that might come into play here.

Document how you want to proceed so that everyone involved has their needs met. Planning ahead will help you feel like you have a good handle on things before jumping into the deep water.

Also, it’s important to evaluate your internet bandwidth and speed to ensure you don’t experience any bottlenecks. If you have to upgrade your internet package, do so before you begin migrating. Migrate using an Ethernet-connected device with a stable connection. Wi-Fi is much slower and less reliable. If you’re moving a significant amount of data at once, you may even want to consider something like Backblaze’s Fireball service.
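
A quick back-of-the-envelope calculation shows whether your connection is up to the job. The sketch below estimates upload time from the data size and uplink speed. The `efficiency` factor is an assumption that stands in for protocol overhead and other traffic on the line, and the function name is hypothetical.

```python
def estimate_transfer_days(data_tb: float, uplink_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the days needed to upload `data_tb` terabytes over an `uplink_mbps` link.

    `efficiency` is the assumed fraction of the nominal bandwidth you actually get.
    """
    if uplink_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("uplink_mbps must be positive and efficiency must be in (0, 1]")
    total_bits = data_tb * 1e12 * 8  # decimal terabytes to bits
    seconds = total_bits / (uplink_mbps * 1e6 * efficiency)
    return seconds / 86_400  # seconds per day


for link_mbps in (100, 1_000, 10_000):
    days = estimate_transfer_days(100, link_mbps)
    print(f"100TB over {link_mbps:>6} Mbps: about {days:.1f} days")
```

If the estimate runs to months rather than days, that is a strong hint to upgrade the connection first or to look at a physical transfer option.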

Backblaze’s Fireball, ready to help you transfer data.

Another thing to consider is that the cloud will let you categorize and interact with your data in different ways. For example, with Backblaze B2 storage, you can create up to 1,000 buckets per account to categorize your data your way and keep files separate—how is that different from how you’re currently interacting with your data? Who will have access to your cloud storage backups? Do you need to employ Extended Version History or Object Lock to make sure that your backups aren’t unintentionally changed?

Step Four: Back Up Both Ways

For a short while, you might want to back up to both LTO and the cloud, keeping them in tandem while you ensure a smooth and successful data migration. Once all your critical files have been moved over, you can stop backing up to tape. (Unless your organization has decided that a hybrid model works for you.)

Again, keep in mind that you may want to keep some files archived on tape and stored away. It depends on your industry, compliance issues, and data infrastructure preferences.

Step Five: Execute the Migration

Now it’s time to take the plunge. You can use the Universal Data Migration (UDM) service to move your data over and absorb any egress fees. You can move your data in days, not weeks, streamlining this chore.

All roads lead to cloud.

Step Six: Review and Compare Cloud and LTO Backups

Before you stop running your backup systems concurrently (LTO and cloud), be sure to test your backups thoroughly. When you run those tests, you don’t want to just look at the files; you actually want to restore several files, just as if you’d had them deleted from your system. Run tests restoring individual files and whole folders to ensure data integrity and master the restore process. Make sure to run those tests for your servers and with files in both Mac and PC environments.
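
One way to check a test restore is to compare the restored folder against the original, byte for byte. This is a minimal sketch using Python’s standard `filecmp` module. The function name and report layout are hypothetical, and it assumes both folders are available on the same machine.

```python
import filecmp
from pathlib import Path


def verify_restore(original_dir: Path, restored_dir: Path) -> dict:
    """List files that are missing from, or differ in, the restored copy."""
    report = {"missing": [], "different": []}
    for original in sorted(original_dir.rglob("*")):
        if not original.is_file():
            continue
        relative = original.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            report["missing"].append(relative.as_posix())
        elif not filecmp.cmp(original, restored, shallow=False):
            # shallow=False forces a comparison of file contents, not just size and timestamps.
            report["different"].append(relative.as_posix())
    return report
```

An empty report for both individual files and whole folders is a good sign that the restore process works the way you expect.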

Depending on which backup solution you use, restore procedures may differ. Sometimes, working with a company that provides end-to-end backup and restore services may work well for your organization. For example, many people prefer to back up with Veeam and integrate it with Backblaze B2 Cloud Storage.

At the end of the day, cloud storage offers many benefits like secure storage, easy access, and cost-efficient backups. Once you get past the hurdle of migration, you’ll be glad you made the switch.

Let’s Talk Solutions in Person

If you’re attending the 2022 NAB Show New York, stop by the Backblaze booth for an opportunity to see how making the move from tape to the cloud could help streamline your workflow. If nothing else, you’ll get some great swag out of it! Stop by our booth or schedule a meeting to talk to the team.

The post How to Migrate From LTO to the Cloud appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

The (New) Perks of Being A Backblaze Stock Owner

Post Syndicated from James Kisner original https://www.backblaze.com/blog/the-new-perks-of-being-a-backblaze-stock-owner/

At Backblaze, we’re deeply thankful to the communities that have helped us learn and grow over the past 15 years. From our customers, to our partners, to all of our blog readers and social followers—Backblaze wouldn’t be the same without these folks. Over the years we’ve been able to thank our community with giveaways, events, and even Storage Pods. Which is why we’re excited to announce a new perks program for one of our newest communities: shareholders.

Backblaze Launches Program to Reward Investors

Since our IPO on November 11 of 2021, we’ve been buoyed by the support and commitment of individual investors and today we’re launching a program to thank them by adding some additional perks to being a BLZE stockholder.

We’ve partnered with Stockperks to launch a program that offers the following benefits to investors:

  • For current holders of 1+ shares: You’ll receive a sticker pack to outfit your laptop, car, or any other flat surface with fresh branding from Backblaze.
  • For 10 shares or more, held for at least one month, investors can access one of the following discounted Backblaze products:
  • For 50 shares or more, held for at least one month: You’ll receive a hat with a custom leather patch of the Backblaze logo on the front and our wordmark on the back.

For details on signing up, redeeming your perks, and other terms and conditions (see the fine print), you can download the Stockperks app here. Once you’ve created a profile, you’ll be able to start exploring the $BLZE perks program and hearing from Backblaze management as we provide investor-related updates about the firm.

About Stockperks

Stockperks is reimagining and revolutionizing how retail investors and companies connect. It’s the first multi-channel marketplace where individual investors get the perks of company ownership, companies create a community of engaged, informed and loyal individual investors, and everyone is invested in the company’s success.

Why Stockperks, and What’s Next?

From day one as a public company, we’ve tried to engage our community as deeply as possible. From inviting customers to participate in our initial public offering, to gathering investor questions through the SAY Connect platform prior to earnings calls, and now to this new partnership with Stockperks, our approach reflects the customer-centric way we have run the business since we were founded. Growing a public company and investing in public companies are best viewed as long-term commitments, and our engagement with the community is the same.

We’re also pleased to share that in Q4 of this year we plan to launch our “Stocks and Storage” video blog (vlog). In the tradition of our widely-read blog, our goal with Stocks and Storage vlog is to provide useful, relevant information to our viewership. But while our blog primarily focuses on storage topics, the Stocks and Storage vlog aims to demystify financial topics (what exactly is EBITDA, anyway?) from the perspective of a newly-public technology company in Silicon Valley.

We thank you for your support and interest, and look forward to continuing our journey together on our mission to make storing, using, and protecting data astonishingly easy.

The post The (New) Perks of Being A Backblaze Stock Owner appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

How to Download and Back Up OneDrive Data

Post Syndicated from Lora Maslenitsyna original https://www.backblaze.com/blog/how-to-download-and-back-up-onedrive-data/

On the Backblaze blog, we’ve been sharing tips for backing up and doing more with your data, especially when it comes to data digitally scattered across social media platforms. We asked what you, our readers, wanted to know about protecting your data online and you responded with this question: How do you download and back up data on Microsoft OneDrive?

OneDrive is an online file hosting service that many users take advantage of to sync the data on their computer. Although it offers 5GB of storage space for free, users pay a fee to store more, with paid tiers of 100GB, 1TB, and 6TB. OneDrive even notes on their website that to free up space on your OneDrive account, you should download the file or folder to a location outside of your OneDrive folders, and then delete the OneDrive copy to reduce your storage amount. Of course, this means that the document is no longer syncing. And by doing so, you’re forced to constantly juggle the amount of data saved in OneDrive to stay under the free limit. Worse yet, the data you remove from OneDrive is no longer protected using the 3-2-1 backup method.

This guide walks you through ensuring your data on OneDrive is safely backed up and how to keep your data safe using the 3-2-1 backup strategy. So, read on to learn how to save your OneDrive data, including:

  • A step-by-step guide to accessing and downloading your data.
  • What to do with your downloaded OneDrive data to ensure it stays protected.

Back Up Everything But the Kitchen Sync

If you’re reading this blog post, you probably already know that saving your data to a sync service is not the same as backing it up. Sync and backup services are complementary, but only a backup will save a copy of your data and keep it safe against accidental deletion, updates, a ransomware attack, and more.

To help you save your synced computer data, we’re developing a series of guides to downloading and backing up your data across different sync services. Below is a list of our other guides, and comment below to let us know what other sync services you’d like to see us cover.

How to Download Data From Microsoft OneDrive

    1. Open your OneDrive account and select the files or folders you want to download. You can select individual items by clicking the circle check box next to each item. You can also select several files at once by clicking on one file, scrolling down the list, then left-clicking while holding down the Shift key on the last item in the list you want to select. To select all of the files in a folder, click the circle to the left of the top row, or simply press CTRL + A (or COMMAND + A on a Mac).
    2. In the top menu, select Download. You can also right-click an individual file and select Download. If you choose multiple files or folders and then select Download, your browser will download a ZIP file containing all the data you selected. If you’re in a folder and you select Download without selecting any files or folders, your browser will download everything saved in that folder.
    3. Save your OneDrive data on your computer. Your browser will download your files to the Downloads folder of your computer. Select the files and save them to a permanent location. For some users, your browser may prompt you to choose the location where you want to save the download.

    Now that you’ve downloaded your OneDrive data, keep reading to find out how to ensure that data is safely backed up.

    The 3-2-1 Method in a Nutshell

    Back up your data based on these principles:

    1. Redundancy. Have several copies of your data.
    2. Geographic Distance. Have those copies in different locations.
    3. Access. Have different types of access to your backup data. For example, to reduce the risk of cyberattacks, you don’t want every copy of your data connected to the internet. You also don’t want all copies of your data stored in your home in case of disaster or theft.

    How to Back Up OneDrive Data

    Once you have all of your OneDrive data downloaded on your computer, you’ve fulfilled the first step of the 3-2-1 backup strategy by storing your data on your local device. Next, save your data on a secondary, external device and in a third, off-site location. Cloud storage is one of the best options for easily securing your data off-site.

    If you’re using Backblaze Personal Backup to protect all of the data on your computer and external drives, you’re all set! Backblaze automatically and continuously backs up a copy of all of your data to the cloud.

    Another option to consider when you want to securely store your data and offload some of it from your local device is to upload your data to Backblaze B2 Cloud Storage directly. As long as you are still keeping a copy of that data on other local drives or devices, you’re still fulfilling the 3-2-1 backup method. You can learn more about the difference between using Personal Backup and B2 Cloud Storage and how to save and organize your data in cloud storage by reading this blog post.

    Read On to Get the Most Out of Backblaze and OneDrive

    Our help section is filled with useful guides on maximizing the integration of Backblaze and OneDrive. Check out our guides for Windows or Mac to learn more.

    Don’t Rely on Sync Services to Secure Your Data

    Chances are, the data you have saved in your OneDrive folders is data you want to keep. Don’t wait until you accidentally get locked out of your account or a software update wreaks havoc on your synced data. Back up your data today, and comment below to let us know what else you’d like to know about to help you keep your data safe.

The post How to Download and Back Up OneDrive Data appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Announcing: Backblaze Computer Backup v8.5

Post Syndicated from original https://www.backblaze.com/blog/announcing-backblaze-computer-backup-v8-5/

Announcing Backblaze Computer Backup 8.5! Our latest release builds on version 8.0’s speed boosts and brings with it not only a redesigned application for Mac and PC, but an improvement to our Safety Freeze feature, which prevents your backups from becoming corrupted. Here’s a brief overview of what’s new for this release:

macOS and Windows:

  • Safety Freeze enhancements: improved efficiency and reduced false positives.
  • Visual refresh: We’ve completely refreshed both of our client apps.
  • Minor text, bug, and performance improvements: We’ve cleaned up some of the language to make things easier to understand and have tightened up some of the code.

macOS:

  • SwiftUI redesign: the macOS app has been completely redesigned from the ground up using SwiftUI.

In More Detail

Safety Freeze Enhancements

Our Safety Freeze feature is designed to protect your backups and prevent them from being corrupted if something goes wrong on your computer. Over the years we’ve updated the feature based on feedback and tried to make it more transparent to the end user. With the updates in version 8.5, we’ve added a self-healing component which attempts to fix some of the false positives that caused an erroneous Safety Freeze to occur, especially when a user is moving from one computer to another.

Visual Refresh

With the SwiftUI redesign on macOS, we felt now would be the perfect time to also change some of the visuals in our apps. We’ve updated both of our client apps to make them better looking, simpler to use, less cluttered, and easier to understand.

SwiftUI Redesign

In preparation for macOS Ventura, we’ve rewritten the macOS app in SwiftUI. There’s nothing but good news here. This refresh helps future-proof our macOS app and also keeps the same system efficiency you know and love from Backblaze-built applications.

General Performance Improvements

Everyone’s favorite: “general bug fixes and performance improvements.” We’ve also updated and simplified a lot of our client text to go along with the visual refresh and deliver a better, easier-to-understand overall app.

Backblaze v8.5 Is Available Today: September 15, 2022

We hope you love this new release! We will be slowly auto-updating all users in the coming weeks, but if you can’t wait and want to update now on your Mac or PC:

  1. Right-click on the Backblaze icon in your menu or taskbar.
  2. Select Check for Updates.
  3. Download v8.5 from the Backblaze Updates page.

Also, this version is now the default download on www.backblaze.com. Please reach out to support if you have any questions or if you want to give feedback—we always like to know how things are going.

The post Announcing: Backblaze Computer Backup v8.5 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Lights, Camera, Custom Action (Part Two): Inside Integrating Frame.io + Backblaze B2

Post Syndicated from Pat Patterson original https://www.backblaze.com/blog/lights-camera-custom-action-part-two-inside-integrating-frame-io-backblaze-b2/

Part 2 in a series on the Frame.io/Backblaze B2 integration, covering the implementation. See Part 1 here, which covers the UI.

In Lights, Camera, Custom Action: Integrating Frame.io with Backblaze B2, we described a custom action for the Frame.io cloud-based media asset management (MAM) platform. The custom action allows users to export assets and projects from Frame.io to Backblaze B2 Cloud Storage and import them back from Backblaze B2 to Frame.io.

The custom action is implemented as a Node.js web service using the Express framework, and its complete source code is open-sourced under the MIT license in the backblaze-frameio GitHub repository. In this blog entry we’ll focus on how we secured the solution, how we made it deployable anywhere (including to options with free bandwidth), and how you can customize it to your needs.

What is a Custom Action?

Custom Actions are a way for you to build integrations directly into Frame.io as programmable UI components. This enables event-based workflows that can be triggered by users within the app, but controlled by an external system. You create custom actions in the Frame.io Developer Site, specifying a name (shown as a menu item in the Frame.io UI), URL, and Frame.io team, among other properties. The user sees the custom action in the contextual/right-click dropdown menu available on each asset.

When the user selects the custom action menu item, Frame.io sends an HTTP POST request to the custom action URL, containing the asset’s id. For example:

{
  "action_id": "2444cccc-7777-4a11-8ddd-05aa45bb956b",
  "interaction_id": "aafa3qq2-c1f6-4111-92b2-4aa64277c33f",
  "resource": {
    "type": "asset",
    "id": "9q2e5555-3a22-44dd-888a-abbb72c3333b"
  },
  "type": "my.action"
}

The custom action can optionally respond with a JSON description of a form to gather more information from the user. For example, our custom action needs to know whether the user wishes to export or import data, so its response is:

{
  "title": "Import or Export?",
  "description": "Import from Backblaze B2, or export to Backblaze B2?",
  "fields": [
    {
      "type": "select",
      "label": "Import or Export",
      "name": "copytype",
      "options": [
        {
          "name": "Export to Backblaze B2",
          "value": "export"
        },
        {
          "name": "Import from Backblaze B2",
          "value": "import"
        }
      ]
    }
  ]
}

When the user submits the form, Frame.io sends another HTTP POST request to the custom action URL, containing the data entered by the user. The custom action can respond with a form as many times as necessary to gather the data it needs, at which point it responds with a suitable message. For example, when it has all the information it needs to export data, our custom action indicates that an asynchronous job has been initiated:

{
  "title": "Job submitted!",
  "description": "Export job submitted for asset."
}
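The request/response exchange above can be sketched as a single function that decides whether to return the form or the final message. This is a minimal sketch and not the code from the backblaze-frameio repository: the function name `handleCustomAction` is hypothetical, and it assumes the submitted form values arrive in a `data` object on the second POST body. It also skips the extra details a real import would need to collect.

```javascript
// Form description returned on the first request (taken from the example above).
const importExportForm = {
  title: 'Import or Export?',
  description: 'Import from Backblaze B2, or export to Backblaze B2?',
  fields: [
    {
      type: 'select',
      label: 'Import or Export',
      name: 'copytype',
      options: [
        { name: 'Export to Backblaze B2', value: 'export' },
        { name: 'Import from Backblaze B2', value: 'import' },
      ],
    },
  ],
};

// Hypothetical handler: takes the parsed POST body, returns the JSON to send back.
// Assumption: submitted form values are delivered under `body.data`.
function handleCustomAction(body) {
  const copytype = body.data && body.data.copytype;

  // No (or unrecognized) choice yet: ask the user by responding with the form.
  if (copytype !== 'export' && copytype !== 'import') {
    return importExportForm;
  }

  // A real service would start the asynchronous job here before replying.
  const verb = copytype === 'export' ? 'Export' : 'Import';
  return {
    title: 'Job submitted!',
    description: `${verb} job submitted for asset.`,
  };
}
```

In an Express app, a route handler would pass `req.body` to a function like this and send the result with `res.json()`.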

Securing the Custom Action

When you create a custom action in the Frame.io Developer Tools, a signing key is generated for it. The custom action code uses this key to verify that the request originates from Frame.io.

When Frame.io sends a POST request, it includes the following HTTP headers:

X-Frameio-Request-Timestamp: The time the custom action was triggered, in Epoch time (seconds since midnight UTC, Jan 1, 1970).
X-Frameio-Signature: The request signature.

The timestamp can be used to prevent replay attacks; Frame.io recommends that custom actions verify that this time is within five minutes of local time. The signature is an HMAC SHA-256 hash secured with the custom action’s signing key—a secret shared exclusively between Frame.io and the custom action. If the custom action is able to correctly verify the HMAC, then we know that the request came from Frame.io (message authentication) and it has not been changed in transit (message integrity).

The process for verifying the signature is:

    • Combine the signature version (currently “v0”), timestamp, and request body, separated by colons, into a string to be signed.
    • Compute the HMAC SHA256 signature using the signing key.
    • If the computed signature and signature header are not identical, then reject the request.

The custom action’s verifyTimestampAndSignature() function implements the above logic, throwing an error if the timestamp is missing or outside the accepted range, or if the signature is invalid. In all cases, 403 Forbidden is returned to the caller.
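As a rough illustration of the verification steps listed above, here is a sketch using Node.js’s built-in crypto module. It is not the repository’s implementation: the function name `verifyRequest` and its parameter names are hypothetical, and it assumes the signature header carries the hex digest prefixed with `v0=`. Check the Frame.io documentation for the exact header format before relying on it.

```javascript
const crypto = require('crypto');

// Frame.io recommends rejecting requests more than five minutes from local time.
const MAX_SKEW_SECONDS = 5 * 60;

// Hypothetical helper: throws if the request should be rejected.
// `rawBody` must be the exact request body string as received.
function verifyRequest({ timestamp, signature, rawBody, signingKey,
                         nowSeconds = Math.floor(Date.now() / 1000) }) {
  const ts = Number(timestamp);
  if (!timestamp || !Number.isFinite(ts)) {
    throw new Error('Missing or malformed timestamp');
  }
  if (Math.abs(nowSeconds - ts) > MAX_SKEW_SECONDS) {
    throw new Error('Timestamp outside accepted range');
  }

  // String to sign: version, timestamp, and body, separated by colons.
  const message = `v0:${timestamp}:${rawBody}`;
  // Assumption: the header value has the form "v0=<hex digest>".
  const expected = 'v0=' +
    crypto.createHmac('sha256', signingKey).update(message).digest('hex');

  // Compare in constant time; timingSafeEqual requires equal-length buffers.
  const expectedBuf = Buffer.from(expected);
  const actualBuf = Buffer.from(String(signature || ''));
  if (expectedBuf.length !== actualBuf.length ||
      !crypto.timingSafeEqual(expectedBuf, actualBuf)) {
    throw new Error('Invalid signature');
  }
}
```

A caller would catch any error thrown here and respond with 403 Forbidden. Note that the HMAC has to be computed over the raw request body, so the web service needs to keep the unparsed body around rather than re-serializing the parsed JSON.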

Custom Action Deployment Options

The root directory of the backblaze-frameio GitHub repository contains three directories, comprising two different deployment options and a directory containing common code:

  • node-docker: generic Node.js deployment
  • node-risingcloud: Rising Cloud deployment
  • backblaze-frameio-common: common code

The node-docker directory contains a generic Node.js implementation suitable for deployment on any Internet-addressable machine–for example, an Optimized Cloud Compute VM on Vultr. The app comprises an Express web service that handles requests from Frame.io, providing form responses to gather information from the user, and a worker task that the web service executes as a separate process to actually copy files between Frame.io and Backblaze B2.

You might be wondering why the web service doesn’t just do the work itself, rather than spinning up a separate process to do so. Well, media projects can contain dozens or even hundreds of files, containing a terabyte or more of data. If the web service were to perform the import or export, it would tie up resources and ultimately be unable to respond to Frame.io. Spinning up a dedicated worker process frees the web service to respond to new requests while the work is being done.
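To make the idea concrete, here is one way a web service could hand work to a separate process using Node.js’s child_process module. This is only a sketch of the general pattern, not the repository’s code: the `startWorker` function, the `worker.js` script name in the usage note, and the choice to pass the job as a JSON command-line argument are all assumptions for illustration.

```javascript
const { spawn } = require('child_process');

// Hypothetical helper: launches a Node.js worker in its own process and returns
// immediately, so the web service can keep answering requests.
// `nodeArgs` is the argument list for the node binary (typically a script path);
// `job` is a plain object describing the work, passed as a JSON argument.
function startWorker(nodeArgs, job) {
  const child = spawn(process.execPath, [...nodeArgs, JSON.stringify(job)], {
    detached: true,   // let the worker outlive the request that started it
    stdio: 'ignore',  // do not tie the worker's output to the web service
  });

  // Report launch failures instead of crashing the web service.
  child.on('error', (err) => {
    console.error('Failed to start worker:', err.message);
  });

  // Do not keep the parent's event loop waiting on the child.
  child.unref();
  return child;
}
```

A request handler might call `startWorker(['worker.js'], { copytype: 'export', assetId: '...' })` and then respond to Frame.io straight away, leaving the copy itself to run in the background.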

The downside of this approach is that you have to deploy the custom action on a machine capable of handling the peak expected load. The node-risingcloud implementation works identically to the generic Node.js app, but takes advantage of Rising Cloud’s serverless platform to scale elastically. A web service handles the form responses, then starts a task to perform the work. The difference here is that the task isn’t a process on the same machine, but a separate job running in Rising Cloud’s infrastructure. Jobs can be queued and new task instances can be started dynamically in response to rising workloads.

Note that since both Vultr and Rising Cloud are Backblaze Compute Partners, apps deployed on those platforms enjoy zero-cost downloads from Backblaze B2.

Customizing the Custom Action

We published the source code for the custom action to GitHub under the permissive MIT license. You are free to “use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software” as long as you include the copyright notice and MIT permission notice when you do so.

At present, the user must supply the name of a file when importing an asset from Backblaze B2, but it would be straightforward to add code to browse the bucket and allow the user to navigate the file tree. Similarly, it would be straightforward to extend the custom action to allow the user to import a whole tree of files based on a prefix such as raw_footage/2022-09-07. Feel free to adapt the custom action to your needs; we welcome pull requests for fixes and new features!

The post Lights, Camera, Custom Action (Part Two): Inside Integrating Frame.io + Backblaze B2 appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

The SSD Edition: 2022 Drive Stats Mid-year Review

Post Syndicated from original https://www.backblaze.com/blog/ssd-drive-stats-mid-2022-review/

Welcome to the midyear SSD edition of the Backblaze Drive Stats report. This report builds on the 2021 SSD report published previously and is based on data from the SSDs we use as storage server boot drives in our Backblaze Cloud Storage platform. We will review the quarterly and lifetime failure rates for these drives and, later in this report, we will also compare the performance of these SSDs to hard drives we also use as boot drives. Along the way, we’ll offer observations and insights to the data presented and, as always, we look forward to your questions and comments.

Overview

Boot drives in our environment do much more than boot the storage servers: they also store log files and temporary files produced by the storage server. Each day a boot drive will read, write, and delete files depending on the activity of the storage server itself. In our early storage servers, we used HDDs exclusively for boot drives. We began using SSDs in this capacity in Q4 2018. Since that time, all new storage servers, and any with failed HDD boot drives, have had SSDs installed.

Midyear SSD Results by Quarter

As of June 30, 2022, there were 2,558 SSDs in our storage servers. This compares to 2,200 SSDs we reported in our 2021 SSD report. We’ll start by presenting and discussing the quarterly data from each of the last two quarters (Q1 2022 and Q2 2022).

Notes and Observations

Form factors: All of the drives listed above are the standard 2.5” form factor, except the Dell (DELLBOSS VD) and Micron (MTFDDAV240TCB) models, each of which is the M.2 form factor.

Most drives added: Since our last SSD report, ending in Q4 2021, the Crucial (model: CT250MX500SSD1) led the way with 192 new drives added, followed by 101 new Dell drives (model: DELLBOSS VD) and 42 WDC drives (model: WDS250G2B0A).

New drive models: In Q2 2022 we added two new SSD models, both from Seagate: the 500GB model ZA500CM10003 (3 drives) and the 250GB model ZA250NM1000 (18 drives). Neither has enough drives or drive days to reach any conclusions, although they each had zero failures, so nice start.

Crucial is not critical: In our previous SSD report, a few readers took exception to the high failure rate we reported for the Crucial SSD (model: CT250MX500SSD1) although we observed that it was with a very limited amount of data. Now that our Crucial drives have settled in, we’ve had no failures in either Q1 or Q2. Please call off the dogs.

One strike and you’re out: Three drives had only one failure in a given quarter, but the AFR they posted was noticeable: WDC model WDS250G2B0A (10.93%), Micron model MTFDDAV240TCB (4.52%), and Seagate model SSD (3.81%). Of course, if any of these models had one fewer failure, their AFR would be zero, zip, bupkus, nada. You get it.
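To see why a single failure can swing the numbers so much, it helps to look at how AFR is computed: failures divided by drive years (drive days / 365), expressed as a percentage. The sketch below uses that formula; the function name is hypothetical and the drive-day figures in the usage note are made up for illustration, not taken from the tables in this report.

```javascript
// Annualized failure rate, as a percentage:
//   AFR = failures / (driveDays / 365) * 100
// One drive day is one drive in service for one day.
function annualizedFailureRate(failures, driveDays) {
  if (!(driveDays > 0)) {
    throw new RangeError('driveDays must be a positive number');
  }
  return (failures / (driveDays / 365)) * 100;
}
```

For a hypothetical model that accumulated 3,650 drive days in a quarter, `annualizedFailureRate(1, 3650)` works out to about 10%, while `annualizedFailureRate(0, 3650)` is zero. The same single failure spread over 36,500 drive days would give about 1%, which is why small cohorts produce such jumpy quarterly figures.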

It’s all good man: For any given drive model in this cohort of SSDs, we like to see at least 100 drives and 10,000 drive days in a given quarter as a minimum before we begin to consider the calculated AFR to be “reasonable.” That said, quarterly data can be volatile, so let’s next take a look at the data for each of these drives over their lifetime.

SSD Lifetime Annualized Failure Rates

As of the end of Q2 2022 there were 2,558 SSDs in our storage servers. The table below is based on the lifetime data for the drive models which were active as of the end of Q2 2022.

Notes and Observations

Lifetime annualized failure rate (AFR): The lifetime data is cumulative over the period noted, in this case from Q4 2018 through Q2 2022. As SSDs age, lifetime failure rates can be used to see trends over time. We’ll see how this works in the next section when we compare SSD and HDD lifetime annualized failure rates over time.

Falling failure rate?: The lifetime AFR for all of the SSDs for Q2 2022 was 0.92%. That was down from 1.04% at the end of 2021, but exactly the same as the Q2 2021 AFR of 0.92%.

Confidence Intervals: In general, the more data you have, and the more consistent that data is, the more confident you are in your predictions based on that data. For SSDs we like to see a confidence interval of 1.0% or less between the low and the high values before we are comfortable with the calculated AFR. This doesn’t mean that drive models with a confidence interval greater than 1.0% are wrong, it just means we’d like to get more data to be sure.

Speaking of Confidence Intervals: You’ll notice from the table above that the three drives with the highest lifetime annualized failure rates also have sizable confidence intervals.


Conversely, there are three drives with a confidence interval of 1% or less, as shown below:


Of these three, the Dell drive seems the best. It is a server-class drive in an M.2 form factor, but it might be out of the price range for many of us as it currently sells from Dell for $468.65. The two remaining drives are decidedly consumer focused and have the traditional SSD form factor. The Seagate model ZA250CM10003 is no longer available new, only refurbished, and the Seagate model ZA250CM10002 is currently available on Amazon for $45.00.

SSD Versus HDD Annualized Failure Rates

Last year we compared SSD and HDD failure rates when we asked: Are SSDs really more reliable than Hard Drives? At that time the answer was maybe. We now have a year’s worth of data available to help answer that question, but first, a little background to catch everyone up.

The SSDs and HDDs we are reporting on are all boot drives. They perform the same functions: booting the storage servers, recording log files, acting as temporary storage for SMART stats, and so on. As noted earlier, we used HDDs until late 2018, then switched to SSDs. This creates a situation where the two cohorts are at different places in their respective life expectancy curves.

To fairly compare the SSDs and HDDs, we controlled for the average age of the two cohorts, so that SSDs that were on average one year old were compared to HDDs that were on average one year old, and so on. The chart below shows the results through Q2 2021.


Through Q2 2021 (Year 4 in the chart for SSDs) the SSDs followed the failure rate of the HDDs over time, albeit with a slightly lower AFR. But it was not clear whether the failure rate of the SSD cohort would continue to follow that of the HDDs, flatten out, or fall somewhere in between.

Now that we have another year of data, the answer appears to be obvious as seen in the chart below, which is based on data through Q2 2022 and gives us the SSD data for Year 5.

And the Winner Is…

At this point we can reasonably claim that SSDs are more reliable than HDDs, at least when used as boot drives in our environment. This supports the anecdotal stories and educated guesses made by our readers over the past year or so. Well done.

We’ll continue to collect and present the SSD data on a regular basis to confirm these findings and see what’s next. It is highly likely that the failure rate of SSDs will eventually start to rise. It is also possible that at some point the SSDs could hit the wall, perhaps when they start to reach their media wearout limits. To that point, over the coming months we’ll take a look at the SMART stats for our SSDs and see how they relate to drive failure. We also have some anecdotal information of our own that we’ll try to confirm on how far past the media wearout limits you can push an SSD. Stay tuned.

The SSD Stats Data

The data collected and analyzed for this review is available on our Hard Drive Test Data page. You’ll find SSD and HDD data in the same files and you’ll have to use the model number to locate the drives you want, as there is no field to designate a drive as SSD or HDD. You can download and use this data for free for your own purpose. All we ask are three things: 1) you cite Backblaze as the source if you use the data, 2) you accept that you are solely responsible for how you use the data, and 3) you do not sell this data to anyone—it is free.
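Since the files mix SSDs and HDDs, a first step for anyone working with the data is to filter rows by model. The sketch below shows one way to do that for the text of a single daily CSV file, counting drive days and failures per model. It assumes the file has `model` and `failure` columns with one row per drive per day, and that fields contain no quoted commas; the function name and the sample rows in the usage note are hypothetical.

```javascript
// Hypothetical helper: tally drive days and failures for the given models
// from the text of one Drive Stats CSV file.
function summarizeModels(csvText, models) {
  const wanted = new Set(models);
  const [headerLine, ...rows] = csvText.trim().split(/\r?\n/);

  // Locate columns by name rather than position.
  const header = headerLine.split(',');
  const modelIdx = header.indexOf('model');
  const failureIdx = header.indexOf('failure');
  if (modelIdx < 0 || failureIdx < 0) {
    throw new Error('Expected "model" and "failure" columns in the header');
  }

  const stats = {};
  for (const row of rows) {
    // Simple split: assumes no quoted fields containing commas.
    const cols = row.split(',');
    const model = cols[modelIdx];
    if (!wanted.has(model)) continue;

    if (!stats[model]) stats[model] = { driveDays: 0, failures: 0 };
    stats[model].driveDays += 1;                        // one row = one drive day
    stats[model].failures += Number(cols[failureIdx]);  // 1 on the day a drive fails
  }
  return stats;
}
```

Given a file whose rows include two entries for `CT250MX500SSD1` (one marked as failed) and one entry for some HDD model, `summarizeModels(text, ['CT250MX500SSD1'])` would return `{ CT250MX500SSD1: { driveDays: 2, failures: 1 } }`. Model strings in the files may not match the short names used in this report exactly, so it is worth listing the distinct values first.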

You can also download the Backblaze Drive Stats data via SNIA IOTTA Trace Repository if desired. Same data; you’ll just need to comply with the license terms listed. Thanks to Geoff Kuenning and Manjari Senthilkumar for volunteering their time and brainpower to make this happen. Awesome work.

Good luck and let us know if you find anything interesting.

The post The SSD Edition: 2022 Drive Stats Mid-year Review appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.