What’s the Diff: Hot and Cold Data Storage

Post Syndicated from Peter Cohen original https://www.backblaze.com/blog/whats-the-diff-hot-and-cold-data-storage/

[Image: Cold storage, data delay of seconds to hours, vs. hot storage, no data delay]

It’s been common to use temperature terminology, specifically a range from cold to hot, to describe the levels of tiered service available to data storage customers. The levels have been differentiated according to how crucial to current business the stored data is and how frequently it will be accessed. These terms likely originated according to where the data was historically stored: hot data was close to the heat of the spinning drives and the CPUs, and cold data was on tape or a drive far away from the data center floor.

There are no standard industry definitions of what hot and cold mean when applied to data storage, so you’ll find them used in different ways, which makes comparing services challenging. Generally, though, hot data requires the fastest and most expensive storage because it’s accessed more frequently, while cold (or cooler) data that is accessed less frequently can be stored on slower and, consequently, less expensive media.

The terms are still used by the major storage vendors to describe their tiered storage plans. Below, we’ll get into why these terms have become less useful for anticipating both storage cost and performance thanks to the advent of less expensive and more efficient storage offerings, such as hot cloud storage, that effectively offer hot storage performance at cold storage prices.

Defining Hot Storage

Hot storage is storage for data that needs to be accessed right away. If the stored information is business-critical and you can’t wait for it when you need it, that’s a candidate for hot storage.

To obtain the fast data access required for hot data storage, the data is commonly stored in hybrid or tiered storage environments. The hotter the service, the more likely it is to use the latest drives and the fastest transport protocols, and to be located near the client or in multiple regions as needed.

Cloud data storage providers charge a premium for hot data storage because it’s resource-intensive. Microsoft’s Azure Hot Blobs and Amazon’s AWS services don’t come cheap.

Data stored in the hottest tier might use solid-state drives, which are optimized for lower latency and higher transactional rates compared to traditional hard drives. In other cases, hard disk drives are more suitable, particularly in environments where the drive is heavily accessed, because they are more durable under intensive read/write cycles.

No matter the storage media used, the workloads in hot data storage require fast and consistent response times. Some examples of the uses for this type of storage would be interactive video editing, web content, online transactions and the like. Hot storage services also are tailored for workloads with many small transactions, such as capturing telemetry data, messaging, and data transformation.

Defining Cold Storage

On the other end of the thermometer, cold (or cooler) data is data that is accessed less frequently and doesn’t require the fast access of warmer data. That includes data that is no longer in active use and might not be needed for months, years, decades, or maybe ever. Practical examples of data suitable for cold storage include old projects, records that must be maintained for financial, legal, HR, or other business record keeping requirements, or anything else that’s of value but not needed anytime soon.

Cold data is usually stored in lower performing and less expensive storage environments, in-house or in the cloud. Tape has been a popular storage medium for cold data. LTO, Linear Tape-Open, was originally developed in the late 1990s as a low-cost storage option. To retrieve data from LTO, the tapes must be physically retrieved from storage racks and mounted in a tape reading machine, making it one of the slowest, and therefore coldest, methods of storing data.

Data retrieval and response times for cold cloud storage systems are typically much longer than those of services designed for active data manipulation. Practical examples of cold cloud storage include services like Amazon Glacier and Google Coldline.

Storage prices for cold cloud storage systems are typically lower than for warm or hot storage, but cold storage often incurs higher per-operation costs than other kinds of cloud storage. Access to the data typically requires patience and planning.
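
To make that trade-off concrete, here is a minimal sketch in Python that compares the monthly bill for two tiers and finds the retrieval level at which the cold tier stops being cheaper. The Tier class, the function names, and every price are hypothetical placeholders chosen for illustration, not any provider's actual rates, and the model assumes retrieval is billed per GB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """A storage tier with hypothetical, illustrative prices in USD."""
    name: str
    storage_per_gb_month: float  # price to keep 1 GB stored for a month
    retrieval_per_gb: float      # price to read 1 GB back out


def monthly_cost(tier: Tier, stored_gb: float, retrieved_gb: float) -> float:
    """Total monthly bill: storage charge plus retrieval charge."""
    return (tier.storage_per_gb_month * stored_gb
            + tier.retrieval_per_gb * retrieved_gb)


def break_even_retrieval_fraction(cold: Tier, hot: Tier) -> float:
    """Fraction of the stored data you can retrieve each month before
    the cold tier costs as much as the hot tier."""
    storage_saving = hot.storage_per_gb_month - cold.storage_per_gb_month
    retrieval_penalty = cold.retrieval_per_gb - hot.retrieval_per_gb
    if retrieval_penalty <= 0:
        raise ValueError("cold tier has no retrieval penalty; no break-even point")
    return storage_saving / retrieval_penalty


# Made-up prices, only to show the shape of the calculation.
HOT = Tier("hot", storage_per_gb_month=0.020, retrieval_per_gb=0.00)
COLD = Tier("cold", storage_per_gb_month=0.004, retrieval_per_gb=0.03)

if __name__ == "__main__":
    for retrieved_gb in (100, 1000):
        print(retrieved_gb,
              round(monthly_cost(HOT, 1000, retrieved_gb), 2),
              round(monthly_cost(COLD, 1000, retrieved_gb), 2))
    print(round(break_even_retrieval_fraction(COLD, HOT), 3))
```

With these placeholder numbers, the cold tier wins when you read back only a small share of your data each month and loses once retrievals climb past roughly half of what is stored. Plug in the published rates of the services you are comparing to see where your own break-even point falls.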

Today, cold storage can also describe purely offline storage, that is, data that’s not stored in the cloud at all. So when you hear about cold storage, it is sometimes the older definition: data that is archived on some sort of durable medium and stored in a secure offsite facility without a connection to a network. This could be data that needs to be quarantined from the internet altogether (also called air-gapped), for example, cryptocurrencies such as Bitcoin. (See our post, Securing Your Cryptocurrency, for more information on this topic.)

Traditional Views of Cold and Hot Data Storage
|                  | Cold cloud storage                     | Hot cloud storage                   |
| Access Speed     | Slow                                   | Fast                                |
| Access Frequency | Seldom or Never                        | Frequent                            |
| Data Volume      | Low                                    | High                                |
| Storage Media    | Slower drives, SAN, tape, LTO, offline | Faster drives, durable drives, SSDs |
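
As a rough illustration of the traditional view, the sketch below sorts a dataset into hot or cold based on how often it is accessed and how long you can afford to wait for it. Since there are no standard industry definitions, the function name and both thresholds are assumptions made up for this example; pick cutoffs that match your own workloads.

```python
# Hypothetical cutoffs; there is no industry-standard definition.
HOT_MIN_ACCESSES_PER_MONTH = 1.0        # accessed at least this often -> hot
COLD_MIN_TOLERATED_WAIT_SECONDS = 60.0  # must tolerate at least this wait -> may be cold


def traditional_tier(accesses_per_month: float, max_wait_seconds: float) -> str:
    """Classify data as "hot" or "cold" under the traditional view.

    Data is hot if it is accessed frequently OR if you can't wait for it
    when you need it; everything else is a candidate for cold storage.
    """
    accessed_often = accesses_per_month >= HOT_MIN_ACCESSES_PER_MONTH
    cannot_wait = max_wait_seconds < COLD_MIN_TOLERATED_WAIT_SECONDS
    return "hot" if accessed_often or cannot_wait else "cold"


if __name__ == "__main__":
    print(traditional_tier(accesses_per_month=30, max_wait_seconds=1))     # active project files
    print(traditional_tier(accesses_per_month=0, max_wait_seconds=3600))   # old records
    print(traditional_tier(accesses_per_month=0, max_wait_seconds=0.5))    # rarely read, but urgent
```

Note that rarely accessed data still lands in the hot tier if it is business-critical and can't wait, which matches the definition of hot storage given earlier.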

What is Hot Cloud Storage?

With the advent of storage services that combine high speed, availability, and low cost, differentiating between cold and hot storage has become more difficult. While structuring cloud data storage by temperature has been commonly used by the big, established cloud storage providers to describe their tiered storage services and set pricing accordingly, today there are other choices, including hot cloud storage, that cross the old boundaries to provide storage that is at the same time fast, available, and inexpensive.

The big providers of cloud storage — Amazon, Microsoft, Google — have been challenged by new players in data storage, who, through innovation and efficiency, are able to offer cloud storage at the cost of cold storage, but with the performance and availability of hot storage.

Services like our own B2 Cloud Storage fall into this category. They can compete on price with LTO and other traditionally cold storage services, but can be used for applications that are usually reserved for hot storage, such as media management, workflow collaboration, websites, and data retrieval.

The new model is so effective and efficient that customers have found it economical to migrate away from slow and inconvenient cold storage and archival systems altogether and move to cloud storage. This trend is continuing, so it will be interesting to see what happens to the traditional temperature terms as the boundaries between hot and cold blur due to new efficiencies, technologies, and services.

What Temperature Is Your Cloud Storage?

Organizations vary in their needs, so they’ll take different approaches to the question of where to store their data. It’s imperative to an organization’s bottom line that it doesn’t pay for more than it needs.

Have a different idea of what hot and cold storage are? Have questions that aren’t answered here? Join the discussion in the comments.

•  •  •

If you’d like to experience the latest in hot cloud storage at cold storage prices, you can give B2 a try. Get started today and you’ll get the first 10GB free!

Note: This post was updated from March 7, 2017. — Editor

The post What’s the Diff: Hot and Cold Data Storage appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.