What Is Cold Data Storage? | Storing Cold Data in the Cloud | ESF

Efficient storage management includes migrating aging data through progressively less expensive storage tiers. When data ends its migration at the cold storage stage, you can keep it for long periods of time at low cost.
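As a rough illustration of age-based migration, the sketch below picks a tier from the number of days since data was last accessed. The tier names follow the hot, warm, cool, and cold classes discussed in this article; the day thresholds and the function name `pick_tier` are assumptions made for illustration, not values taken from any cloud provider.

```python
from datetime import date

# Hypothetical thresholds in days since last access; real policies vary by workload.
TIER_THRESHOLDS = [(30, "hot"), (90, "warm"), (365, "cool")]

def pick_tier(last_access: date, today: date) -> str:
    """Return the storage tier for data last accessed on `last_access`."""
    age_days = (today - last_access).days
    for limit, tier in TIER_THRESHOLDS:
        if age_days < limit:
            return tier
    # Anything older than the last threshold ends its migration in cold storage.
    return "cold"

print(pick_tier(date(2020, 1, 1), date(2024, 1, 1)))
```

In practice a policy like this would usually be expressed as a lifecycle rule in the storage service or backup product rather than in application code.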

Cloud-based data storage generally falls into four storage classes or tiers: hot, warm, cool, and cold.

Cold Storage Use Cases

The biggest reason for using cold storage is saving money by reducing use of hot, warm, and cool storage tiers. Cold storage provides efficient, highly scalable capacity at a lower cost than any other storage tier.

For example, the healthcare industry produces massive amounts of medical images with retention requirements measured in decades. The financial industry also has steep retention requirements, in some cases up to 30 years. Many financial institutions have stored this data in tape vaults for many years, but restoring massive data sets from tape is expensive. Cold storage in the cloud retains data for long periods, and restoring the data does not require the original tape drives.

Litigation and regulatory investigations are also cold storage use cases. For example, a retail chain might store massive amounts of backup data in the cloud. One day the company receives a lawsuit from a customer who slipped and fell in a store seven months ago. The business will need to search through its backups for relevant data, collect it, analyze it, and provide it to the reviewers within a few weeks. This is far simpler to do on cold storage in the cloud than from massive tape collections.

A third scenario is preserving raw data for analytics and secondary applications. Massive data sets are very expensive to keep on hot or warm storage systems. Cold storage tiers keep the raw data available for occasional access at a very low cost.

In healthcare, finance, and law, cold data storage can help companies comply with data retention regulations. Health record retention varies among states, but the law typically requires keeping records for at least a few years after a patient’s discharge or death. If a patient has been released permanently from the hospital but their healthcare records still must be retained for years, cold data storage, which is designed for infrequently accessed data, is the least expensive option.

For healthcare (or financial or legal) data that probably won’t be accessed more than once or twice a year, deep archive storage (such as Amazon Glacier Deep Archive) is the least expensive option. Glacier Deep Archive costs $0.00099 per GB per month.
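To put that rate in perspective, the following sketch multiplies out the storage charge for a long retention period. It uses the $0.00099 per GB per month figure quoted above; the 100 TB data set, the 10-year retention period, and the function name are illustrative assumptions. Retrieval, request, and early-deletion fees are deliberately left out.

```python
def storage_cost(gb: float, price_per_gb_month: float, months: int) -> float:
    """Storage-only cost in dollars; excludes retrieval and request fees."""
    return gb * price_per_gb_month * months

# Assumed example: 100 TB (100,000 GB in decimal units) retained for 10 years.
total = storage_cost(100_000, 0.00099, 120)
print(f"${total:,.2f}")
```

Under these assumptions, 100 TB held for 10 years comes to about $11,880 in storage charges.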

Cold Storage and the Public Cloud

For many companies, cold storage in the cloud offers distinct advantages over on-premises nearline storage or tape vaulting. The public clouds are ramping up their cold storage in response. Amazon Glacier and the new Google Cloud Storage Coldline are dedicated to long-term cold storage. Azure uses its Cool Blob Storage to serve both cool and cold tiers.

The three services have a lot in common. Storage pricing is very similar. Amazon and Google both charge $0.004 per gigabyte stored per month. Azure charges $0.01 per gigabyte for its cool blob storage for objects. Data access and recovery are more expensive than simple storage, which protects the public clouds against customers using cold storage as a cheap active data tier.
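The effect of access charges can be seen in a simple cost model. The sketch below adds a per-gigabyte retrieval fee to the $0.004 per gigabyte storage price mentioned above. The retrieval price of $0.02 per gigabyte is a placeholder assumption, not a quoted rate from any provider, and the function name is hypothetical.

```python
def monthly_cost(stored_gb: float, retrieved_gb: float,
                 storage_price: float = 0.004,
                 retrieval_price: float = 0.02) -> float:
    """Monthly bill in dollars: storage plus retrieval.

    retrieval_price is an assumed placeholder, not a published rate.
    """
    return stored_gb * storage_price + retrieved_gb * retrieval_price

archive_only = monthly_cost(stored_gb=10_000, retrieved_gb=0)
read_everything_once = monthly_cost(stored_gb=10_000, retrieved_gb=10_000)
print(f"${archive_only:,.2f} vs ${read_everything_once:,.2f}")
```

With these assumed prices, reading the full 10 TB back once in a month raises the bill from about $40 to about $240, which is why cold tiers suit data that is rarely touched.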

Durability is critical for all three services. Both Glacier and Coldline rate their durability at 11 nines (99.999999999 percent). Both services achieve this durability level by redundantly storing data across multiple domains, storage systems, and disks. Azure logs 11 nines for locally redundant storage and 12 nines for zone-redundant storage.
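A durability figure is easier to interpret as an expected loss rate. The sketch below assumes the stated nines describe the annual probability that a single object survives and that object losses are independent; both are simplifying assumptions, and the function name is hypothetical.

```python
def expected_annual_losses(object_count: int, nines: int) -> float:
    """Expected number of objects lost per year at a durability of `nines` nines."""
    annual_loss_probability = 10.0 ** -nines  # 11 nines -> 1e-11 per object
    return object_count * annual_loss_probability

losses = expected_annual_losses(10_000_000, 11)
years_per_loss = 1 / losses
print(f"{losses:.6f} objects per year, or one loss every {years_per_loss:,.0f} years")
```

Under these assumptions, 10 million objects stored at 11 nines would lose, on average, one object roughly every 10,000 years.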

Recovery service levels differ somewhat among the three. For example, Amazon Glacier offers different service levels for restore times that range from minutes to hours, while Google Coldline and Azure Cool Blob Storage offer fast recovery in milliseconds. Not everyone needs to recover cold data in such a short amount of time, but if you do, then the much shorter access time could prove very handy.

Data transfer times matter for uploading data as well as retrieving it. Whether you back up first to the cloud or keep backup copies on-site and then copy them to the cloud, you need cloud transfers to stay within backup windows. The most efficient way to do this is to choose a backup product that backs up incremental changes and rehydrates them into a full restore. Also, look for backup providers who can accelerate cloud transfers between the on-premises data center and cold storage tiers.
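A quick estimate shows why incremental backups help with backup windows. The sketch below computes transfer time from data size and link speed. The 20 TB data set, 2 percent daily change rate, 1 Gbps link, 80 percent link efficiency, and 8-hour window are all assumptions chosen for illustration, and the function names are hypothetical.

```python
def transfer_hours(size_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate hours to move size_gb over a link rated at link_mbps megabits/second."""
    megabits = size_gb * 8 * 1000  # decimal units: 1 GB = 8,000 megabits
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600

def fits_window(size_gb: float, link_mbps: float, window_hours: float,
                efficiency: float = 0.8) -> bool:
    """True if the transfer is estimated to finish within the backup window."""
    return transfer_hours(size_gb, link_mbps, efficiency) <= window_hours

full_gb = 20_000                 # assumed full data set: 20 TB
incremental_gb = full_gb * 0.02  # assumed 2% daily change: 400 GB

print(fits_window(full_gb, 1000, 8))         # full backup over 1 Gbps
print(fits_window(incremental_gb, 1000, 8))  # incremental backup over 1 Gbps
```

With these numbers, the full backup needs more than two days of transfer time, while the incremental finishes in a little over an hour.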

Cold Cloud Backup Vendors

All backup products back data up to the cloud as a target, but not all of them optimize backup to cold storage tiers. Typical features for this level of integration include policy-based backup and archiving to the cold storage tier, indexing the tier for faster search and recovery, and offering flexible site choices when recovering data.

The Benefits of Cold Cloud Storage

The need for stored data continues to increase, and businesses must retain much of it for compliance, analytics, and research purposes. Keeping all this data on costly storage tiers is extremely expensive, both in capital and operating costs.

In the past, tape was the solution to cold storage requirements. But massive data volumes and the need to quickly access data for recovery or analytics have outstripped tape’s effectiveness.

This is why the cloud is deservedly popular for storing cold data, and why public cloud vendors have stepped up with cold storage services. IT must still perform due diligence to investigate which cold storage tiers are optimal for its needs and which backup vendors optimize cloud-based cold storage tiers. Although this research will take some time and energy, the cost and durability benefits of cloud-based cold storage are more than worth it.