Amazon Glacier Bids To Solve The 100 Year Archive Problem

This week, Amazon announced a new storage offering within Amazon Web Services (AWS) called Glacier. Aptly named, it's intended to be vast and slow moving, with a cheap price tag to match. At a fraction of the cost of the online storage offerings EBS and S3, Glacier will cost you $0.01 per GB per month, compared to roughly $0.05 to $0.13 per GB per month for the higher performance offerings, depending on capacity stored. Restores from Glacier are costly by design; this is intended for data that you're not likely to access frequently. Used for the right types of data, it's a low cost way to park stale data for long periods of time.

Analyzing the cost implications: it would cost you all of $120 to store a TB of data for a year, provided you don't have to access it during that time. Ten years would cost you $1,200, and 100 years would cost you $12,000. Sure, there would be upcharges if and when you access the data, but the value of being able to get the data back within a few hours, years after you archived it, is tremendous. The durability guarantee is 11 9's: Amazon designs the service for 99.999999999% average annual durability per archive, included in the base cost, which is about as close to certainty as you can get in any contract.
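The arithmetic above can be sketched in a few lines. This is a back-of-the-envelope calculation using the $0.01 per GB per month storage rate quoted here; retrieval and request charges are excluded, and the decimal-terabyte convention (1 TB = 1,000 GB) is an assumption matching how cloud providers price capacity.

```python
# Illustrative Glacier storage cost at the $0.01/GB-month rate quoted
# in the article. Retrieval charges are deliberately excluded.

RATE_PER_GB_MONTH = 0.01  # USD, storage only (assumed flat)
GB_PER_TB = 1000          # decimal terabyte, as cloud pricing uses

def storage_cost(tb, years, rate=RATE_PER_GB_MONTH):
    """Cost in USD to keep `tb` terabytes parked for `years` years."""
    return tb * GB_PER_TB * rate * 12 * years

for years in (1, 10, 100):
    print(f"1 TB for {years:>3} years: ${storage_cost(1, years):,.0f}")
# 1 TB for   1 years: $120
# 1 TB for  10 years: $1,200
# 1 TB for 100 years: $12,000
```

The point the numbers make: at this rate, a century of storage for a terabyte costs about what a single enterprise disk shelf once did, before you even count the migrations you'd avoid.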

The value here isn't in the pricing, attractive as it is; it's in the fact that a big, financially stable company is willing to step up and manage the process of keeping data essentially forever, which is no small feat. Sure, it's cheaper to buy a cheap SATA disk drive or a tape cartridge, but that doesn't include redundancy, and it leaves you to worry about migrating your data from platform to platform over the 10 or 20 or 100 years you hope to keep it. Amazon is promising to do that work for you, reliably, which is the real crux of this offering. The guarantee that the data will be there years from now, regardless of technology changes, disasters, people retiring, or processes changing: that's priceless.

Standard archive considerations apply here: is your data truly ready for slow access? How good are the indexing and search capabilities? Will you find the needle in the haystack later? A network of software partners that can help determine which data should move to this archive tier, and help locate data that's already been archived, would be valuable; Amazon has done a good job of building out a partner ecosystem, so that will likely come with time. All in all, this is a bold move by an industry leader whose offering gets richer over time, while legacy players like IBM, HP, Dell and others continue to dither when it comes to cloud offerings that meet real world needs at reasonable price points.