Seagate's Kinetic Will Impact Object Storage And Data Driven Applications

Seagate's recent unveiling of the Kinetic Open Storage platform is making hard drive based technology interesting again.  The Kinetic platform essentially turns hard drives into individual key value stores, and allows applications and hosts to directly access Kinetic drives over TCP/IP networks.  Processing power within the drives is used to run the key value store, and the Kinetic technology also facilitates policy based drive-to-drive data migration.  If this storage architecture is commercially successful it will be extraordinarily disruptive, since the direct connectivity from drives to applications will eliminate storage controllers, file systems, SANs and even RAID from the storage data path.  Developer kits for Kinetic are available today, though Seagate will not be making the drives generally available until 2014.  I'll be publishing a more in-depth report for Forrester clients on our site in the future, but for now there are a number of key points to be aware of as this technology ramps up:

Kinetic simplifies the storage stack.  Object storage and distributed applications that use key value stores are actually held back by the conventional file system, RAID and controller architectures found in enterprise storage systems.  In addition to storing keys and their associated data values, Kinetic manages drive-to-drive data migration to offload data mirroring and data layout chores from applications and to eliminate the need for dedicated storage controllers.  Applications using Kinetic will send and request data over TCP/IP networks using Seagate's open source object API, and will let the Kinetic drives do the heavy storage lifting (reading, writing, deleting and mirroring objects) in the background.
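To make the idea concrete, here is a minimal Python sketch of the put/get/delete pattern described above, with mirroring handled by the drive rather than by the application. It is an in-memory toy, not Seagate's API: the class KeyValueDrive and its method names are hypothetical, and a real drive would be reached over TCP/IP rather than through direct method calls.

```python
class KeyValueDrive:
    """Toy in-memory stand-in for a drive that exposes a key value store.

    Hypothetical illustration only; this is not the Kinetic API.
    """

    def __init__(self, name):
        self.name = name
        self._store = {}    # key -> value, both bytes
        self._mirrors = []  # peer drives that receive copies of every write

    def add_mirror(self, other):
        """Register a peer drive that should hold a copy of this drive's objects."""
        self._mirrors.append(other)

    def put(self, key, value, _replicate=True):
        """Store the object locally, then push it to each mirror."""
        self._store[key] = value
        if _replicate:
            for mirror in self._mirrors:
                # _replicate=False stops the mirror from forwarding it again
                mirror.put(key, value, _replicate=False)

    def get(self, key):
        """Return the stored value, or None if the key is unknown."""
        return self._store.get(key)

    def delete(self, key, _replicate=True):
        """Remove the object locally and on each mirror."""
        self._store.pop(key, None)
        if _replicate:
            for mirror in self._mirrors:
                mirror.delete(key, _replicate=False)


# The application talks to one drive; the copy on the second drive is made
# by the drive itself, not by the application.
primary = KeyValueDrive("drive-a")
replica = KeyValueDrive("drive-b")
primary.add_mirror(replica)
primary.put(b"object/1", b"payload")
print(replica.get(b"object/1"))
```

The point of the sketch is the division of labor: the application issues a single put, and the mirroring step happens behind that call, which is the job a storage controller or RAID layer would otherwise perform.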

Cloud giants are driving Kinetic forward.  Industry leaders such as Yahoo and Rackspace, along with a number of additional cloud players (who have not been publicly revealed), are currently evaluating the technology and could use it as the storage repository for large scale, data-centric applications and cloud storage services.  The potential cost savings that come from eliminating storage controllers or commodity servers running the provider's proprietary storage stack would appeal to players looking to reduce power and rack space consumption.  SwiftStack and Basho were the first two object storage players to announce their support for the Kinetic architecture, though we expect to see additional vendors in the archive and object storage space using Kinetic in the future.

Focusing on the enterprise infrastructure, there are a few areas to consider and monitor as the Kinetic technology matures and evolves:

Enterprise storage impact.  The key value stores in Kinetic hard drives are a good fit for object storage, distributed file system (Hadoop Distributed File System, Lustre, GlusterFS) and distributed database (Cassandra) use cases, but will not be a complete replacement for existing enterprise NAS and SAN infrastructures.  Structured data and transaction heavy workloads, which benefit from caching and other storage capabilities, are not a good match for Kinetic and will continue to utilize SAN and NAS storage.

Network impact.  Unlike conventional hard drives, which connect to systems using SAS, SATA or Fibre Channel interfaces, Kinetic drives will each have two Gigabit Ethernet ports to connect to TCP/IP networks.  In Q4 FY2013 alone, Seagate shipped over 8 million enterprise drives.  If Kinetic becomes a significant portion of Seagate's drive mix, there could eventually be millions of drives - each requiring IP addresses and network connectivity - entering enterprise and service provider environments.  We were a little surprised that there weren't more network switching and management partners announced with Kinetic's unveiling.
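A rough back-of-the-envelope calculation shows how quickly the port and address counts add up. The Python sketch below takes the two ports per drive from the description above; everything else is an assumption made for illustration: one IPv4 address per port, a hypothetical 60-drive chassis, a hypothetical one-million-drive fleet, and the function name network_footprint.

```python
PORTS_PER_DRIVE = 2  # from the article: two Gigabit Ethernet ports per drive


def network_footprint(drive_count, addresses_per_port=1):
    """Estimate switch ports, IP addresses and the smallest IPv4 prefix needed.

    addresses_per_port=1 is an assumption; the article only says that each
    drive will require IP addresses and network connectivity.
    """
    if drive_count < 1:
        raise ValueError("drive_count must be at least 1")
    switch_ports = drive_count * PORTS_PER_DRIVE
    ip_addresses = switch_ports * addresses_per_port
    # Add 2 for the network and broadcast addresses, then round up to the
    # next power of two to find how many host bits the subnet needs.
    host_bits = (ip_addresses + 2 - 1).bit_length()
    return {
        "switch_ports": switch_ports,
        "ip_addresses": ip_addresses,
        "ipv4_prefix": 32 - host_bits,
    }


chassis = network_footprint(60)       # hypothetical 60-drive chassis
fleet = network_footprint(1_000_000)  # hypothetical one-million-drive fleet
print(chassis)
print(fleet)
```

Under these assumptions a single chassis already consumes more ports than a typical top-of-rack switch offers, which is why the absence of switching partners stands out.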

Security impact.  Kinetic drives will be protected with data at rest encryption.  Seagate also claims Kinetic will be more secure than conventional storage systems since its interface library will include modules for authentication, authorization and transport layer security.  Full end-to-end integrity checking is also available and will be a key feature for warding off silent data corruption in petabyte and exabyte scale environments.
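End-to-end integrity checking generally means that a checksum computed by the writer travels with the data and is verified again by the reader. A minimal Python sketch of that idea follows, assuming SHA-256 as the checksum. The function names seal and verify are hypothetical, and the mechanism Kinetic actually uses is not described in the material above.

```python
import hashlib
from typing import Tuple


def seal(value: bytes) -> Tuple[bytes, str]:
    """Return the value together with a SHA-256 digest computed by the writer."""
    return value, hashlib.sha256(value).hexdigest()


def verify(value: bytes, digest: str) -> bytes:
    """Recompute the digest on the reader side and reject any mismatch."""
    if hashlib.sha256(value).hexdigest() != digest:
        raise ValueError("integrity check failed: data does not match its digest")
    return value


# Normal round trip: the reader gets back exactly what the writer sealed.
stored_value, stored_digest = seal(b"object payload")
assert verify(stored_value, stored_digest) == b"object payload"

# Simulate silent corruption by flipping one bit of the stored value.
corrupted = bytes([stored_value[0] ^ 0x01]) + stored_value[1:]
try:
    verify(corrupted, stored_digest)
    corruption_detected = False
except ValueError:
    corruption_detected = True
print(corruption_detected)
```

Because the digest is checked by the consumer of the data rather than only inside the drive, corruption introduced anywhere along the path (media, firmware, network) is caught at read time instead of going unnoticed.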

Maintenance impact.  Kinetic is moving data placement and data management (mirroring of objects) tasks to the hard drive level.  Within infrastructures, hard drives are components with relatively high failure rates.  Drive firmware upgrades and large scale replacements of drives (to upgrade capacity and retire old units) could become major maintenance challenges for enterprises.

Business technology resiliency impact.  Early adopters will have to think long and hard about how they should protect their data, factoring in how many copies of objects they should keep and where those objects are stored.  While replication within a data center should be relatively straightforward, cross data center replication and data distribution have not been fleshed out in the documentation I have seen so far.

Comments

Serious - this is way out in La la land

Huh? Did you read that and say the first paragraph is exciting, then - boom.. all downhill.

I can tell you that I understand every word that you type in this letter and in fact if you wanted me to tell you about consensus theory I could have the talk with you. The rest of the world does not understand this stuff and if you want to talk to a bigger world, you need to show dashboards and reporting. Install an EXE on the Linux node, hook up some network and go. What's all this talk about RAID and controller architectures? Do they tell you to write this stuff?

Also think about it. Two Gigabit Ethernet ports? Why would I want to bolt two Gigabit cables to a drive and hope that it works without crappy latency? I see this as a system for really long term storage or really slow map reduce processes. I just do not see a reason that we need to build abstraction at the drive level. I would put my money in another direction if I was Seagate.

20 cents. thanks :)

@garyjaybrooks

Who are the switch makers working with Seagate on Kinetic?

Well, as someone who has been following object storage vendors and their technology for several years, Seagate's Kinetic drive announcement is the first really "new" thing I've come across. Most of the hardware platforms for object storage are based on "industry standard" storage server technology, which Kinetic may sweep away in one stroke. I second Mr. Baltazar when he questions why we haven't heard much from switch makers, like Quanta and Pica for example, when it comes to Seagate's Kinetic drives. Presumably Kinetic drives will come in a JBOD chassis with a way to easily plug all of the Kinetic drive GbE interfaces into a switch located in the rack. Maybe switches running the Cumulus Network OS (Linux) will be the perfect match for Kinetic drives? The future of Web-scale object storage is looking more interesting all the time.

Thanks for your comments

Thanks for your comments Gary. As I noted in my blog, this technology is not a replacement for enterprise NAS or SAN architectures. Kinetic will impact public clouds and large scale, data driven applications. As you noted, it will also be a potential fit for long term storage.

Seagate is not the only vendor looking to circumvent the storage stack by adding a key value store to drives and media. A couple of vendors have suggested adding a key value store to flash devices in the future. Some are also thinking about using RDMA over Ethernet as a low latency interconnect between these object storage devices and applications.

I agree that there should be dashboards and reporting tools to keep track of the infrastructure. The management infrastructure to support Kinetic is not going to be built overnight.

Like Tim, I am still hoping that networking vendors will get involved since this will be a key factor which will determine the scalability, reliability and manageability of this architecture.