At SUSE, we have been looking at the next level of storage solutions that provide a suitable foundation for the Cloud and for Big Data. While centralized storage will continue to play a major role in these areas, we are seeing an emerging need for massively scalable, distributed, replicated and yet inexpensive storage.
For Cloud computing, where the number of nodes in a storage domain can easily reach four-digit figures, scalability is obviously a must – and with data being “the customer’s gold”, redundancy and replication are at least as important.
With Big Data, there is a growing desire to take the analysis job to the data, rather than shovelling data from the SAN to a compute node through a pipe that’s never wide enough. Having a storage solution that lets you run applications on the same node as your data would be a significant benefit.
One of the top contenders in this league is Ceph, and if you attended SUSECon, you have probably noticed that we had several presentations covering it.
For those not familiar with it, here it is in a nutshell: Ceph is a parallel distributed storage solution, originally created by Sage Weil and commercially backed by Inktank. It is designed to run on off-the-shelf server machines, using their local storage to form a redundant and highly scalable cluster of storage nodes. It provides a high degree of flexibility in defining your storage topology and the number of replicas to maintain, so that you can mirror your data across racks, server rooms and fire containment areas. Beyond that, it offers storage services through an S3-compatible API, making it easy to integrate into existing and emerging Cloud infrastructures.
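To make the idea of mirroring data across racks a little more concrete, here is a minimal Python sketch of topology-aware replica placement. It is not Ceph's actual placement algorithm; it is a toy illustration based on simple hashing, and the function names, rack names and node names are all made up for the example.

```python
import hashlib


def _score(obj_name, label):
    """Deterministic pseudo-random score for an (object, label) pair."""
    digest = hashlib.sha256(f"{obj_name}/{label}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place_replicas(obj_name, topology, replicas):
    """Pick `replicas` nodes for an object, each in a different rack.

    `topology` maps a rack name to the list of node names in that rack
    (a hypothetical layout, not a Ceph data structure).
    """
    if replicas > len(topology):
        raise ValueError("not enough racks for the requested replica count")

    # Rank racks by their score for this object and keep the best ones,
    # so that no two replicas share a rack (the failure domain here).
    racks = sorted(topology, key=lambda r: _score(obj_name, r), reverse=True)
    chosen = racks[:replicas]

    # Inside each chosen rack, pick the highest-scoring node.
    return [
        (rack, max(topology[rack], key=lambda n: _score(obj_name, f"{rack}/{n}")))
        for rack in chosen
    ]


if __name__ == "__main__":
    example_topology = {
        "rack-a": ["node1", "node2"],
        "rack-b": ["node3", "node4"],
        "rack-c": ["node5"],
    }
    for rack, node in place_replicas("my-object", example_topology, 2):
        print(rack, node)
```

Because the placement is computed from the object name and the topology alone, any client can work out where an object lives without asking a central lookup service, and losing a whole rack still leaves the remaining replicas intact.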
In SUSE Engineering, we have been watching (and contributing to) Ceph for some time now.
Among other activities, we teamed up with one of our major hardware partners to run scalability and performance tests on a large cluster of off-the-shelf servers. We focused mainly on RADOS, the distributed object store that forms Ceph's underlying layer. (There is also a distributed file system layered on top of it, which we tested as well, though without running scalability tests on it.)
The results were impressive: throughput was excellent, and it scaled well with the number of storage nodes. And even though Ceph is still a young project, we found the overall implementation to be robust. In fact, it dealt with failed storage devices so well that it took us a day or so to notice that a disk in one of the servers had died.
This convinced us that Ceph's design and implementation are sound. We believe it has huge potential in the area of Cloud Storage and Big Data, and we decided to include it as a Technology Preview in SUSE Cloud 1.0, released earlier this month.
As always, your feedback and opinion are very welcome!