Hybrid cloud, OpenStack and open source storage: three essential jigsaw pieces for the enterprise of the future.
Many, if not most, major enterprises are experiencing enormous increases in the demand for storage and computing power. Few, if any, have budgets that are growing fast enough to keep pace with those requirements.
This raises a difficult question for IT teams everywhere: how long is the usual approach of managing the install, upgrade, retire and replace cycle going to work? By now it should be obvious to all that the strategy that built the data center of the past isn’t going to deliver the data center of the future. New models and approaches are being embraced by the hyperscalers, based on open source software and commodity hardware. Cloud, we are told, has made IT a utility, as simple and as easy to manage as your gas bill. Yet, while we all know there are many advantages to paying by OpEx over CapEx, over time cloud can mean paying more, just in smaller instalments.
As the changes come through there is considerable risk for IT teams, who will need to be at pains to squeeze every penny from existing investments, make sound choices about new ones and wisely navigate the gap between vendor marketing messages, analyst hype and reality.
In this foggy world, some things are crystal clear. Here are three:
1. Outside of the “hyperscalers” hardly anyone will be able to afford to own and host all their compute power on premises. In the future a proportion of your compute power is going to be in public clouds, one way or another, sooner or later.
2. Storage growth is massive and unsustainable. You are going to need to find a better, cheaper way of doing it, and that way is going to need to work in harmony with your compute decisions.
3. Vendor lock-in is never a good idea. In a world where business models change, discovering you’re locked into a cloud provider might well be one of the most unpleasant discoveries of your life.
Three things you can do about it:
#1. Hybrid Cloud.
Analysts IDC have named hybrid cloud one of the biggest IT trends for 2015, forecasting that by the end of the year more than 65 percent of enterprises worldwide will commit to hybrid clouds.
Hybrid clouds provide close connectivity between physical and virtual systems inside the enterprise and those provided by the likes of Amazon, Google, and Microsoft Azure. Integrating public and private clouds allows data, services and workloads to be moved at the flick of a switch, with the administrator able to monitor and manage the whole setup via a single pane of glass. Sensitive data like corporate IP can be kept inside the company firewall, and the enterprise can access additional processing power during seasonal peaks like Christmas without the expense of massively scaling up hardware. If you’re moving towards or deploying big data analytics, or your organization experiences any kind of seasonality, you are unlikely to have the necessary power in-house in the long term, and so you’re going to go hybrid.
However, while the concept is easy to understand, cloud computing platforms often don’t interoperate well, and moving data from one proprietary cloud to another, or from a private cloud to a public cloud, can be a surprisingly difficult and expensive process. Amazon Glacier looks like the ultimate cold store, and the eye-catching promise of $0.01 per GB is absolutely correct. But once you factor in the bandwidth charges you will pay should you wish to retrieve or move that data, the attraction fades.
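To make the arithmetic concrete, here is a rough sketch of how storage and exit charges compare. Only the $0.01 per GB storage figure comes from the text above, and treating it as a monthly rate is an assumption. The retrieval and egress rates, the function name and the data volumes are hypothetical placeholders, not any provider's actual tariff, so substitute the figures from your own price list.

```python
def cold_storage_costs(size_gb, months, storage_rate=0.01,
                       retrieval_rate=0.0, egress_rate=0.0):
    """Return (storage_cost, exit_cost) in dollars.

    storage_rate   -- dollars per GB per month (assumed monthly; $0.01 from the article)
    retrieval_rate -- dollars per GB to read data back (hypothetical)
    egress_rate    -- dollars per GB of outbound bandwidth (hypothetical)
    """
    storage_cost = size_gb * storage_rate * months
    exit_cost = size_gb * (retrieval_rate + egress_rate)
    return storage_cost, exit_cost


if __name__ == "__main__":
    # 100 TB kept for a year, then moved out in full.
    # The 0.09 egress rate is an illustrative assumption only.
    storage, exit_fee = cold_storage_costs(100_000, 12, egress_rate=0.09)
    print(f"storage for 12 months: ${storage:,.2f}")
    print(f"one-off cost to move the data out: ${exit_fee:,.2f}")
```

Under those assumed rates, a single full retrieval costs about as much as nine months of storage. That is the kind of exit charge that only shows up when you decide to move.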
#2. Investigate how OpenStack can help you avoid lock-in.
If you’re going to avoid vendor lock-in, you are going to need to be able to move data from one provider to another and seamlessly integrate public and private environments. If you are going to compare prices between different providers and work with the partner providing the best mix of service and price, then you are going to need open standards, or you will be locked in without an exit plan. OpenStack meets these requirements.
Originally created by Rackspace and NASA in 2010, OpenStack is an open source cloud software platform supported by hundreds of vendors, including some of the biggest names in tech: HP, Intel, Dell, and IBM, to name a few. The wide support has resulted in open APIs designed to be as platform-agnostic as possible, scaling over a multitude of different environments. OpenStack is compatible with public cloud offerings from Amazon EC2 and S3 and (with a little effort) AWS and Google Compute Engine. This is why it’s in use at firms like eBay, PayPal, and Cisco, and at the famous CERN research center. It makes sense to manage your clouds with OpenStack.
#3. Open source storage with Ceph.
Ceph began life in 2004 as the brainchild of Sage Weil: a college dissertation in support of his PhD in Computer Science at the University of California, Santa Cruz. On completing his master’s degree Weil had started his own hosting company, but had struggled with the cost of proprietary storage. With the support of a grant from the US Department of Energy, Weil went back to school at the University of California and set out to create his own storage platform: one with no single point of failure, self-healing, replicated to make it highly fault tolerant, and scalable to the exabyte level. Over the next decade, through a number of commercial iterations and with support from the open source community, Weil succeeded.
At the heart of Ceph are CRUSH and RADOS.
CRUSH: Just as with any distributed file system, files placed into a Ceph cluster are “striped” so that consecutive segments are stored on different physical storage nodes. Placement is decided by CRUSH (Controlled Replication Under Scalable Hashing), a hash-based algorithm that calculates how and where to store and retrieve data. CRUSH allows clients to communicate directly with storage devices without a central dictionary or index server to manage object locations. It thus enables Ceph clusters to both store and retrieve data very quickly and access more data concurrently, thereby improving throughput.
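The idea of computing a location rather than looking it up can be illustrated with a much simpler technique. The sketch below uses rendezvous (highest-random-weight) hashing as a stand-in; it is not the CRUSH algorithm, which additionally walks a weighted, hierarchical cluster map and respects failure domains. The node names and the place function are hypothetical.

```python
import hashlib


def _score(object_name: str, node: str) -> int:
    """Deterministic pseudo-random score for an (object, node) pair."""
    digest = hashlib.sha256(f"{object_name}:{node}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def place(object_name: str, nodes: list[str], replicas: int = 3) -> list[str]:
    """Pick `replicas` distinct nodes for an object using only a hash.

    Any client holding the same node list computes the same answer,
    so no central index server has to be consulted.
    """
    if replicas > len(nodes):
        raise ValueError("not enough nodes for the requested replica count")
    ranked = sorted(nodes, key=lambda n: _score(object_name, n), reverse=True)
    return ranked[:replicas]


if __name__ == "__main__":
    cluster = [f"node-{i}" for i in range(6)]  # hypothetical storage nodes
    for segment in ("report.pdf/0", "report.pdf/1", "report.pdf/2"):
        print(segment, "->", place(segment, cluster))
```

Because each segment is scored independently, consecutive segments of a file tend to land on different sets of nodes. Removing a node that holds no copy of an object leaves that object's placement unchanged, which is one reason hash-based placement copes well with cluster changes.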
RADOS – file, block and object storage in a single platform: The Reliable Autonomic Distributed Object Store provides applications with object, block, and file system storage in a single unified storage cluster. This makes Ceph flexible, highly reliable and easy to manage. RADOS enables vast scalability: thousands of client hosts or KVMs accessing petabytes to exabytes of data. All applications can use the object, block or file system interfaces to the same RADOS cluster simultaneously, meaning Ceph storage systems can serve as a flexible foundation for all of your data storage needs.
Build your own: With Ceph, you can use “white boxes”, meaning whatever commodity x86 hardware you choose (or even your older end-of-life storage arrays). Because you are free to use commodity hardware and whatever you have to hand, you can avoid being locked into proprietary platforms and all the costs they entail. Choosing software-defined storage on Ceph can generate cost savings of up to 50 percent, and for today’s hard-pressed storage administrator, that’s something that has to be investigated.