Geo Clustering: What, How, and Why?
The following article has been contributed by Greg Eckert who is responsible for Business Development at SUSE’s valued partner LINBIT.
Disaster Recovery (DR) has been an industry buzzword for decades. But DR can have several meanings: mobile offices that act as work spaces after disasters, off-site backups, recovering data from destroyed disks, live replication between multiple sites, and so on. Altogether, the term never fully developed into a single well-defined concept except at a very high level.
So in terms of data protection, how do IT folks delineate between live replication between two different sites and periodic backups between them? SUSE has recently made this easy by defining a new industry term: “Geo Clustering”.
Geo Clustering does exactly what it says: it takes a High Availability cluster built on standard commodity hardware and replicates the data live across long distances. If the primary data-center fails, services can automatically fail over to a working data-center.
Geo Clustering technologies are host based, which means they can replicate any data that can be written to a local hard-drive, which, in SUSE’s case, includes any application that runs on Linux. This functionality also extends to VMs, physical hosts, hypervisors, and anything else that runs on standard commodity hardware. More information about the combination of software that is used to accomplish this can be found in the Technical Details section below.
SUSE’s new Geo Clustering capability for the SUSE Linux Enterprise High Availability Extension can automatically fail over services from a non-working data-center to a working one over any distance. SUSE leverages a software stack that is in production in a variety of situations: over campuses with dedicated fiber, metro-area networks with 1Gb/sec connections, across countries with WAN connectivity, and even on some ships which replicate their data over satellite to shore. Unlike proprietary appliances, which have strict requirements, SUSE’s Geo Clustering software works in virtually any production scenario.
So how have users with critical data handled data-center fail-over in the past? On the very expensive end of the cost spectrum, there are traditional proprietary SANs using array-based replication. I wouldn’t recommend dabbling in these technologies if you’re trying to avoid vendor lock-in. In addition, the cost doesn’t stop at the hardware and licensing. Users must also pay per gigabyte transferred over long distances for these proprietary hardware solutions. For companies with deep pockets and a preference for large vendors, these are the technologies you come across for rapidly recovering from a full-site outage.
On the other end of the cost spectrum are things like LVM snapshots, which are free, but only update the DR system periodically. For users with more relaxed Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO), daily or even weekly snapshots are an alternative to live replication between SANs. The SUSE-LINBIT solution aims to bridge the gap here: tried-and-tested mission-critical capability based on open software and commodity hardware. It’s the Swiss army knife of DR: a lightweight tool, used worldwide, which can be depended on for multiple uses, and is completely certified as tested, tried, and true.
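To make the RPO trade-off concrete, here is a minimal Python sketch of the worst-case data-loss window for periodic snapshots. The function names and the simple "interval plus transfer time" model are assumptions made for illustration; they are not part of any SUSE or LINBIT tool.

```python
from datetime import timedelta


def worst_case_data_loss(snapshot_interval: timedelta,
                         transfer_time: timedelta = timedelta(0)) -> timedelta:
    """Worst-case window of lost data for periodic snapshot shipping.

    Assumed model: the primary site fails just before the newest snapshot
    finishes arriving at the DR site, so the DR site only holds the
    previous snapshot.
    """
    return snapshot_interval + transfer_time


def meets_rpo(snapshot_interval: timedelta,
              rpo_target: timedelta,
              transfer_time: timedelta = timedelta(0)) -> bool:
    """Return True if the worst-case loss window fits inside the RPO target."""
    return worst_case_data_loss(snapshot_interval, transfer_time) <= rpo_target


if __name__ == "__main__":
    daily = timedelta(days=1)
    # A daily snapshot cannot satisfy a one-hour RPO ...
    print(meets_rpo(daily, rpo_target=timedelta(hours=1)))
    # ... but it does satisfy a relaxed two-day RPO.
    print(meets_rpo(daily, rpo_target=timedelta(days=2)))
```

Under this model, the only way to approach an RPO of seconds is to shrink the interval toward zero, which is effectively what live replication does.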
This new Geo Clustering replication functionality on Linux has the potential to influence the DR and long-distance replication market the same way that SDS software has begun to creep in on the Enterprise Storage market.
IDC has reported that the global Enterprise storage market is down 3% between 2015 and 2016. Server sales are up over the same period. The reason? Commodity hardware solutions are hitting mainstream businesses. Over the past decade, hosting providers and major ISPs used these ‘software on commodity hardware’ combinations in order to create their competitive cost advantage over other players. Now, Fortune-500 companies, governments, and even financial institutions are realizing that in order to stay cost competitive, they need to use software-defined technologies.
Linux is beating proprietary vendors in adoption and feature-functionality. It is only a matter of time before the mass-market segment catches on to the fact that servers and software can now do the same thing as their expensive enterprise storage appliances, for a fraction of the cost. It is happening in the SDS space, and now SUSE is bringing new competition to the traditional DR space.
Try it out!
LINBIT and SUSE have created a joint SUSE Best Practices paper, a technical guide, which describes how to install and test the new geo-clustering feature. If you have some fresh servers running SUSE Linux Enterprise Server and a lab, just view the guide.
If you have questions, reach out to your local SUSE representative, or, if you happen to be running this on a Linux OS that isn’t SUSE Linux Enterprise Server (shame on you!), you can contact LINBIT directly for a Geo-Clustering trial on your distro of choice, or just read our technical guide for geo clustering.
Technical Details
SUSE and LINBIT have collaborated on Open Source DRBD software for local HA clusters for years. In combination with Corosync and Pacemaker, DRBD lets local high availability clusters fail over automatically. SUSE wrapped this up into their SUSE Linux Enterprise High Availability Extension and developed a GUI for users to interface with the software.
In addition to developing the well-known DRBD software, LINBIT has built a Disaster Recovery add-on called DRBD Proxy, designed to replicate data over WAN environments. Because Pacemaker wasn’t designed to fail over services in WAN environments, DR fail-overs used to be a manual process.
SUSE’s latest project, ‘Booth’, allows Pacemaker to be used over WAN environments, which enables this much-desired automatic fail-over functionality.
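Booth coordinates sites by granting a "ticket" that only one site may hold at a time, typically with a third-party arbitrator acting as a tie-breaker. The Python sketch below is a deliberately simplified model of that idea, not Booth's actual protocol or API: the `GeoCluster` class, its method names, and the site names are all hypothetical, and real deployments also deal with ticket expiry, timeouts, and fencing.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class GeoCluster:
    """Toy model of ticket-based fail-over between sites (hypothetical)."""
    members: Set[str]                    # all voters: sites plus arbitrator
    sites: Set[str]                      # members that can actually run services
    ticket_holder: Optional[str] = None  # site currently allowed to run services

    def request_ticket(self, candidate: str, reachable: Set[str]) -> bool:
        """Grant the ticket to `candidate` if it can see a majority of members.

        `reachable` is the set of other members the candidate can contact.
        """
        if candidate not in self.sites:
            raise ValueError(f"{candidate} cannot run services")
        # If the current holder is still reachable, do not steal its ticket.
        if self.ticket_holder not in (None, candidate) and self.ticket_holder in reachable:
            return False
        # The candidate counts itself plus every known member it can reach.
        voters = (reachable & self.members) | {candidate}
        if len(voters) * 2 > len(self.members):
            self.ticket_holder = candidate
            return True
        return False


if __name__ == "__main__":
    cluster = GeoCluster(
        members={"site-a", "site-b", "arbitrator"},
        sites={"site-a", "site-b"},
        ticket_holder="site-a",
    )
    # site-b is cut off from everyone: no majority, so no fail-over.
    print(cluster.request_ticket("site-b", reachable=set()))
    # site-a is down, but site-b still reaches the arbitrator: 2 of 3 votes.
    print(cluster.request_ticket("site-b", reachable={"arbitrator"}))
    print(cluster.ticket_holder)
```

The majority requirement is what keeps an isolated site from starting services on its own, which is the split-brain scenario that made unattended WAN fail-over risky in the first place.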
Now, with the combination of DRBD, DRBD Proxy, Pacemaker, and Booth, organizations can replace their proprietary WAN replication appliances with standard commodity hardware, and a completely software-based stack.