When Disaster Strikes in the Cloud!
If you were near a computer yesterday between 9:30am and 3:30pm PST, you probably experienced some interruption in a web service. As the dust settles from the February 28th AWS outage, you’d better believe the whiteboard ink is flying. Many developers are rethinking their web architecture and asking whether they were prepared for failure. We know there has been only a week to implement the DRaaS solutions we explored in our webinar last week, so we won’t say “We told you so!” It is a great topic to explore, though, so check out the webinar from SUSE and our innovative partners at Ocean9.
In the webinar we explored a number of scenarios to keep you covered, along with new cloud-based solutions for business continuity. Yesterday’s outage, and the number of internet services it affected, shows just how much is built on AWS. It also suggests that some companies may need to rethink the level of fault tolerance their architecture provides for their web applications. We’ll add two more thoughts specific to yesterday’s situation, and of course we’re open to your questions and comments on how to prepare for failure.
Availability Zones on AWS
A key element to achieving greater fault tolerance is to distribute your application geographically. If a single Amazon Web Services datacenter fails for any reason, you can protect your application by running it simultaneously in a geographically distant datacenter. When you use AWS, you can specify the Region in which your data will be stored, instances run, queues started, and databases instantiated. Within a Region, Availability Zones are separate locations that are isolated from each other’s failures. Yesterday’s outage affected Amazon S3 across the whole US-EAST-1 Region, so a Multi-AZ architecture on its own would not have been enough. Many customers could have avoided the situation by also keeping a copy of their data and application in a second Region.
Object Storage Solutions
The Amazon S3 service has a 99.9% SLA, so downtime should be planned for. One cost-effective solution for your AWS web apps is to have access to an on-site, S3-like storage solution such as SUSE Enterprise Storage 3.0, which offers an Object Storage gateway with S3-compatible APIs.
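To see what a 99.9% figure means in practice, the arithmetic can be sketched in a few lines of Python. This treats 99.9% as a plain availability percentage over a period, which is a simplification of how an SLA is actually measured. The function names and the hourly cost figure are assumptions made up for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(
    availability_percent: float, period_hours: float = HOURS_PER_YEAR
) -> float:
    """Hours of downtime that still fit within the given availability level."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    return (100.0 - availability_percent) / 100.0 * period_hours


def downtime_cost(
    availability_percent: float,
    cost_per_hour: float,
    period_hours: float = HOURS_PER_YEAR,
) -> float:
    """Rough cost of the allowed downtime, given an assumed cost per hour."""
    return allowed_downtime_hours(availability_percent, period_hours) * cost_per_hour


if __name__ == "__main__":
    hours = allowed_downtime_hours(99.9)
    # 1000.0 per hour is a made-up figure; substitute your own estimate.
    cost = downtime_cost(99.9, cost_per_hour=1000.0)
    print(f"99.9% leaves room for about {hours:.2f} hours of downtime a year")
    print(f"at 1000 per hour, that is about {cost:.0f} a year")
```

At 99.9%, that works out to roughly 8.76 hours a year, a useful baseline when weighing the cost of a second storage location against the cost of being offline.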
Another great element of building on AWS? So many experts share what they have learned and open-source tools to help you better adopt the leading cloud platform. Netflix, a very early adopter that went all-in on AWS, teaches you how to be good at failing while maintaining availability. They have open-sourced tools like “Chaos Monkey”, which goes in and randomly terminates instances in your AWS application to see how well you’re prepared for a disaster. Read the quickstart guide.
In closing, here are three things you should do if you were affected by the AWS outage:
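The idea behind this kind of failure testing can be sketched in a few lines of Python. This is not Netflix’s tool or its API, just a toy illustration of the principle: pick one instance at random, terminate it, and watch whether the application stays up. The `unleash_monkey` function, the fleet names, and the `terminate` callback are all hypothetical.

```python
import random
from typing import Callable, List, Optional, Sequence


def unleash_monkey(
    instances: Sequence[str],
    terminate: Callable[[str], None],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Terminate one randomly chosen instance and return its name.

    Returns None without calling `terminate` if there are no instances.
    """
    if not instances:
        return None
    if rng is None:
        rng = random.Random()
    victim = rng.choice(instances)
    terminate(victim)
    return victim


if __name__ == "__main__":
    fleet = ["web-1", "web-2", "worker-1"]  # hypothetical instance names
    terminated: List[str] = []
    # A real callback would call your cloud provider; here we only record it.
    victim = unleash_monkey(fleet, terminated.append)
    print(f"terminated {victim}; is the application still serving traffic?")
```

The value comes from running this kind of test regularly and on purpose, so that the first time an instance disappears is not during a real outage.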
- Examine the cost of downtime and what can be afforded for your business.
- Consider investing in a Multi-AZ or multi-Region architecture, a backup cloud, or an on-site location.
- Watch the webinar here and talk to Ocean9 and/or your local SUSE sales rep!