Experts share 5 tips to survive a cloud outage

“Everything fails, all the time,” says Amazon CTO Werner Vogels.

Amazon Web Services itself experienced a much-publicized four-day service disruption last April and another outage in August, and it had plenty of company from other cloud service companies last year. Microsoft’s Windows Azure cloud platform had downtime problems in February after the company failed to account for Leap Day. Despite improvements by cloud providers to minimize future outages, more outages will inevitably happen this year and beyond.

Here are steps experts say enterprise IT shops should take to keep cloud outages from knocking them out:

1) With AWS, use multiple availability zones

Amazon Web Services offers “availability zones” (AZs) in each of its regions and for each of its services. The company describes AZs as each running on its own physically distinct, independent infrastructure. “They are physically separate, such that even extremely uncommon disasters such as fires, tornados or flooding would only affect a single Availability Zone.” During last year’s outage, about 45 per cent of customers who used only a single AZ for the Relational Database Service were affected, compared to less than 3 per cent of customers who used a multi-AZ approach, AWS said in a post mortem report. After last year’s outage, the company made it easier for customers to use a multi-AZ approach by offering a common design and APIs for distributing instances across AZs.
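As a rough illustration of the multi-AZ idea, the Python sketch below spreads instances round-robin across zones and shows what is left running when one zone fails. The zone names, instance labels and both function names are hypothetical; this models only the placement logic and is not AWS's API.

```python
from collections import defaultdict
from itertools import cycle


def spread_across_zones(instance_ids, zones):
    """Assign instances to zones round-robin so no single zone holds them all."""
    if not zones:
        raise ValueError("at least one availability zone is required")
    placement = defaultdict(list)
    for instance_id, zone in zip(instance_ids, cycle(zones)):
        placement[zone].append(instance_id)
    return dict(placement)


def survivors(placement, failed_zone):
    """Return the instances still running if one zone goes down."""
    return [i for zone, ids in placement.items() if zone != failed_zone for i in ids]


# Hypothetical zone names and instance labels, for illustration only.
zones = ["zone-a", "zone-b", "zone-c"]
placement = spread_across_zones(["web-1", "web-2", "web-3", "web-4"], zones)
print(placement)
print(survivors(placement, "zone-a"))
```

With everything in a single zone, the same failure would leave no survivors at all, which is the gap the multi-AZ approach is meant to close.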

2) With AWS, use multiple regions

AWS has a network of eight regions: US East (Northern Virginia), US West (Oregon), US West (Northern California), EU (Ireland), Asia Pacific (Singapore), Asia Pacific (Tokyo), South America (Sao Paulo), and AWS GovCloud. For extra security and protection beyond a multi-AZ approach, users can place workloads in multiple regions. It’s not quite as easy as putting workloads in multiple AZs, though, as separate API calls are needed for the different regions.
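One way to picture a multi-region setup is as an ordered list of regions with a health check deciding which one takes traffic. The Python sketch below assumes exactly that; `pick_region`, the preference order and the set of regions marked as down are all made up for illustration, and a real health check would have to call each region separately, as noted above.

```python
def pick_region(regions, is_healthy):
    """Return the first region, in preference order, that passes the health check."""
    for region in regions:
        if is_healthy(region):
            return region
    raise RuntimeError("no healthy region available")


# Region names come from the article; the preference order and the
# outage below are hypothetical.
PREFERRED = ["US East (Northern Virginia)", "US West (Oregon)", "EU (Ireland)"]
down = {"US East (Northern Virginia)"}

active = pick_region(PREFERRED, lambda region: region not in down)
print(active)
```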

3) Use multiple cloud providers

Still don’t feel protected even with a multi-AZ, multi-region approach? Use multiple cloud providers then, advises Drue Reeves, a Gartner cloud analyst. This comes with caveats as well, since some service providers share common data centre resources. Reeves says customers can check with individual providers to see if they are sharing resources with any others that the customer may be using.
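Once providers have answered the question Reeves suggests asking, the comparison itself is simple set arithmetic. The Python sketch below assumes the customer has collected a list of data centre facilities per provider; the provider names, facility labels and the `shared_facilities` function are hypothetical.

```python
from itertools import combinations


def shared_facilities(provider_sites):
    """Map each pair of providers to the facilities they have in common, if any."""
    overlaps = {}
    for a, b in combinations(sorted(provider_sites), 2):
        common = set(provider_sites[a]) & set(provider_sites[b])
        if common:
            overlaps[(a, b)] = common
    return overlaps


# Hypothetical answers gathered from three providers.
sites = {
    "provider-x": {"dc-1", "dc-2"},
    "provider-y": {"dc-2", "dc-3"},
    "provider-z": {"dc-4"},
}
overlaps = shared_facilities(sites)
print(overlaps)
```

Any pair that shows up in the result shares at least one facility, so splitting a workload between those two providers would not protect against an outage there.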

4) Outline availability in SLAs

Beyond taking technical measures, customers can take nontechnical steps, such as negotiating with their cloud service provider over service-level agreements (SLAs) that specify penalties to be paid in the case of a disruption. If a customer is using a cloud provider for disaster recovery services, the SLA might mandate as much as 99.999 per cent availability.
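To put an availability figure in concrete terms, it helps to convert it into permitted downtime. The Python sketch below does that arithmetic, assuming availability is measured over a 365-day year; the function name is hypothetical, and actual SLAs define their own measurement periods and exclusions.

```python
def allowed_downtime_minutes(availability_percent, period_days=365):
    """Minutes of downtime permitted over the period at a given availability."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


for target in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target} per cent: about {minutes:.2f} minutes per year")
```

Under those assumptions, 99.999 per cent availability leaves room for a little over five minutes of downtime a year.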

5) If you can’t take the heat, stay away from the fire

If a user is extremely concerned about high availability of data and applications in the cloud, Steve Hendrick, an IDC analyst, says perhaps that means the customer isn’t ready for a public cloud. Hendrick says it’s a simple equation: The more mission critical the data and compute resources are, the more protections for resiliency and high availability the customer should put in place.

Network World staff writer Brandon Butler covers cloud computing and social media. He can be reached at [email protected] and found on Twitter at @BButlerNWW.
