Cloud Bashing: Some Perspective

So a massive, deadly storm rips through Northern Virginia, Washington, D.C., and Maryland, knocking out power for millions of people, but what is everyone upset about?  They couldn’t watch some Netflix movies or post pictures of their dinner to Instagram on Friday night. At least that seems to be the common theme in the tweets and blog posts that I’ve read over the weekend.

Why is everyone so upset?  Most of you have probably never heard of AWS (Amazon Web Services).  Services like Netflix, Pinterest, and Instagram are all powered by AWS servers and were down for several hours Friday night into Saturday morning because of issues in Amazon’s Northern Virginia data center, which was affected by the storm.

Amazon provides multiple data centers throughout the world for AWS customers.  They advise their customers to deploy their services in multiple data centers specifically so that they stay available when one data center is offline – and many do this, including the aforementioned services – but the services still had issues on Friday night because of a failure in Amazon’s Elastic Load Balancing (ELB) service.  The ELB service is the one that makes spreading a web site over multiple data centers possible – and this time, it failed.

It is important to note, as Wired did, that there are regular outages for Amazon data centers and these services – and many others – continue to operate without any issue.

This was a fluke.  Could Amazon have done more to ensure that it didn’t happen?  Maybe.  Was it a major inconvenience for people who wanted to watch movies and post photos?  Sure.  Do accidents happen no matter how careful we are?  Absolutely.  Is it the end of the world, cloud computing, or Amazon?  Heck no.

Put it into perspective, people.  There are still millions of people without power this morning (Monday) – the storm ripped through the area on Friday night.  Many will probably be without power all week.

Sorry you couldn’t watch your movies for an evening.

</rant>

2 Comments


  1. I agree completely. It’s not uncommon to hear other IT folks quick to jump on the “See! The cloud’s unreliable!” bandwagon… as if typical organizations have uptime even close to that of most cloud services.


  2. Exactly! But most people – for their own sanity – have little idea what goes into 99.9% uptime. That’s only 8.76 hours PER YEAR that a service can be offline. If Netflix or Instagram could pull that off by running their own data centers for the price that they’re paying Amazon, they should bottle up that secret and sell it.

    As I alluded to, Amazon data centers have failovers on a regular basis and they strongly urge customers to make their services available in multiple zones. It’s the nature of the beast.

    Whatever the root cause of the ELB failure was, I hope that Amazon learns from it and makes their service better. In reality, situations like this are the only way that you can truly learn the weaknesses of the technology and make it better.

    If this was the one hard failure for this year – similar to the failure in April 2011 – I say KUDOS to Amazon.
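
For anyone who wants to check the uptime arithmetic in the comment above, here is a minimal sketch. It assumes a 365-day year (8,760 hours), and the function name is just an illustrative choice, not anything from Amazon's SLA documents.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (an assumption)


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Hours per year a service may be offline at the given uptime percentage."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


# Print the yearly downtime budget for three common availability targets.
for pct in (99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% uptime -> {hours:.2f} hours ({hours * 60:.1f} minutes) per year")
```

Under that assumption, 99.9% uptime allows 8.76 hours of downtime per year, while 99.999% ("five nines") allows only about 5.26 minutes.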

