What Can We Learn From Amazon's Cloud Failure?

The unexpected Amazon server outage that hit late on April 20 has led business and individual users alike to ponder the implications of future outages. If Amazon, with all of its expertise and resources, can go down, then anyone can. The outage caused server failures for customers relying on Amazon’s infrastructure to do business.

Aislyn Greene’s Puget Sound Business Journal article on the outage revealed that the Amazon incident affected businesses ranging from the NYTimes to social media site Foursquare to cloud-based company Heroku. The Amazon crash shows that an outage at even the most dependable cloud provider can disrupt the businesses that rely on it. And while the server failures were, for the most part, promptly resolved, the fact remains that businesses were affected.

So, what can companies learn from this experience?

“It’s not a catastrophe unless something valuable (like user data) was lost,” said Cheezburger Network CEO Ben Huh as quoted in the article. “It’s an opportunity to learn about the service provider’s weakness and how to design more stable, reliable systems. Services recover very quickly from outages as long as they are relatively short. Long-term outages are another beast.”

Another takeaway from the Amazon event: Companies must understand the services their cloud provider offers, and they should store their content in, and keep it accessible from, more than one location.

“As the hype around the cloud has become so loud, people forget to look under the covers,” said Margaret Dawson, VP of product management at Hubspan, a cloud-based service provider. “They’re just thinking ‘Oh I'll just throw my storage up there, I’m just going to run this application,’ and they really need to do due diligence around the company running the application, (and ask) do they also run the infrastructure or run the data center?”

The bottom line? Businesses and individuals should carefully assess how the cloud can be useful to them while building redundancy into their applications.
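For readers who want a concrete picture of what redundancy across locations can look like, here is a minimal sketch in Python. It is an illustration only, not a description of how Amazon or any of the companies mentioned handle failover. The location names ("us-east", "us-west"), the fetch_with_failover function, and the fake_fetch stand-in are all hypothetical. The idea is simply to try each copy of the content in turn and use the first one that responds.

```python
from typing import Callable, Iterable, List


class AllLocationsFailed(Exception):
    """Raised when none of the configured locations could serve the request."""


def fetch_with_failover(
    locations: Iterable[str],
    fetch: Callable[[str], bytes],
) -> bytes:
    """Try each location in order and return the first successful result.

    `fetch` is whatever function actually retrieves the content from one
    location (an HTTP request, a storage client call, and so on).
    """
    errors: List[str] = []
    for location in locations:
        try:
            return fetch(location)
        except Exception as exc:
            # Any failure at this location means we move on to the next one.
            errors.append(f"{location}: {exc}")
    raise AllLocationsFailed("; ".join(errors))


def fake_fetch(location: str) -> bytes:
    """Hypothetical stand-in for a real request; simulates one region being down."""
    if location == "us-east":
        raise ConnectionError("region unavailable")
    return f"content from {location}".encode()


if __name__ == "__main__":
    # The first location fails, so the content comes from the second one.
    print(fetch_with_failover(["us-east", "us-west"], fake_fetch))
```

A real application would also need to keep the copies in sync and decide how long to wait before giving up on a location, which this sketch leaves out.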

For more information about Amazon’s recent server failures and the need for an effective cloud provider, read the full article: http://www.techflash.com/seattle/2011/04/Amazon-server-failure-highlights-problem.html

And for added coverage on the Amazon incident, read “Will Amazon’s recent server failures slow the rise of cloud computing?” at http://www.slate.com/id/2292228/