More Microsoft BPOS and Amazon cloud outage woes

By Richi Jennings (@richi).


The Dublin datacentre used by Amazon (NASDAQ:AMZN) and Microsoft (NASDAQ:MSFT) for cloud services got hit by lightning last night. The incident sparked hours and hours of downtime for BPOS and EC2/EBS/S3. For some customers, it's still not fixed, which isn't building much confidence in cloud-based solutions.

  • On the one hand, what a rotten record of downtime these two cloud services are having.
  • On The Other Hand, if it was your datacentre, would you do any better?

Plus, today's skateboarding duck: a pilot's-eye view of a three-day trip from Boston to Paris...

Mikael Ricknäs rëpörts:
Lightning struck a transformer, sparking an explosion [at 6.41pm BST, Sunday]. ... Under normal circumstances, backup generators would seamlessly kick in, but the explosion also...knock[ed] out some of those.
By [9:56pm BST], power to the majority of network devices had been restored, allowing Amazon to focus on bringing EC2...instances and EBS (Elastic Block Storage) volumes back online. ... To speed up the recovery process, Amazon started adding more EBS capacity.
European customers of Microsoft's Business Productivity Online...Suite were also affected by the power outage. But services were restored...[after seven hours] a spokesman said.

Paul Kunert thinks downtime is rare for Amazon, but common for Microsoft:
Instances of downtime in the world of BPOS have been more commonplace, with a summer of interruptions for users...starting back in May and continuing in June. ... Redmond has come under criticism for the number of service interruptions...and this latest incident will cause a few more blushes.

Nick Farrell thinks it's time for a colourful metaphor:
One would think that one of the advantages of the Cloud is that...electrical load would be seamlessly picked up by...generators. But it turned out that there were more seams than a Greek Wedding dress.
[S]omething else broke and automatic systems had to be brought online using humans and lots of hamsters running around in millions of small wheels.

Gareth Halfacree thinks strategically:
Dublin...plays host to major data centres for a range for the European market thanks to a cool climate and friendly tax breaks. Amazon has been the most forthcoming about the outage...[w]hile Microsoft has been tight-lipped on the exact nature.
While the idea of a flexible instance-based cloud computing model should, in theory, increase availability...the act of passing off responsibility to a cloud computing provider does leave businesses with less overall control.

Phil Wainewright explains why there was no fallback in many cases:
The outage struck servers in one of three availability zones in the EU-WEST-1 region, but recovery efforts have had knock-on [effects on] the other two zones. ... EU-WEST-1 is Amazon’s only data center in Europe, [so] customers who have to keep their data within the European region for data protection compliance have no available failover.
In what seems to be a typical pattern...customers have been complaining of insufficient information coming out to help them recover.

Today's Skateboarding Duck...

Richi Jennings is an independent analyst/consultant, specializing in blogging, email, and security. His writing has previously won American Society of Business Publication Editors and Jesse H. Neal awards. He has been a cross-functional IT geek since 1985. You can also read Richi's full profile and disclosure of his industry affiliations.