Twitter blames two-hour failure on dual data centre crashes

Two parallel, redundant data centres failed at about the same time, the company says

Twitter said yesterday's outage, which lasted as long as two hours for some users, was caused by separate data centres failing at nearly the same time.

Twitter went down at around 4.30pm BST and was back in action by about 6.25pm, according to Mazen Rawashdeh, vice president of engineering. Though some users suspected an overload of tweets related to the Olympic Games, which open today in London, that was not the cause of the outage.

Instead, two data centres that operate in parallel for redundancy both failed, in what Rawashdeh called an "infrastructural double whammy."

"What was noteworthy about the outage was the coincidental failure of two parallel systems at nearly the same time," Rawashdeh said. "We are investing aggressively in our systems to avoid this situation in the future."

It was Twitter's second outage in about six weeks. On June 21, the microblogging service went down for an hour and started to come back, only to fail again before full recovery. The company blamed that outage on a cascading bug, a type of problem that spreads from one software element to others.
