Analysts slam Microsoft reliability after WGA meltdown

The 19-hour blackout of the Microsoft servers that identify copies of Windows XP and Vista as legitimate or counterfeit shows that serious flaws exist in the process and raises questions about the reliability of Microsoft's services, analysts said Monday.

The incident began on Friday evening (24 August) and lasted until 3pm Saturday. A still-unspecified "server-side issue" with the system that validates Windows XP and Vista erroneously fingered users as pirates, preventing them from downloading most software from the Microsoft website.

In the case of Vista, it disabled several features, including the operating system's Aero graphical user interface. Windows users lit up the company's support forums with more than 450 messages, some of which were collected in threads that have been viewed by as many as 45,000 people.

As of midday Monday, Microsoft had not explained the problem with the Windows Genuine Advantage (WGA) servers, although on Saturday programme manager Phil Liu promised that after the team had generated a fix, "[I will] get you all what you are looking for, an explanation and cause."

Michael Cherry, an analyst at independent analyst firm Directions on Microsoft, took the company to task over the snafu. "Despite the fact that Microsoft has rolled out WGA slowly and methodically to ensure they have the capacity, availability and reliability to handle customer validation requests, it appears that any plans they had to handle a service problem are not adequate.

"Why don't they have a workable fail-over strategy for this service? What does this say about the resiliency of Microsoft's services? After all, there will be failures," he added.

Gartner analyst Michael Silver also dinged Microsoft on the reliability issue. "A system that's not totally reliable really should not be so punitive," he said. "This issue is not really how long it takes for Microsoft to fix the problem, but also when the user can get back on the network to revalidate. What happens when someone's about to get on a plane and won't be able to revalidate for three days?"

On Saturday, users raged that the outage prevented them from doing work – at least one said he was a developer and couldn't access the update to DirectX because his machine had been falsely flagged – or playing games. Others asked why they had effectively been tagged as pirates.

"It's really hard to say if the system is fragile," said Cherry, in response to a question. "Let's say that the system runs without problems for six months – how many successful validations occur? But if you are the one person who fails for no fault of your own during that six-month period, then the system is too fragile."
