Three months after a software upgrade caused Microsoft's Windows validation system to finger legitimate users as pirates, the company is detailing the steps it has taken to prevent a repeat of the problem.
In a blog posting, Alex Kochis, senior product manager for Microsoft's Windows Genuine Advantage software, outlined new processes that the WGA group has instituted.
"We've revamped the monitoring that is used to track what's happening within our server infrastructure so that we can identify potential problems faster, ideally before any customer gets impacted," Kochis wrote on Microsoft's MSDN Web site.
He said the WGA team has also changed how it updates the back-end servers that host the anti-piracy software.
Microsoft blamed WGA's 19-hour August meltdown on human error, saying that "preproduction code" had been installed on the live servers, which then began rejecting legitimate requests from Windows XP and Windows Vista users. A software rollback fixed the problem on the activation servers but not on the servers that validate downloads and other post-activation transactions.
Since then, the company has conducted "more than a dozen 'fire drills' designed to improve our ability to respond to issues affecting customers," Kochis wrote. The drills, he added, have included both pre-announced simulations and surprise alerts.
"The team is now better prepared overall to take the right action and take it quickly," Kochis promised.
Michael Cherry, an analyst at Directions on Microsoft, applauded Microsoft's willingness to acknowledge that its processes had failed during the August outage.
But Cherry lamented the apparent lack of any modifications to the WGA technology itself. He said that if legitimate users can't validate their copies of Windows because of a glitch in WGA, the software shouldn't label them as pirates.
"They should make it so that any impact is on Microsoft," Cherry said, "and not on the customer."