Windows boxes cause air traffic control radio outage in LA?
A major breakdown in Southern California's air traffic control system last week was partly due to a "design anomaly" in the way Microsoft Windows servers were integrated into the system, according to a report in the Los Angeles Times.
The servers are timed to shut down after 49.7 days of use in order to prevent a data overload, a union official told the LA Times. To avoid this automatic shutdown, technicians are required to restart the system manually every 30 days. An improperly trained employee failed to reset the system, leading it to shut down without warning, the official said. Backup systems failed because of a software failure, according to a report in The New York Times.
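A side note on that 49.7-day figure: neither article explains where it comes from, but it matches the point at which an unsigned 32-bit counter of milliseconds wraps back to zero. That is only a guess at the cause, not something the reports confirm. A minimal sketch of the arithmetic, where the function names are made up for illustration:

```python
# Assumption (not stated in the articles): the limit comes from a
# 32-bit unsigned counter that ticks once per millisecond.
MS_PER_DAY = 1000 * 60 * 60 * 24  # milliseconds in one day


def days_until_wrap(bits: int = 32) -> float:
    """Days until an unsigned millisecond counter of this width wraps to zero."""
    return (2 ** bits) / MS_PER_DAY


def tick_count(elapsed_ms: int, bits: int = 32) -> int:
    """Value such a counter would show after elapsed_ms milliseconds of uptime."""
    return elapsed_ms % (2 ** bits)


print(round(days_until_wrap(), 1))  # 49.7
print(tick_count(2 ** 32))          # 0: the counter has wrapped
```

If that guess is right, a reboot every 30 days simply resets the counter long before it can wrap.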
Wonderful. These are Dell servers running Windows 2000 Server, and they need to be rebooted every 6 weeks or so. Manually rebooted. Some genius somewhere (presumably at Harris Corp) decided that:
- Having a Windows box in a mission-critical, lives-depend-on-it system was a good idea, and that
- Needing to reboot the boxes wasn't a big deal, and that
- They couldn't be bothered to automate the reboot cycle.
I should add that this isn't meant to be a Microsoft bash. It's a shame that MSFT don't do better software, but the reliability problems in their software are pretty well known industry-wide, and deciding to use an OS that requires a 6-week reboot cycle is boneheaded in the extreme when you're talking about life-critical systems. The blame here should be on Harris Corp and the FAA.
I hope someone gets fired over this. And I hope this system isn't running for airspace over the Midwest...


