Safety-Critical Systems
I came across this somewhere. The author has worked for around five years on Air Traffic Control projects, both delivering radar processing and display systems and doing R&D for next-generation systems.
Here is his overview of how a safety-critical system (one where, if it fails, people could die) approaches failure:
1) Everything runs on ruggedised releases of Unix.
2) Every box must be able to FAIL ON ITS OWN, without taking any other box down with it.
3) Every box must have a direct replacement, or replacements, that can carry the SAME LOAD.
4) ZERO total system downtime is allowed. Partial system failures are acceptable, but core systems must keep running.
5) Five stages of power supply fallback: double mains, double generation and, if all else fails, a great big warehouse of car batteries.
6) Four years of testing of the FULL system before it goes live.
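
Points 2 to 4 can be read as a checkable property: whichever single box fails, some remaining box must still be able to carry the full load. The following is a minimal Python sketch of that check, not anything taken from the author's systems. The `Box` and `Service` classes, the load numbers, and the box names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    name: str
    capacity: int         # load this box can carry (hypothetical unit)
    healthy: bool = True


@dataclass
class Service:
    name: str
    load: int             # load that must be carried at all times
    boxes: list = field(default_factory=list)  # primary first, then replacements

    def active_box(self):
        """Return the first healthy box that can carry the full load, or None."""
        for box in self.boxes:
            if box.healthy and box.capacity >= self.load:
                return box
        return None


def survives_any_single_failure(service):
    """True if the service keeps running whichever single box fails."""
    if service.active_box() is None:
        return False  # not even running before any failure
    for victim in service.boxes:
        victim.healthy = False          # simulate this box failing on its own
        try:
            if service.active_box() is None:
                return False            # nothing left that carries the same load
        finally:
            victim.healthy = True       # restore before trying the next box
    return True


# Hypothetical example: the replacement matches the primary's capacity.
radar = Service("radar_processing", load=100,
                boxes=[Box("rp1", 100), Box("rp2", 100)])

# Hypothetical counter-example: the replacement is too small for the load.
display = Service("display", load=100,
                  boxes=[Box("d1", 100), Box("d2", 60)])

print(survives_any_single_failure(radar))
print(survives_any_single_failure(display))
```

The second service fails the check because its replacement cannot carry the same load as the primary, which is exactly the situation point 3 rules out. A real system would also have to consider multiple simultaneous failures and the power stages in point 5, which this sketch leaves out.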



