Bugs in the Space Program

The talk:

This is the keynote address I gave at the INCOSE International Symposium on Systems Engineering in July 2005.

The slides for the talk.

Abstract:

Software has an important role to play in making modern systems more flexible, adaptable and autonomous. But we don't yet have a mature engineering discipline for software development. For the systems engineer, important questions are still unanswered: how risky is software in comparison with other parts of a system? Can software be treated as 'just another component'? Or does software demand special attention in systems engineering?

The emerging field of software forensics can shed some light on these questions. By investigating the circumstances surrounding software failures, we get a sense of the risks involved. In this talk, I will use a series of case studies from the space program to draw out some crucial lessons. The examples include the European Space Agency's original Ariane-5 launch vehicle and several of NASA's Mars probes. Each of these case studies makes a fascinating story in its own right. In each case, the failure appears to be a normal accident: a relatively simple technical problem led to a systems failure because a whole series of systems engineering mistakes allowed it to escalate. However, the failure profiles in these cases also reveal some of the key distinguishing characteristics of software, and these characteristics have important implications for systems engineering.

The Resource List:

Recommended books:

Space Shuttle

Ariane-5

Mariner 1

Mars Observer

Mars Pathfinder

Mars Climate Orbiter

Mars Polar Lander & Deep Space 2

General Resources