----------------------------------------------------------------------
Date: Thu, 9 Jan 86 12:14:34 PST
From: ihnp4!utzoo!henry@ucbvax.berkeley.edu
To: risks@sri-csl.arpa
Subject: Multiple redundancy

Advocates of multiple redundancy through independently-written software
doing the same job might be interested in an incident involving complete
failure of such a scheme.

During the development of the De Havilland Victor jet bomber, roughly a
contemporary of the B-52, the designers were concerned about possible
problems with the unusual tailplane design. They were particularly
worried about "flutter" -- a positive feedback loop between
slightly-flexible structures and the airflow around them, dangerous when
the frequency of the resulting oscillation matches a resonant frequency
of the structure. So they tested for tailplane flutter very carefully:

1. A specially-built wind-tunnel model was used to investigate the
   flutter behavior. (Because one cannot scale down the fluid
   properties of the atmosphere, a simple scale model of the aircraft
   isn't good enough to check on subtle problems -- the model must be
   carefully built to answer a specific question.)

2. Resonance tests were run on the first prototype before it flew,
   with the results cranked into aerodynamic equations.

3. Early flight tests included some tests whose results could be
   extrapolated to reveal flutter behavior. (Flutter is sensitive to
   speed, so low-speed tests could be run safely.)

All three methods produced similar answers, agreeing that there was no
flutter problem in the tailplane at any speed the aircraft could reach.

Somewhat later, when the first prototype was doing high-speed
low-altitude passes over an airbase for instrument calibration, the
tailplane broke off. The aircraft crashed instantly, killing the entire
crew. A long investigation finally discovered what happened:

1. The stiffness of a crucial part in the wind-tunnel flutter model
   was wrong.

2. One term in the aerodynamic equations had been put in wrongly.

3. The flight-test results involved some tricky problems of data
   interpretation, and the engineers had been misled.

And by sheer bad luck, all three wrong answers were roughly the same
number.

Reference: Bill Gunston, "Bombers of the West", Ian Allen 1977(?).

Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
----------------------------------------------------------------------
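Editorial sketch, not part of the message above: a tolerance-based voter
of the kind sometimes used with independently-written software versions.
The function name `vote`, the tolerance, and every number below are
hypothetical. The point is only that agreement is all such a voter can
check, so wrong answers that happen to coincide pass unchallenged.

```python
def vote(estimates, tolerance):
    """Return the mean of the estimates if they all agree within tolerance.

    Raises ValueError on disagreement.  Agreement says nothing about
    whether the agreed value is correct.
    """
    if max(estimates) - min(estimates) > tolerance:
        raise ValueError("estimates disagree: %r" % (estimates,))
    return sum(estimates) / len(estimates)


# Hypothetical numbers: three methods each predict a flutter speed.
# Each is wrong for a different reason, yet they agree with one another.
true_flutter_speed = 420.0
predicted = [655.0, 648.0, 660.0]

accepted = vote(predicted, tolerance=20.0)
print("accepted flutter speed:", accepted)
print("error versus truth:", accepted - true_flutter_speed)
```

The voter would catch a single stray estimate, but it cannot detect
errors that land close together, whatever their separate causes.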
----------------------------------------------------------------------
Date: Mon, 13 Jan 86 19:49:18 PST
From: ihnp4!utzoo!henry@ucbvax.berkeley.edu
To: risks@sri-csl.arpa
Subject: Re: Multiple redundancy

A correction and an addendum to my earlier contribution about multiple
redundancy...

Correction: It was not the "De Havilland Victor" but the "Handley Page
Victor". Blush. That's like calling Boeing "McDonnell Douglas".

Addendum: The full reference is Bill Gunston, "Bombers of the West",
Ian Allan, London 1973, page 92.

Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
----------------------------------------------------------------------
----------------------------------------------------------------------
Date: Thu, 27 Mar 86 08:32:18 est
From: hammond%lafite@mouton.ARPA (Rich A. Hammond at lafite.UUCP)
To: risks@sri-csl.arpa
Subject: Inter-system crashes

I worked in a hotel once when they were adding a new wing. The main
water and electricity systems had to be turned off to connect the new
wing. Management decided to do both at the same time so there would
only be one interruption in service.

The problem: Turning off the electric power caused the emergency
generator to come on, but the generator was cooled by water which came
from the main and ran into the drain, i.e., no recirculation. Of course
there was no water, and the generator engine managed to warp its head
pretty badly before we shut it off.
----------------------------------------------------------------------
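Editorial sketch, not part of the message above: a pre-shutdown check
that asks whether any backup system depends on something else being
switched off at the same time. The tables and the function name
`unsafe_backups` are hypothetical, chosen to mirror the hotel story.

```python
# Hypothetical dependency table: each system maps to what it needs to run.
DEPENDS_ON = {
    "emergency_generator": {"main_water"},  # once-through cooling
    "lighting": {"main_power"},
}

# Hypothetical fallback table: what takes over when a system is shut off.
BACKUP_FOR = {"main_power": "emergency_generator"}


def unsafe_backups(shutdown):
    """Return backups that would start but lack something they depend on."""
    problems = []
    for system in sorted(shutdown):
        backup = BACKUP_FOR.get(system)
        if backup is None:
            continue
        missing = DEPENDS_ON.get(backup, set()) & set(shutdown)
        if missing:
            problems.append((backup, sorted(missing)))
    return problems


print(unsafe_backups({"main_power", "main_water"}))
print(unsafe_backups({"main_power"}))
```

The first call reports the generator as unsafe because its cooling water
is in the same shutdown set; the second reports nothing.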
----------------------------------------------------------------------
Sender: "J._Paul_Holbrook.OsbuSouth"@Xerox.COM
Date: 29 Apr 86 14:32:33 PDT (Tuesday)
Subject: The dangers of assuming too much
From: Holbrook.OsbuSouth@Xerox.COM
To: Risks@SRI-CSL.Arpa, Methodology^.PA@Xerox.COM

[From "Three Mile Island: Thirty Minutes to Meltdown" by Daniel Ford;
Viking Press 1982.]

(The discussion preceding this quote talks about how the temperature of
the fuel rods at Three Mile Island-2 increased from the normal 600
degrees to over 4000 degrees during the 1979 accident, partially
destroying the fuel rods. It also notes that instruments to measure
core temperatures were not standard equipment in reactors.)

"Purely by chance, there were some thermocouples --
temperature-measuring devices -- present in the TMI-2 reactor when the
accident occurred. Located about 12 inches above the top of the core,
these thermocouples ... were installed as part of an experimental study
of core performance, and were a temporary instrumentation feature of
the plant, connected to the control-room computer for measuring
temperatures during normal operation. Accordingly, if a control-room
operator requested temperature data from the computer, he would receive
useful information only when the temperature was within the normal 600
degree range. When the temperature got above 700 degrees, the computer,
instead of reporting it, would simply print out a string of question
marks -- "???????." Although the thermocouples could actually measure
much higher temperatures, the computer was not programmed to pass these
higher temperature readings on to the operators ... there was an urgent
need for timely, reliable data about the temperature in the core in the
critical period between 6am and 7am on March 28; what was available
from the computer was mostly question marks."

Paul
----------------------------------------------------------------------
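Editorial sketch, not part of the message above: the reporting behavior
the quote describes, next to one possible alternative. The 700-degree
limit and the question-mark string come from the quote; the function
names and the flag text are hypothetical, and nothing here is a
reconstruction of the actual TMI-2 software.

```python
DISPLAY_LIMIT = 700  # degrees; above this the quote says "???????" was printed


def report_masked(reading):
    """Behavior described in the quote: hide anything above the limit."""
    return "???????" if reading > DISPLAY_LIMIT else str(reading)


def report_flagged(reading):
    """Alternative sketch: always show the value, and mark it when abnormal."""
    text = str(reading)
    if reading > DISPLAY_LIMIT:
        text += " (ABOVE NORMAL RANGE)"
    return text


for reading in (600, 2500):
    print(report_masked(reading), "|", report_flagged(reading))
```

The second form passes the sensor value through and leaves the judgment
about its plausibility to the operator.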
----------------------------------------------------------------------
To: ittatc!CSL.SRI.COM!RISKS
Date: Fri, 1 Aug 86 0:48:48 EDT
From: sdcsvax!dcdwest!ittatc!bunker!wtm@ucbvax.Berkeley.EDU (Bill McGarry)
Subject: Ozone hole undetected for years due to programming error

(I read the following in a magazine but when I went to write this
article, I could not remember which magazine and some of the exact
details. My apologies for any inaccuracies.)

Recently, it was disclosed that a large hole in the ozone layer appears
once a year over the South Pole. The researchers had first detected
this hole approximately 8 years ago by tests done at the South Pole
itself.

Why did they wait 8 years to disclose this disturbing fact? Because the
satellite that normally gives ozone levels had not reported any such
hole and the researchers could not believe that the satellite's figures
could be incorrect. It took 8 years of testing before they felt
confident enough to dispute the satellite's figures.

And why did the satellite fail to report this hole? Because it had been
programmed to reject values that fell outside the "normal" range!

I do not know which is more disturbing -- that the researchers had so
much faith in the satellite that it took 8 years of testing before they
would dispute it, or that the satellite would observe this huge drop in
the ozone level year after year and just throw the results away.

Bill McGarry
Bunker Ramo, Trumbull, CT
{decvax, philabs, ittatc, fortune}!bunker!wtm

[A truly remarkable saga. I read it too, and was going to report
on it -- but could not find the source. HELP, PLEASE! PGN]
----------------------------------------------------------------------
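Editorial sketch, not part of the message above: a range check that
silently discards out-of-range readings, as the message describes it,
next to one that keeps every reading and flags it. The function names,
readings, and limits are hypothetical and in arbitrary units; this is
not the satellite's actual processing.

```python
def reject_outliers(values, low, high):
    """Silently drop readings outside [low, high]."""
    return [v for v in values if low <= v <= high]


def flag_outliers(values, low, high):
    """Keep every reading; pair each with True if it is within range."""
    return [(v, low <= v <= high) for v in values]


# Hypothetical readings; the two low values stand in for the seasonal drop.
readings = [310, 295, 150, 140, 305]
print(reject_outliers(readings, 180, 500))
print(flag_outliers(readings, 180, 500))
```

With rejection, the downstream user sees three unremarkable values and
no hint that anything was removed; with flagging, the anomaly is still
in the data and is marked for a human to examine.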
----------------------------------------------------------------------
Date: Tue, 23 Sep 86 12:54:10 EDT
From: Jim Purtilo <purtilo@brillig.umd.edu>
To: RISKS@csl.sri.com
Subject: Sane sanity checks / risking public discussion

[Regarding ``sanity checks'']

Let us remember that there are sane ``sanity checks'' as well as the
other kind. About 8 years ago while a grad student at an Ohio
university that probably ought to remain unnamed, I learned of the
following follies:

The campus had long been doing class registration and scheduling via
computer, but the registrar insisted on a ``sanity check'' in the form
of hard copy. Once each term, a dozen guys in overalls would spend the
day hauling a room full of paper boxes over from the CS center,
representing a paper copy of each document that had anything to do with
the registration process. [I first took exception to this because their
whole argument in favor of "computerizing" was based on reduced costs,
but I guess that should be hashed out in NET.TREE-EATERS.]

No one in that registrar's office was at all interested in wading
through all that paper. Not even a little bit.

One fine day, the Burroughs people came through with a little upgrade
to the processor used by campus administration. And some "unused status
bits" happened to float the other way.

This was right before the preregistration documents were run, and
dutifully about 12,000 students' preregistration requests were
scheduled and mailed back to them. All of them were signed up
"PASS/FAIL". This was meticulously recorded on all those trees stored
in the back room, but no one wanted to look.

I suppose a moral would be ``if you include sanity checks, make sure a
sane person would be interested in looking at them.''

[Regarding break-ins at Stanford]

A lot of the discussion seems to revolve about ``hey, Brian, you got
what you asked for'' (no matter how kindly it is phrased). Without
making further editorial either way, I'd like to make sure that Brian
is commended for sharing the experience. Sure would be a shame if
``coming clean'' about a bad situation will be viewed as itself
constituting a risk...

[I am delighted to see this comment. Thanks, Brian! PGN]
----------------------------------------------------------------------
----------------------------------------------------------------------
Date: Wed, 1 Oct 86 16:55:14 pdt
From: ladkin@kestrel.ARPA (Peter Ladkin)
To: risks@sri-csl
Subject: A propos landing gear

Alan Marcum's comment on gear overrides in the Arrow reminded me of a
recent incident in my flying club (and his, too).

The Beech Duchess, a light training twin, has an override that
maintains the landing gear *down* while there is weight on the wheels,
ostensibly to prevent the pilot from retracting the gear while on the
ground (this is a problem that Beech has in some of its airplanes,
since they chose to use a non-standard location for gear and flap
switches, encouraging a pilot to mistake one for the other).

Pilots can get into the habit of *retracting* the gear before takeoff,
secure in the knowledge that it will remain down until weight is lifted
off the wheels, whence it will commence retracting. This has the major
advantages that it's one less thing to do during takeoff, allowing more
concentration on flying, and the gear is retracted at the earliest
possible moment, allowing maximum climb performance, which is important
in case an engine fails at this critical stage.

Can anyone guess the disadvantage of this procedure yet?

Our club pilot, on his ATP check ride, with an FAA inspector aboard,
suffered nosewheel collapse on take-off, and dinged the nose and both
props, necessitating an expensive repair and engine rebuild.
Thankfully, all walked away unharmed.

It was a windy day. It is popularly supposed that the premature
retraction technique was used, and a gust of wind near rotation speed
caused the weight to be lifted off the nosewheel. When the plane
settled, the retraction had activated, and the lock had disengaged,
allowing the weight to collapse the nosewheel.

Both pilots assert that the gear switch was in the down position,
contrary to the popular supposition.

All gear systems in the aircraft were functioning normally when tested
after the accident.

The relevance to Risks? The system is simple, and understood in its
entirety by all competent users. The technique of premature retraction
has advantages. It's not clear that a gedankenexperiment could predict
the disadvantage.

Peter Ladkin
----------------------------------------------------------------------
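Editorial sketch, not part of the message above: a toy model of the
failure sequence that the message calls the popular supposition. The
function name `gear_locked_down` and the assumption that the downlock
does not re-engage once released are hypothetical simplifications, not
a description of the actual Duchess gear system.

```python
def gear_locked_down(switch_up, weight_on_wheels_samples):
    """Simulate a weight-on-wheels interlock, one sample per time step.

    The downlock releases the first time the switch is up while there
    is no weight on the wheels.  In this simplified model it does not
    re-engage if weight later returns.
    """
    locked = True
    history = []
    for on_ground in weight_on_wheels_samples:
        if switch_up and not on_ground:
            locked = False
        history.append(locked)
    return history


# Takeoff roll in which a gust briefly unloads the wheel at the third sample.
roll = [True, True, False, True, True]
print(gear_locked_down(True, roll))   # switch selected up before takeoff
print(gear_locked_down(False, roll))  # switch left down until airborne
```

With the switch already up, one momentary unloading is enough to leave
the wheel unlocked when the weight comes back; with the switch down, the
same gust has no effect.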