Collecting Information From People

People Know Less Than They Think

It ain’t what you don’t know that gets you into trouble. It’s what you know for sure that just ain’t so.

— commonly attributed to Mark Twain

When you’re diagnosing an intermittent problem you haven’t seen yourself, you rely on the observations of others. But people’s memories and technical skills vary greatly. An especially tricky situation arises when someone reasonably technical has already diagnosed the problem for you, and the diagnosis is wrong.

Device A is acting up. Joe replaces a switch and Device A works better again, at least for a while. The old switch tests good, but since replacing it fixed the problem, Joe concludes the switch was the problem. When the failure recurs, Joe replaces the switch again, and it gets better. “Man, that thing is sure going through switches,” Joe says as he tosses a second perfectly good switch into the bin. What Joe doesn’t realize is that each time he replaced the switch, he had to remove a control panel, which jiggled all of the wires on the panel and temporarily reconnected a bad crimp joint on a quick-disconnect a foot away. But he’ll swear up and down that the switch was the problem. And thus begins the myth of the machine that eats switches.

Unscrewing the panel, moving it around, replacing the switch, and putting the whole thing back together made the problem go away, but only temporarily. All you actually know is that one of those actions helped. The old switch testing good and the new switch “failing” soon after installation are both red flags. Narrowing it down requires more detailed testing, one variable at a time.

People Know More Than They Think

Tom and Ray Magliozzi of Car Talk fame were great at this: a caller’s car makes a noise. After the have-the-caller-make-the-noise-over-the-phone bit, they’d start asking questions. When does it happen? “It only happens when I go to my sister’s house.” It’s pretty unlikely the part making the noise knows it’s at her sister’s house, so the lazy troubleshooter ignores this information. But the smart troubleshooter, like Tom and Ray, asks more questions. What’s special about the trip to the sister’s house? Is it the only time the car gets driven for longer than 20 minutes? Does the sister live on a crappy road full of potholes? Or is there a really sharp uphill turn into the sister’s driveway that requires accelerating the car at full steering lock, revealing the real problem — a bad CV joint?

The Kernel of Truth

People are better at reporting observations than they are at explaining them. This is the basis of mythology: the volcano erupted because the gods are angry because you didn’t sacrifice enough goats. We now know that volcanic activity isn’t appreciably influenced by killing livestock, but that doesn’t mean the volcano didn’t erupt.

In the case of the load manager, the firefighters driving the engines said that things got worse when they switched on more stuff (lights, sirens, etc.). They figured that the load manager must have been getting overloaded, so it started shedding loads. That’s not a crazy theory — after all, that’s what a load manager does. But the theory was wrong, and it led them down a very expensive path that did nothing to fix the problem.

When I examined things more closely, I found that a) the load manager has no measure of how much load is on its outputs, and b) it wasn’t actually trying to shed any loads. So I knew their explanation was wrong (the fact that the previous repairs hadn’t helped pointed the same way). However, this disproved only their explanation, not their original observation that turning on more switches made things worse.

It turns out that things did get worse when more loads were switched on, but not because of excessive load. Turning on more control switches put more current through the single ground wire that served all of the low-current control switches on that panel, AND that wire had a marginal connection in the loom, AND the load manager input circuit was particularly sensitive to imperfectly grounded control switches.
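To see why a shared ground with a marginal joint behaves this way, here is a rough Ohm's-law sketch. Every number in it (current per switch, joint resistances, the offset the input can tolerate) is made up for illustration and was not measured on the actual vehicle, and the function names are hypothetical. The only point is that the voltage offset at the joint grows with each switch that is turned on.

```python
# Illustrative only: all values below are assumptions, not measurements.
CURRENT_PER_SWITCH_A = 0.02   # assumed current through each closed control switch
GOOD_JOINT_OHMS = 0.05        # assumed resistance of a healthy ground connection
MARGINAL_JOINT_OHMS = 25.0    # assumed resistance of the marginal connection
INPUT_TOLERANCE_V = 0.8       # assumed ground offset the input circuit tolerates


def ground_offset_volts(switches_on, joint_ohms):
    """Voltage dropped across the shared ground joint (V = I * R)."""
    return switches_on * CURRENT_PER_SWITCH_A * joint_ohms


def first_failing_count(joint_ohms, max_switches=8):
    """Smallest number of closed switches whose ground offset exceeds
    the assumed tolerance, or None if it never does."""
    for n in range(1, max_switches + 1):
        if ground_offset_volts(n, joint_ohms) > INPUT_TOLERANCE_V:
            return n
    return None


if __name__ == "__main__":
    for n in range(1, 5):
        offset = ground_offset_volts(n, MARGINAL_JOINT_OHMS)
        print(f"{n} switch(es) on: {offset:.2f} V of ground offset")
```

With a healthy joint, the offset stays negligible no matter how many switches are on. With the marginal joint, a single switch may be fine while a second or third pushes the offset past what the input tolerates, which would look exactly like "it gets worse when we turn on more stuff."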

