Friday, March 4, 2011

Why Cybersecurity Should Focus on Failure

When a computer crashes, our instinct is to reboot, not to question the root cause. But perhaps we should try to understand our failures before trying to forget them. Paul Kocher, president and chief scientist of Cryptography Research, Inc. in San Francisco, thinks that the computer security industry’s understanding of failure is still in its infancy, and that security practitioners today should try to learn from other industries that have greatly improved their risk profiles and consumer trust over the years. The aviation industry is one example.

In the 1940s “there were about ten deaths per one hundred million passenger miles,” he said. That works out to one death for every ten million passenger miles flown. Today, when air travel is much more common, most people have flown at least a million or so air miles. At 1940s fatality rates, most of us would have a 1 in 5 chance of having died in a plane crash. With that track record, the aviation industry might not have survived, let alone become as robust as it is today.
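The arithmetic behind that comparison can be sketched in a few lines. This is only an illustration, not Kocher's own calculation: the function name `lifetime_fatality_risk` is made up here, and the model assumes a constant, independent risk per mile, which the article does not state.

```python
import math

# Rate quoted in the article: ten deaths per one hundred million passenger miles.
DEATHS_PER_PASSENGER_MILE = 10 / 100_000_000


def lifetime_fatality_risk(miles_flown, rate=DEATHS_PER_PASSENGER_MILE):
    """Chance of dying in a crash over miles_flown.

    Assumption (not from the article): the risk per mile is constant and
    independent, so survival decays exponentially with distance flown.
    """
    return 1 - math.exp(-rate * miles_flown)


if __name__ == "__main__":
    for miles in (1_000_000, 2_000_000):
        print(f"{miles:,} miles: about {lifetime_fatality_risk(miles):.1%}")
```

Under these assumptions, one million miles gives a risk of a little under 10 percent, or roughly 1 in 10. A figure near 1 in 5 corresponds to about two million miles, so the exact odds depend on how far "a million or so" stretches.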

Yet we tolerate similar failures and crashes within the computer industry every day.

Kocher said there’s been a thousand-fold improvement in aviation safety over the years because every time a plane crashes, the industry doesn’t say “Oops, that piece of metal broke,” or “Too bad,” or “The pilot made that dumb mistake because they didn’t deal with the engine failure properly.” Instead there’s a formal process that leads to exponential improvement in aviation safety.

Every aviation accident gets investigated, and often there is not one root cause behind it but several. “It’s essentially impossible that one error can bring down an airplane today,” he said, since three, four, or five failures usually compound on each other. With the mandatory use of black boxes, extensive field investigations, and expensive reconstructions, each kind of failure becomes less and less likely to recur.

“In computer security we’re going the other direction,” Kocher said, because the industry doesn’t take a professional, analytic view of failure. Some vendors will spend many months looking for problems that don’t exist. On the other hand, some vendors will only fix the bugs and do no more.

“In [the] aviation industry there’s not an attempt to put gloss around aviation safety to try and convince consumers there’s no possibility of an airplane crash if you carry the magic wand in your hand,” he said. Instead there are individuals and companies that try to gather as much information as possible. They perform a root cause analysis and try to learn as much as they can from each failure.

In computer security, by contrast, Kocher said, if you go to ten practitioners and ask what you should do to solve your particular data security problem, you’ll get ten different answers. One or two of those solutions may work. Eight of the ten may not.

He compared computer security to medicine in the 1820s “when you had snake oil being sold along with some things that worked well but we may not know why they work.” Even when solutions do work, we often don’t know enough about them to explain why. After more than fifty years, we still don’t understand the root causes of computer failure.

Kocher cites Moore’s Law, which states that the number of transistors placed on a chip will double roughly every two years. Because of Moore’s Law, it is inexpensive to install many additional layers of protection. That way, if one piece fails, the others will ensure that the overall security properties are met. Eventually, if you build up enough barriers, “it works but it is not very elegant,” he said. But “it’s like putting thirty layers of concrete bunker around your house, a wooden one, a steel one, etc., and then trying to make them interlock in various ways to keep your teenage daughter from leaving the house at night.”

Kocher said it’s important to understand the underlying motivations as well. Today the computer attacker has more incentive to learn about failures than solution vendors do. The good guys collect their salaries whether or not a given solution works. But the bad guys get paid only if they are successful.

This originally appeared on