September 30, 2011
Moneyball has opened in movie theaters, starring Brad Pitt as the redoubtable Billy Beane, general manager of the Oakland Athletics. Beane had been touted as a future baseball superstar, but washed out after three seasons of bench-warming and bouncing between Triple-A and the majors. He was, however, that rare individual who resisted his own certain and glowing PR, and the perfect foil for the movie’s real subject: baseball statistics, and how numbers could guide a poor team like the A’s to the playoffs more regularly than the conventional wisdom said was possible.
The A’s, as Michael Lewis, author of the book on which the movie is based, describes the team, were populated with the Major Leagues’ castoffs – catchers and pitchers who looked more like beer-swilling bowlers, college prospects the talent scouts had deemed too cold to bother with, and a computer nerd for an assistant GM. But Beane knew from his own experience on the field just how wrong Major League Baseball could be in assessing players. He chose, instead, to exploit a different set of metrics, developed by a passionate and statistically apt fan but shunned by the rest of Beane’s peers. Using this recipe, he cobbled together victory after victory. The book roughly follows the A’s through the 2002 season, from the June amateur draft to the historic September 4th game in which the A’s won their 20th consecutive game – breaking a 102-year-old American League record.
Among the book’s heroes is Bill James. James is the quirky inventor of Sabermetrics, the statistical analysis of baseball data, used to better understand success and failure in the game. James had a restless, numerically inclined mind and, it would be fair to say, an obsessive interest in baseball. The marriage of his aptitude and his interests produced a series of abstracts and books on baseball metrics, beginning in 1977. At first, James was ignored by all but a handful of fans. But, eventually, he gained a following. And in 1997, one of them became the general manager of the Oakland A’s.
If the narrative has a villain, it’s Major League Baseball itself. Wholly in the thrall of its own myths, this billion-dollar enterprise, when it bothered with data at all, used it incorrectly. Rich teams could skim the obvious cream of talent, but beyond buying wins, hadn’t a clue how winning was really done.
Lewis’s point was that even with huge sums of money at stake, those most vested in the outcome can be utterly blinded by their own subjective biases. Once biases are institutionalized, they become facts, stubborn in the face of new or competing or different information. Those who can’t even identify the data that should be informing critical decisions – let alone use it properly – are bound to make expensive mistakes.
So what’s The Safety Record’s point? We saw a number of striking parallels to the way NHTSA operates, when it comes to numbers. The agency is, ostensibly, data-driven and science-based. Hell, it slaps that phrase on everything from PowerPoint presentations to appropriations requests. But there’s plenty of evidence that the agency dismisses outside information out of hand; that long-held beliefs hamper its data analyses; and that it pays far too much attention to data that is not useful, while failing to collect data that is.
Moneyball: For example, here Lewis characterizes James’ assessment of MLB’s metrics:
“Worse, baseball teams didn’t have the sense to know what to collect, and so an awful lot of critical data simply went unrecorded.”
NHTSAball: There’s lots of data NHTSA doesn’t collect. For example, the agency has promulgated a recall system for tires that is predicated solely on the Tire Identification Number. Years of rulemaking were devoted to debating the format of the TIN, popularly known as the DOT code – an 11-character alphanumeric code, molded into the sidewall of a tire, that identifies the manufacturing plant, the tire size, and the date of manufacture. The aim of the TIN was to identify a recalled tire, and it is the only definitive means of doing so. Does NHTSA require tire makers who recall their products to list the range of TINs in the recall population? No, it does not. Some manufacturers report it; others don’t. NHTSA’s good, either way. The DOT code would be a nifty little data point to attach to any tire-related fatality in the Fatality Analysis Reporting System. The agency could use such data to identify tire failure trends, but it remains information uncollected. And while we’re on tires, NHTSA doesn’t require manufacturers to report tire claims if the tire is older than five years – so no data, no problem for the aged-tire issues that took center stage following the Firestone recalls in 2000 and 2001 and remain a constant source of catastrophic crashes.
In a June report on the agency’s recall practices, the GAO criticized NHTSA for not using recall repair-rate data to analyze trends and institute best recall practices:
“Based on our analysis of NHTSA data, without conducting a broader aggregate level analysis to look for outliers, patterns, or trends, the agency may be missing an opportunity to identify underlying factors that affect recall campaign completion rates.” (see GAO Study: NHTSA Needs Improvement)
The DOT’s Office of the Inspector General has twice rapped the Office of Defects Investigation, in 2002 and 2004, for failing to have a systematic defect screening process to “identify high priority cases and to ensure a degree of consistency in the decision making process.”
Moneyball: On the concept of collecting fielding error stats, James concluded, according to Lewis: “…the concept of an error, like many baseball concepts, was tailored to an earlier, very different game… The statistics were not merely inadequate, they lied. And the lies they told led the people who ran major league baseball teams to misjudge their players and mismanage their games.”
NHTSAball: Here we have to bring up one of our enduring themes – NHTSA’s inability to look beyond driver error in Unintended Acceleration complaints. The agency’s intractable position was cemented into place in 1989 with the publication of “An Examination of Sudden Acceleration,” the agency’s contracted report intended to end the debate over what caused these seemingly inexplicable events. Allan Kam, who worked as an enforcement attorney at the agency for 25 years, explains that the study, known within the agency as the Silver Book in reference to the color of its cover, took on an almost religious status within ODI.
“When the Silver Book came out that was viewed by ODI staff as the Bible – it was validation that SUA equals pedal misapplication,” Kam said. “Thereafter, reports of alleged SUA were compartmentalized as pedal misapplication. There was an institutional bias against the credibility of consumers reporting SUA—they must be mistaken; they were rationalizing it because they didn’t want to take responsibility. It was thought to be mechanically impossible. Of course, when that report was prepared there were less complicated electronics on vehicles.”
The agency has ample data and evidence that electronics are implicated in Toyota UA events: the discovery of tin whiskers in accelerator-pedal electronics; reports from Toyota’s own dealers and field technical staff; an unexplained spike in consumer complaints after the introduction of electronic throttle controls in popular Toyota models; and incidents in which pedal misapplication is not plausible and pedal interference did not occur. And yet, it’s the Silver Book, first, last and always.
Moneyball: Lewis writes about the frustration that James and others encountered while trying to get more data from the majors and then the indifference club managers showed to valid performance indicators that didn’t fit their pre-conceived notions:
“The people inside Major League Baseball were, if anything, hostile to the people outside Major League Baseball who wished to study the game.”
NHTSAball: There are many who have attempted to get NHTSA to look at vehicle-related safety problems and few have gotten any further than a polite hour or so in a conference room at headquarters. Safety advocate Janette Fennell has enjoyed an almost unparalleled success in advancing safety issues within NHTSA – and none of her accomplishments are the direct result of such meetings.
Fennell, the survivor of a kidnapping in which she was locked in the trunk of her own car, has been the driving force in compelling NHTSA to institute a rule for an emergency trunk release; to count deaths and injuries to children in vehicle-related incidents that occur off the public roadways, such as power-window strangulations and backovers; and to begin rulemaking for a rearward-visibility standard. Each advance came at the behest of Congress, which passed legislation compelling the agency to collect the information or to promulgate a rule. She recounts some of her earliest attempts to get NHTSA to look at the data she had collected on the number of individuals who found themselves locked in a trunk, either by a criminal act or by happenstance, and on the number of children accidentally backed over by caregivers who couldn’t see them, owing to the vehicle’s poor rearward visibility. NHTSA just didn’t collect vehicle-related injury and death data if the incident didn’t occur on a public roadway, she was informed. There was a rulemaking department, a data-collection department, a human-factors department, a child-passenger-safety department – but officials couldn’t easily categorize the safety issues she brought to their attention.
“There was skepticism,” she recalls. “I was politely seen and listened to, but they didn’t do anything about it, partly because the data fit everywhere and yet didn’t neatly fit anywhere. There were all these built-in prejudices, and Big Government isn’t going to listen to a mom from Kansas,” she says. “It’s a cliché – but it took an act of Congress.”
These are but a few examples. We could go on – about NHTSA’s data battles with the rigorous researchers at the Insurance Institute for Highway Safety; about how Early Warning Reporting data serves as an identifier of failed recalls instead of as an early warning system; about how the agency continues to use warranty data as a statistical benchmark for confirming defect trends, even though that data is time-limited, held by the manufacturers, and not subject to scrutiny for its biases.
In Major League Baseball’s case, it took two outliers – Bill James, who redefined the statistical contours of the game, and Billy Beane, who applied them – to engender a quiet revolution in baseball. As a Wall Street Journal article on the movie points out, statistics aren’t the sum total of the game. Once Beane’s blueprint for cheap wins was revealed and other teams began to use it, the A’s lost their competitive advantage. Nonetheless, the bottom fell out of the old metrics, and better ones took hold. Even if teams aren’t plugging the Runs Created formula into every player’s record, there is no doubt that the majors look at their own data differently.
What will it take to start a statistical revolution at NHTSA? Hard to say.
While there is no shortage of knowledgeable outsiders conducting valid and valuable statistical auto safety research, we’ve yet to see the brave outlier within the agency willing to push against the weight of its institutional intransigence toward unwelcome information until it topples. It is unlikely that a Michael Lewis type will come along to turn NHTSA’s statistical deficiencies into a clever non-fiction book. We definitely do not foresee a Hollywood movie in the offing.
NHTSAball, in fact, has very few spectators. And while there can be enormous sums of money at stake – recalls can cost an automaker plenty – unlike in baseball, bad numbers don’t just lose games. They lose lives.