Weekly Standard misleadingly touted “data-mining” as “crucial” to terror investigations

A Weekly Standard editorial criticized the Bush administration for not hyping “data-mining,” as demonstrated by the National Security Agency's reported data collection program, as “a crucial tool against unknown mass-murderers.” The editorial offered little to justify the claim that “data-mining” is “a crucial tool,” even though some experts question the utility of “data-mining” in terrorism investigations -- specifically the type of “data-mining” in which the NSA is allegedly engaged.

In an editorial in the May 22 edition of The Weekly Standard, Manhattan Institute scholar Heather Mac Donald criticized the Bush administration for not hyping “data-mining” as “a crucial tool against unknown mass-murderers.” Mac Donald was referring to the National Security Agency's (NSA) reported data collection program to cull the telephone records of millions of Americans and analyze them for patterns indicating possible terrorist activity. Mac Donald offered little to justify her claim that “data-mining” is “a crucial tool”; indeed, some experts question the utility of “data-mining” in terrorism investigations -- specifically the type of “data-mining” in which the NSA may be engaged.

Mac Donald also repeated the claim that the NSA is only tracking phone calls and not listening to them, without noting that the two programs are reportedly linked: the call-records analysis “helps the NSA choose its targets for listening.”

From Mac Donald's editorial in the May 22 Weekly Standard:

Ever since allowing the Pentagon's Total Information Awareness project to go down the tubes in 2003, the administration has failed to explain the potential of data mining, even as it secretly continues to use this vital technology. Thus, at every revelation of a government data mining program, privacy extremists enjoy unchallenged supremacy in characterizing the technology as a massive threat to life as we know it.

[...]

The Washington Post calls this numbers analysis the “most extensive ... domestic surveillance [program] yet known involving ordinary citizens and residents.” Bunk. The NSA's data mining program is not surveillance; no one is being listened to or observed.

Data mining looks for mathematical patterns in computerized information; it is not a real-time spying operation. The government didn't need to go to the Foreign Intelligence Surveillance Court for a wiretap or pen register order (which governs the collection of phone numbers in real time from a single phone) because it is not listening to or recording any individual's calls. FISA is built around the notion of an individualized investigation of specific spies or terrorists; it is seriously outdated for the application of American computer know-how to ferret out terror plots before they happen and before the government has individual suspects in mind.

But it may be too late to convey these truths. The time to explain how data mining protects privacy while providing a crucial tool against unknown mass-murderers was while the Pentagon's Total Information Awareness program was under attack. That program, which hoped to uncover patterns of terrorist activity in publicly available commercial data, was merely in its preliminary research stages, but the Senate killed it in a demagogic display of privacy hysteria.

Despite Mac Donald's affirmative characterization of data mining as “a crucial tool,” the merits of data mining in terror investigations have been questioned. For example, Newsweek technology columnist Steven Levy noted in his column for the May 22 edition that the utility of data mining is, as yet, undetermined: “While the practice works wonders in detecting credit-card fraud and targeting direct-marketing prospects, it's yet to be proved that the techniques of data mining can zoom in on terrorist behavior from billions of phone records.” Levy continued:

The bane of such schemes is the false positive, when a suspicious pattern sounds an alarm bell -- but turns out to be benign. This can be a real problem when looking for the evasive trail of a terrorist, as opposed to the relatively common and recognizable footprints of a credit-card thief. “When a disease is rare, even an accurate medical test has an excessive failure rate,” says Bruce Schneier, chief technical officer of the Counterpane security firm. “[Data mining for terrorists] is a huge waste of money for very little return.”
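The arithmetic behind Schneier's rare-disease comparison can be sketched in a few lines. Every figure below is a hypothetical chosen only for illustration (the population size, the number of actual targets, and the accuracy rates); none comes from Levy's column or from any reporting on the NSA program.

```python
def screening_outcomes(population, actual_targets, sensitivity, false_positive_rate):
    """Return (true positives, false positives, precision) for a screening test.

    All inputs are illustrative assumptions, not reported figures.
    """
    innocents = population - actual_targets
    true_positives = actual_targets * sensitivity        # real targets correctly flagged
    false_positives = innocents * false_positive_rate    # innocent people wrongly flagged
    precision = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, precision


# Hypothetical scenario: 300 million people, 1,000 real targets,
# and a test that is right 99 percent of the time in both directions.
tp, fp, precision = screening_outcomes(
    population=300_000_000,
    actual_targets=1_000,
    sensitivity=0.99,
    false_positive_rate=0.01,
)

print(f"Real targets flagged:    {tp:,.0f}")
print(f"Innocent people flagged: {fp:,.0f}")
print(f"Share of alarms that point to a real target: {precision:.4%}")
```

Under these assumed numbers, roughly 3 million innocent people would be flagged alongside fewer than 1,000 real targets, so well under 1 alarm in 1,000 would point to an actual threat, even though the hypothetical test is "99 percent accurate."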

Moreover, though it is unclear exactly how the NSA analyzes the phone records it collects, Levy noted that Valdis Krebs, a “social network analysis” expert, indicated that certain analysis methods are more effective than others. Krebs further indicated that the sheer volume of data the NSA has reportedly collected suggests that the agency employs an unwieldy and ineffective method and has unnecessarily collected vast amounts of useless data. According to Levy:

The NSA's historic request for the nation's phone logs signals a desire to perform massive “traffic analysis” of calls within the U.S. -- an examination of who calls whom, when they call and for how long -- to identify potential threats. This in turn is expected to be used for the kind of analysis that Krebs performed. But Krebs says you don't need the indiscriminate volume of phone records requested by NSA in order to perform effective social network analysis. The best way to snare the bad guys is to “go bottom up,” he says, beginning with the bad guys, charting only the people in their circles and investigating from there.

Of course it's possible that the NSA will only tread within those narrow boundaries: but if that's so, why would our spooks need everyone's records? By asking only for the calls of suspected terrorists and their contacts, the agency could avoid the painful (and possibly illegal) tradeoff of handing over the telephonic fingerprints of millions of innocent Americans who never get within spitting distance of a malfeasant.
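The “bottom up” approach Krebs describes can be illustrated as a breadth-first walk outward from known suspects. This is only a minimal sketch on made-up data: the function name `contacts_within`, the record format, and the two-hop limit are all assumptions for illustration, and nothing here reflects how Krebs or the NSA actually performs such analysis.

```python
from collections import defaultdict, deque


def contacts_within(call_records, seeds, max_hops):
    """Map each person reachable from the seeds to their hop distance.

    call_records: iterable of (caller, callee) pairs (hypothetical format).
    seeds: known suspects to start from.
    max_hops: how many links away from a seed to follow.
    """
    # Build an undirected "who called whom" graph.
    graph = defaultdict(set)
    for caller, callee in call_records:
        graph[caller].add(callee)
        graph[callee].add(caller)

    hops = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph[person]:
            if neighbor not in hops:
                hops[neighbor] = hops[person] + 1
                queue.append(neighbor)
    return hops


# Toy data: "x" and "y" call each other but have no link to the suspect.
records = [("suspect", "a"), ("a", "b"), ("b", "c"), ("x", "y")]
result = contacts_within(records, ["suspect"], max_hops=2)
print(result)
```

In this toy example only the suspect and the people within two calls of the suspect are ever examined; the unconnected pair is never touched, which is the contrast with sifting everyone's records that Levy draws.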

Levy did note that Robert Popp, the second-ranking official in the Pentagon's now-defunct Total Information Awareness (also known as Terrorism Information Awareness) project to which Mac Donald referred, defends data mining as a tool for terror investigations -- indicating that there is, at best, disagreement among experts regarding the efficacy of data mining. According to Levy:

Robert Popp, who was second ranking official of the Defense Department's discontinued Terrorist Information Awareness office, says false positives can be mitigated by a “multistage analysis” involving other sources of information. If an analysis of phone records unearths a daunting proliferation of leads, those could be matched up with other databases, like credit-card records or Internet activity. “With lots of data, you could then corroborate and accrue sufficient evidence,” says Popp. If that technique works, the leads could be winnowed down before the FBI is sent out to knock on doors.

A May 15 Christian Science Monitor article also cited Krebs and “other experts” voicing skepticism about a “top-down” analysis method:

What Mr. Krebs, a Cleveland-based expert in social network analysis, did in a small way was what he and others say US agencies are trying to do on a much larger scale -- by piling up mountains of data to sift for unseen patterns that reveal hidden terrorist networks.

“This can be a very powerful tool,” he says, if the technique is used to “drill down” into known terrorists for their associates and those they call and connect with.

But Krebs and other experts are doubtful about the opposite method, a “top down” approach in which the bulk of data on innocent Americans is sifted for terrorists.