Today EFF issued a report about the Investigative Data Warehouse, a gigantic billion-document storehouse of information maintained by the FBI. In addition, EFF wrote to Senate Judiciary Committee Chairman Patrick Leahy and House Judiciary Committee Chairman John Conyers, asking Congress to examine the IDW.
In August 2006, EFF sought documents about the IDW under the Freedom of Information Act. Three years of litigation later, the FBI has said that no more information will be forthcoming, despite the Obama Administration's new policies on open government. This report is based upon the documents provided in the litigation, as well as public information about the huge data warehouse.
The IDW contains at least 53 datasets, and includes more than four times as many unique documents as the Library of Congress. The report lists 38 of these datasets, which encompass information not only about suspects, but about all individuals referenced in FBI investigations. In addition, the report discusses the systems architecture and technical features of the IDW.
While the FBI has refused to publicly release any Privacy Impact Assessments about the IDW, the report discusses the FBI's efforts to avoid “raising congressional consciousness levels and expectations” about PIAs, and to give a “sense that we really do worry about the privacy interests of uninvolved people whose data we slurp up.”
Finally, the report discusses the future of the IDW. Moving forward, the FBI has asked for millions of dollars to increase its use of the IDW for “link analysis” (looking for links between suspects and other people – i.e., the Kevin Bacon game) and to start “pattern analysis” (defining a “predictive pattern of behavior” and searching for that pattern in the IDW’s datasets before any criminal offense is committed – i.e., pre-crime).