[div class="spacerHeadline"][span style="font-weight: bold;" class="headline"]US plans massive data sweep[/span]
[/div][div class="spacer14"][span class="subhead"]
Little-known data-collection system could troll news, blogs, even e-mails. Will it go too far?
[/span][/div][div class="spacer21"][span style="font-style: italic;" class="byline"]By [a href="vny!://www.csmonitor.com/cgi-bin/encryptmail.pl?ID=CDE1F2EBA0C3ECE1F9F4EFEE&url=/2006/0209/p01s02-uspo.html"]Mark Clayton[/a] [/span][span style="font-style: italic;" class="staffline"]| Staff writer of The Christian Science Monitor[/span]
[/div][!-- Begin Body Text --][span class="text"]
The US government is developing a massive computer system that can collect huge amounts of data and, by linking far-flung information from blogs and e-mail to government records and intelligence reports, search for patterns of terrorist activity.[!-- --]The system - parts of which are operational, parts of which are still under development - is already credited with helping to foil some plots. It is the federal government's latest attempt to use broad data-collection and powerful analysis in the fight against terrorism. But by delving deeply into the digital minutiae of American life, the program is also raising concerns that the government is intruding too deeply into citizens' privacy.[/p][span class="text"]"We don't realize that, as we live our lives and make little choices, like buying groceries, buying on Amazon, Googling, we're leaving traces everywhere," says Lee Tien, a staff attorney with the Electronic Frontier Foundation. "We have an attitude that no one will connect all those dots. But these programs are about connecting those dots - analyzing and aggregating them - in a way that we haven't thought about. It's one of the underlying fundamental issues we have yet to come to grips with."[/p]The core of this effort is a little-known system called Analysis, Dissemination, Visualization, Insight, and Semantic Enhancement (ADVISE). Only a few public documents mention it. ADVISE is a research and development program within the Department of Homeland Security (DHS), part of its three-year-old "Threat and Vulnerability, Testing and Assessment" portfolio. The TVTA received nearly $50 million in federal funding this year.[/p]DHS officials are circumspect when talking about ADVISE. "I've heard of it," says Peter Sand, director of privacy technology. "I don't know the actual status right now. 
But if it's a system that's been discussed, then it's something we're involved in at some level."[/p][p style="font-weight: bold;"][span class="divvy"]Data-mining is a key technology[/span][/p]A major part of ADVISE involves data-mining - or "dataveillance," as some call it. It means sifting through data to look for patterns. If a supermarket finds that customers who buy cider also tend to buy fresh-baked bread, it might group the two together. To prevent fraud, credit-card issuers use data-mining to look for patterns of suspicious activity.[/p]What sets ADVISE apart is its scope. It would collect a vast array of corporate and public online information - from financial records to CNN news stories - and cross-reference it against US intelligence and law-enforcement records. The system would then store it as "entities" - linked data about people, places, things, organizations, and events, according to a report summarizing a 2004 DHS conference in Alexandria, Va. The storage requirements alone are huge - enough to retain information about 1 quadrillion entities, the report estimated. If each entity were a penny, they would collectively form a cube a half-mile high - roughly double the height of the Empire State Building.[/p]But ADVISE and related DHS technologies aim to do much more, according to Joseph Kielman, manager of the TVTA portfolio. The key is not merely to identify terrorists, or sift for key words, but to identify critical patterns in data that illumine their motives and intentions, he wrote in a presentation at a November conference in Richland, Wash.[/p]For example: Is a burst of Internet traffic between a few people the plotting of terrorists, or just bloggers arguing? ADVISE algorithms would try to determine that before flagging the data pattern for a human analyst's review.[/p]At least a few pieces of ADVISE are already operational. Consider Starlight, which along with other "visualization" software tools can give human analysts a graphical view of data. 
Viewing data in this way could reveal patterns not obvious in text or number form. Understanding the relationships among people, organizations, places, and things - using social-behavior analysis and other techniques - is essential to going beyond mere data-mining to comprehensive "knowledge discovery in databases," Dr. Kielman wrote in his November report. He declined to be interviewed for this article.[/p][p style="font-weight: bold;"][span class="divvy"]One data program has foiled terrorists[/span][/p]Starlight has already helped foil some terror plots, says Jim Thomas, one of its developers and director of the government's new National Visualization and Analytics Center in Richland, Wash. He can't elaborate because the cases are classified, he adds. But "there's no question that the technology we've invented here at the lab has been used to protect our freedoms - and that's pretty cool."[/p]As envisioned, ADVISE and its analytical tools would be used by other agencies to look for terrorists. "All federal, state, local and private-sector security entities will be able to share and collaborate in real time with distributed data warehouses that will provide full support for analysis and action" for the ADVISE system, says the 2004 workshop report.[/p][/span][/p][/span][p style="font-weight: bold;"][span class="divvy"]A program in the shadows[/span][/p]Yet the scope of ADVISE - its stage of development, cost, and most other details - is so obscure that critics say it poses a major privacy challenge.[/p]"We just don't know enough about this technology, how it works, or what it is used for," says Marcia Hofmann of the Electronic Privacy Information Center in Washington. "It matters to a lot of people that these programs and software exist. We don't really know to what extent the government is mining personal data."[/p]Even congressmen with direct oversight of DHS, who favor data-mining, say they don't know enough about the program.[/p]"I am not fully briefed on ADVISE," wrote Rep. 
Curt Weldon (R) of Pennsylvania, vice chairman of the House Homeland Security Committee, in an e-mail. "I'll get briefed this week."[/p]Privacy concerns have torpedoed federal data-mining efforts in the past. In 2002, news reports revealed that the Defense Department was working on Total Information Awareness, a project aimed at collecting and sifting vast amounts of personal and government data for clues to terrorism. An uproar caused Congress to cancel the TIA program a year later.[/p][p style="font-weight: bold;"][span class="divvy"]Echoes of a past controversial plan[/span][/p]ADVISE "looks very much like TIA," Mr. Tien of the Electronic Frontier Foundation writes in an e-mail. "There's the same emphasis on broad collection and pattern analysis."[/p]But Mr. Sand, the DHS official, emphasizes that privacy protection would be built in. "Before a system leaves the department there's been a privacy review.... That's our focus."[/p]Some computer scientists support the concepts behind ADVISE.[/p]"This sort of technology does protect against a real threat," says Jeffrey Ullman, professor emeritus of computer science at Stanford University. "If a computer suspects me of being a terrorist, but just says maybe an analyst should look at it ... well, that's no big deal. This is the type of thing we need to be willing to do, to give up a certain amount of privacy."[/p]Others are less sure.[/p]"It isn't a bad idea, but you have to do it in a way that demonstrates its utility - and with provable privacy protection," says Latanya Sweeney, founder of the Data Privacy Laboratory at Carnegie Mellon University. But since speaking on privacy at the 2004 DHS workshop, she has come to doubt that the department is building privacy into ADVISE. "At this point, ADVISE has no funding for privacy technology."[/p]She cites a recent request for proposal by the Office of Naval Research on behalf of DHS. 
Although it doesn't mention ADVISE by name, the proposal outlines data-technology research that meshes closely with technology cited in ADVISE documents.[/p]Neither the proposal - nor any other she has seen - provides any funding for provable privacy technology, she adds.[/p][table style="border: 1px solid rgb(102, 102, 102);" border="0" cellpadding="10" cellspacing="0" width="400"][tbody][tr style="background: rgb(204, 204, 187) none repeat scroll 0% 50%; mozbackground-clip: mozinitial; mozbackground-origin: mozinitial; mozbackground-inline-policy: mozinitial;"] [td class="divvy"]Some in Congress push for more oversight of federal data-mining[/td][/tr][!-- INSERT ODD AND EVEN ROW SNIPPETS HERE --][!-- ODD ROW --][tr style="background: rgb(238, 238, 221) none repeat scroll 0% 50%; mozbackground-clip: mozinitial; mozbackground-origin: mozinitial; mozbackground-inline-policy: mozinitial;"][td class="head4"]Amid the furor over electronic eavesdropping by the National Security Agency, Congress may be poised to expand its scrutiny of government efforts to "mine" public data for hints of terrorist activity.[/p]"One element of the NSA's domestic spying program that has gotten too little attention is the government's reportedly widespread use of data-mining technology to analyze the communications of ordinary Americans," said Sen. Russell Feingold (D) of Wisconsin in a Jan. 23 statement.[/p]Senator Feingold is among a handful of congressmen who have in the past sponsored legislation - unsuccessfully - to require federal agencies to report on data-mining programs and how they maintain privacy.[/p]Without oversight and accountability, critics say, even well-intentioned counterterrorism programs could experience mission creep, having their purview expanded to include non-terrorists - or even political opponents or groups. 
"The development of this type of data-mining technology has serious implications for the future of personal privacy," says Steven Aftergood of the Federation of American Scientists.[/p]Even congressional supporters of data-mining want more information about such efforts.[/p]"There has to be more and better congressional oversight," says Rep. Curt Weldon (R) of Pennsylvania, vice chairman of the House committee overseeing the Department of Homeland Security. "But there can't be oversight till Congress understands what data-mining is. There needs to be a broad look at this because they [intelligence agencies] are obviously seeing the value of this."[/p]Data-mining - the systematic, often automated gleaning of insights from databases - is seen "increasingly as a useful tool" to help detect terrorist threats, the Government Accountability Office reported in 2004. Of the nearly 200 federal data-mining efforts the GAO counted, at least 14 were acknowledged to focus on counterterrorism.[/p]While privacy laws do place some restrictions on government use of private data - such as medical records - they don't prevent intelligence agencies from buying information from commercial data collectors. Congress has done little so far to regulate the practice or even require basic notification from agencies, privacy experts say.[/p]Indeed, even data that look anonymous aren't necessarily so. For example: With name and Social Security number stripped from their files, 87 percent of Americans can be identified simply by knowing their date of birth, gender, and five-digit Zip code, according to research by Latanya Sweeney, a data-privacy researcher at Carnegie Mellon University.[/p]In a separate 2004 report to Congress, the GAO cited eight issues that need to be addressed to provide adequate privacy barriers amid federal data-mining. Top among them was establishing oversight boards for such programs.[/p][/td][/tr][/tbody][/table][span class="text"][span class="text"][/span][/p] [/span]