James Pustejovsky

November 8, 2016

By Lawrence Goodman | BrandeisNOW

Apple has Siri. Google has Google Assistant.

Brandeis computer scientist James Pustejovsky wants to ensure the public — and especially academic researchers — have access to an even more powerful language recognition system.

He and his colleagues at Vassar College, Carnegie Mellon University and the Linguistic Data Consortium at the University of Pennsylvania have developed a platform (the LAPPS Grid) that seamlessly connects open-source computer programs to quickly analyze huge amounts of language from diverse sources and genres. The programs identify the words, work out their overall meaning and, finally, help uncover hidden relationships embedded in the data.

With a two-year, $390,000 grant from the Andrew W. Mellon Foundation, he and colleagues from around the world will dramatically extend the range of this platform by linking it to a similarly broad and extensive one known as the European Common Language Resources and Technology Infrastructure (CLARIN). "I can't talk to their stuff, they can't talk to our stuff," says Pustejovsky. "We're trying to bring these two big services together."

While this sounds rather technical, the consequences will be transformative and far-reaching, Pustejovsky says. "It will effectively create an 'Internet of language applications' for the everyday computer user," he says.

A historian researching the Vietnam War, for example, could feed the platform audio from every TV newscast of the 1960s, all the text from newspaper articles and reams of Defense Department documents. She could then look for connections between the way the media covered the war and how the government reacted.

Authorized health researchers could sift through medical records and recorded interviews from millions of people to determine how patients responded to various treatments.

"We're going to give every scholar access to a toolkit that's now only available to the largest corporations," says Pustejovsky, the TJX/Feldberg Chair of Computer Science. No programming experience will be necessary. "We don't want users to have to know the details of the computer programs," Pustejovsky says.

The other lead investigators on the Mellon grant are Nancy Ide (Vassar College), Erhard Hinrichs (University of Tuebingen) and Jan Hajic (Charles University in Prague).