Research in Data Science at Brandeis
Faculty members at Brandeis conducting innovative research using data science techniques include:
Benjamin Rogers (Physics) does research in soft matter physics and biological physics. His research models gene sequencing. The research demands large amounts of data captured using microscopy. He and his team generate roughly 20 TB of data annually. They need both data archiving capabilities and expert advice on problems related to data storage and management. Hongfu Liu (Computer Science) and Pengyu Hong (Computer Science) will collaborate with Rogers to develop novel machine learning techniques to analyze the data on the GPU cluster supported by an NSF MRI grant to Hong and Mike Hagan (Physics / Biophysics).
Jonathan Touboul (Mathematics) works on the theoretical modeling of the brain and brain processes. He also works on a predictive justice project with a Canadian research team using Human-in-the-Loop (HITL) techniques to extract text data from court documents to train a computational learning model to predict outcomes in compensation cases. The next step in the research is to train the machine to parse text data to produce the assessments.
Jytte Klausen (Politics) uses methods parallel to those used by Jonathan Touboul (Mathematics) to develop a computational learning model capable of detecting increasing radicalization leading to terrorist violence. Her research is funded by the NIJ/US DOJ and is conducted in collaboration with computer scientists at Colorado State University.
Amber Spry (Politics) conducted a poll (~3,000 respondents) to test her hypothesis that subsets of treatment groups might have their own unique response patterns. To test the possibility of project-oriented experiential learning across social science and computer science, Pengyu Hong (Computer Science) framed the analysis of the poll data as a machine learning assignment in his Artificial Intelligence class.
Charles Golden (Anthropology) is working with a very large volume of geographic sensor data in research supported by the NSF. One of the tasks is to identify man-made ancient artifacts, which is time-consuming and labor-intensive and is mainly done manually by students. Machine learning techniques can be developed to speed up his research by filtering out most negative areas and reducing the false positives produced by human annotators.
Other Examples
Other potential cross-disciplinary, project-oriented experiential learning cases can be designed across the social sciences/humanities and computer science/mathematics. For example, Alex Kaye (Near Eastern and Judaic Studies), Alejandro Trelles (Politics), Dorothy Kim (English) and Zachary Albert (Politics) are all working on text analysis and social/geographic network analysis, which can prompt faculty in Computer Science and Mathematics to identify the shared computational problems and develop novel solutions.
Computer Science is expanding its collaborations with other disciplines in big-data analysis. For example, Pengyu Hong (Computer Science) is working with Seth Fraden (Physics) and Mike Hagan (Physics / Biophysics) to develop machine learning techniques for discovering the physics governing soft materials. The NSF Materials Research Science and Engineering Center funds the research. Olga Papaemmanouil (Computer Science) is collaborating with Steve Van Hooser (Biology) on a project to design data interfaces for neuroscience applications. The NIH funds the research. Pengyu Hong's (Computer Science) research on machine learning in chemistry/biochemistry is currently supported by two NIH grants.