Data Science Internal Internship Lab
The Data Science Internal Internship (DSII) in Higher Education, launched by Brandeis CS faculty and researchers in 2021, began as a pilot linking undergraduate students with university administrators to explore data scientific/AI solutions for improving operations. This initial work evolved into the DSII Lab, an independent study course that takes a research-oriented approach: students investigate specific applications of data science and AI within higher education contexts—from enrollment management to campus resource optimization—and design experiments using real or synthetic data to explore timely emerging technologies. The DSII Lab treats higher education as a rich research environment, enabling students to conduct formal inquiry at the intersection of data science/AI innovation and the complex operational realities of universities. Articles and research papers describing this program, as well as student senior theses and research papers produced through the Lab, are linked below.
Research Projects in DSII Lab on Data Science/AI Application in Higher Ed
- “Minimizing Data Exposure in Higher Education LLM Applications: Evaluating the Model Context Protocol (MCP) for Preserving Privacy in Academic Advising,” by Bill Dong ’26.
This study addresses privacy concerns in AI-powered student advising by evaluating a Model Context Protocol (MCP) implementation. Using synthetic student data, I compared a per-field tool design, which retrieves individual data fields on demand, against traditional full-record prompting approaches. The per-field design reduced sensitive data exposure by approximately two-thirds, though it yielded lower-quality responses for advising tasks requiring comprehensive student information.
- “Enabling and Evaluating LLMs Specific to a Higher Ed Environment: High Data Privacy Requirements Combined with Low Computational Resources,” by Gabriel Abreu ’25.
Beginning with the application of LLMs to supporting the university investment office, my research concluded with the design and evaluation of AI techniques for information retrieval in environments of low computational resources and high privacy requirements.
- “Servers, Lobsters, and Students: Analyzing and Predicting Brandeis Buildings' Electricity Usage,” by Maya Levisohn ’23.
My research establishes that machine learning, specifically random forest regression, can be used to accurately predict electricity usage for a university campus. On a larger scale, my project demonstrates the ways in which data analytics can be employed in Campus Operations.
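The contrast between per-field retrieval and full-record prompting in the first project above can be pictured with a small sketch. This is only an illustration under stated assumptions: the record, the field names, and the helper functions are hypothetical, it does not use the MCP SDK, and the ratio it prints is not a result from the study.

```python
# Hypothetical synthetic record; field names are illustrative assumptions only.
SYNTHETIC_STUDENT = {
    "name": "Student A",
    "major": "Computer Science",
    "gpa": 3.4,
    "credits_earned": 72,
    "financial_hold": False,
}


def full_record_context(record):
    """Full-record prompting: every field is placed in the model's context."""
    return dict(record)


def per_field_context(record, requested_fields):
    """Per-field design: only the fields explicitly requested are returned."""
    return {name: record[name] for name in requested_fields if name in record}


def exposure_ratio(context, record):
    """Fraction of the record's fields that reached the model."""
    return len(context) / len(record)


full = full_record_context(SYNTHETIC_STUDENT)
partial = per_field_context(SYNTHETIC_STUDENT, ["major", "credits_earned"])

print(exposure_ratio(full, SYNTHETIC_STUDENT))     # every field exposed
print(exposure_ratio(partial, SYNTHETIC_STUDENT))  # only the two requested fields
```

A question such as "how many credits does this student still need?" touches only two fields here, so the per-field path exposes less. A task that needs the whole record gains nothing from it, which is consistent with the quality trade-off the study reports.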
Collaboration with "Data Science Day" 2025
Good morning. On behalf of the President and Provost, it is my honor to welcome you to Brandeis University for the 2025 Data Science Day. My name is Jeffrey Shoulson and I serve as Vice Provost for Undergraduate Affairs and Dean of Undergraduate Studies. I am delighted that we are hosting this event today. It offers us the opportunity to admire some of the very exciting work our own students are doing here at Brandeis, at both the graduate and undergraduate level. And the projects in which our students are involved align directly with the bold new direction we are taking, combining the theoretical study that is the hallmark of a liberal arts education with the practical applications that are made possible—and so innovative—by that academic work. Under the leadership and vision of Dr. Jessica Liebowitz, we have seen first-hand how students can contribute meaningfully to their own institution, helping it to be more effective and efficient, in projects that they have undertaken in partnership with a variety of offices and administrative units here at Brandeis. The experience our students gain—and the confidence it gives them—are invaluable in preparing them for their future career paths. And the model of experiential learning these projects follow is one we know can be duplicated in so many other areas of study. Please enjoy the day.
Keynote Address: Jin Zhao, PhD ’26, Computer Science
Title: "When Data Speaks: Using AI to Unlock Meaning from Large Text Collections" (Summary)
This talk explores how modern AI enables data scientists to move beyond counting words to understanding meaning in large text collections. Traditional text analysis struggles with framing, narrative, and causal interpretation—cases where the same event is described differently yet remains factually true.
From an NLP perspective, the talk presents a four-stage pipeline: AI-assisted exploratory analysis, event extraction, cross-document event coreference with framing awareness, and causal relation extraction. Using fine-tuned language models, retrieval-augmented generation, and reasoning-based approaches, the pipeline converts unstructured text into structured, interpretable event and causal data.
The broader message is a shift in data science from frequency to semantics, from prediction to explanation, and from opaque models to transparent systems that reveal how meaning and narratives are constructed in text.
Poster Sessions
1) Gabriel Abreu '25, DSII Lab Alumnus: "Employing LLMs in Higher Education."
2) Bill Dong '26, DSII Lab Alumnus: "Building an Interactive Dashboard for Student Wellness Analytics."
3) Jasmine Huang '26, DSII Lab Alumnus: "STEM Pathways: From Data Insights to Higher Ed Protocols."
4) Rowan Scassellati '26, DSII Lab Alumnus: "Customizable and Intuitive Visualizations for Complex Data: The Case of Brandeis University's Hiatt Career Center."
5) Vedanshi Shah MS '26: "Career Compass: Visual Analytics for AI-Powered Career Preparation."
6) Thomas Breimer, PhD Candidate: "A Topological Data Analysis Based Feature Engineering for Automatic Target Recognition User Interface (tda2tru)."
Acknowledgements
We would like to extend our sincere gratitude to the faculty and administrators from the Brandeis Computer Science Department who made this collaboration possible: Dylan Cashman, Antonella Di Lillo, Anne Gudaitis, Tim Hickey, Jessica Liebowitz, and Nianwen Xue.
Project Partnerships Between DSII Interns and Administrative Mentors
The partnerships described below, conducted from January 2021 to January 2025, generated many of the research questions now being explored through the DSII Lab's formal inquiry approach.
Campus Planning and Operations
- Improving response to daily building alarms: DSII Intern built machine-readable dataset of alarms from all campus buildings, then created analytics to pinpoint faulty alarms, missing sensors, and disproportionately high percentages associated with specific buildings. These analytics help increase the capacity of administrators to rely on data-informed decision-making to allocate scarce staff resources to response and repair.
- Enabling geographic view of utilities infrastructure: DSII Intern designed the schema necessary to create a spatial data platform for viewing all buildings, infrastructure, and utilities components on a campus map. The prototype geographical information system includes precise details of sewer pipes, manhole covers, and electric submeters, all aligned with location-specific measures of deferred maintenance, enabling administrators to more effectively evaluate potential investments (“does it make sense to install a bathroom on this spot in this field?”) and troubleshoot building failures (“which water pipe might be responsible for that manhole spewing so much steam?”).
- Measuring and predicting electricity use: DSII Interns engineered the integration of data from multiple current and historical sources to create a machine-readable dataset of electricity usage by campus building. Interns designed and built a dashboard that allows administrators to analyze usage levels and patterns for each building across seasons (winter, spring, summer, fall) and functions (residence hall, science building, gym, etc.). Interns also began building predictive tools to better estimate energy usage by building and to highlight relevant anomalies deserving of strategic attention.
- Transportation Logistics: DSII Intern created the data scientific platform, including regular feeds of machine-readable data, and user-friendly dashboards, to analyze and visualize ridership data of university campus shuttle service. This data visualization allows administration to identify what day and time vehicles are reaching maximum capacity, as well as identify underutilized stops and times. This work also generates general trends such as ridership by day, time, and stop as well as highlighting vendor on-time/early/late arrival performance by stop. This is transforming the capacity of administrators to more effectively and efficiently align campus transportation resources to community needs.
- Intern is transforming into machine-readable format all data related to student participation in career services activities, which will enable interactive dashboards, data visualizations, and analyses of trends across student age groups and across career service office offerings. This will support the division as it aims to increase data-informed decision-making on design and delivery of program offerings that provide the most relevant career services resources, in a targeted way, to the student population.
- Analyzing trends in course enrollments and instructor workloads: DSII Interns engineered and integrated datasets to make it possible to build dashboards of historical and current data on undergraduate course enrollments and instructor workloads. The dashboards now make it possible to disaggregate the data by academic department, instructor, undergraduate major, and individual course. Experimental analytics by Interns based on this new data platform demonstrate the possibility of predicting enrollments in required and high traffic courses, a functionality that, once formally implemented, enables administrators to more effectively plan and allocate academic resources.
- Intern is identifying multiple streams of data reconciliation that feed into the university’s budget-making, with the aim of streamlining and automating the university-wide process, and documenting the upgrades for all current and future users.
- Analyzing trends in course enrollments for 100+ level courses across the university’s graduate programs: DSII Intern will leverage the data engineering and dashboard design expertise of Interns working in the Dean of Arts and Sciences to build out a dashboard of course enrollment trends and analytics specific to graduate school courses. As with other DSII projects that introduce dashboard functionality to existing work-flows, this will require integration of multiple data sets from various sources. Accomplishment of the goals of this project will bring dramatically increased efficiencies to administrators responsible for assessing and allocating academic resources at the graduate level.
- Launching a Human Capital Management dashboard of Organizational metrics: DSII Interns began by assembling a prototype HR dashboard, then integrating the data streams and analytics necessary to create metrics showing turnover rates, time-to-fill for open positions, degree of diversity in recruitment pools, and completion rates for performance reviews. Interns also helped collect feedback from a pilot group of users across the university to improve the design and usability of the dashboard. Interns are now creating predictive analytic tools that, once formally implemented through the HR dashboard, will help senior leaders foresee and manage workforce related challenges and opportunities. In addition, current Intern developed a user-friendly encoder for HR admins to use when needing to anonymize sensitive HR data.
- Building predictive models of charitable giving and of bequests that are customized to Brandeis alumni, donors, and friends: DSII Interns brought data scientific analyses to an entire corpus of Brandeis alumni and fundraising data, including current and historical data dating back to the founding of the university in 1948, enabling fundraising admins to identify key characteristics of those most likely to make major gifts or to leave bequests to the university. Interns are relying on these initial analyses to develop tools for data-informed guidance to administrators making resource allocation decisions regarding the cultivation of potential donors.
- Increasing efficiencies in information retrieval to strengthen donor relations and development of fundraising strategy: Intern is deploying techniques from machine learning and computational linguistics to dramatically increase the speed and accuracy of searching through 400,000 contact reports of fundraising meetings with past, current, and future donors, with the aim of deepening the capacity of the Institutional Advancement office to nurture donor relations and more effectively align fundraising strategy with the donor base.
- Data Quality Analysis: Intern is designing algorithms to identify and resolve issues in data quality, including missing, incomplete, inaccurate, or duplicated data, across all university data sets. Intern will also help design new protocols and monitoring systems for improving data quality in new data entry going forward.
- Expanding data analytic client services capacity: DSII Interns develop trustworthy and reliable expertise in report generation through Workday (the university’s new cloud-based enterprise management system) which helps administrators broaden data scientific support across the university. For example, DSII Interns in ITS were instrumental in helping to roll out the Human Capital Management dashboard.
- Experimenting with potential improvements to Workday implementation: DSII Interns in ITS help specify potential solutions to data-management challenges that emerge in the Workday implementation process. This has included identifying opportunities for improvement in the Workday Student module, with technical descriptions of functionalities that need to be activated in order to better serve the Brandeis student community.
- Speeding up deployment of data management innovations: DSII Intern was able to create an automated solution to the NSF’s required annual Higher Education Research and Development (HERD) survey thanks to application of innovative data management tool Prism, which is specifically aimed at enabling the Workday ERP to effectively integrate non-native data. Full-time staff resources had not yet been available to deploy Prism for the NSF HERD survey.
- Security, Access, and Collaboration Among DSII Interns: Intern in ITS is creating new infrastructure for governance check-lists related to on-boarding and off-boarding of DSII projects, including dynamic inventory of datasets being used, software tools being deployed, and details of differentiated access being granted by administrators. This automated system, once implemented, will enable security-conscious distribution and sharing of DSII-generated tools across university divisions.
- Undergraduate Retention: Intern is integrating datasets of relevant student information necessary to evaluate key drivers of student success, as defined by retention until graduation. Intern will help design, implement, and validate the analyses to identify these drivers. This will include developing and refining a predictive model to identify variables most highly correlated with student retention and graduation rates.
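The data quality project in the list above names missing, incomplete, and duplicated data as target issues. A minimal sketch of such checks follows. It is an assumption-laden illustration, not the Intern's algorithm: the function name, the sample rows, and the field names are all hypothetical.

```python
def find_quality_issues(rows, key_field, required_fields):
    """Return row indices with missing required values or a repeated key."""
    missing = []
    duplicates = []
    seen_keys = set()
    for index, row in enumerate(rows):
        # A required field counts as missing when it is absent, None, or empty.
        if any(row.get(field) in (None, "") for field in required_fields):
            missing.append(index)
        key = row.get(key_field)
        if key is None:
            continue  # rows without a key cannot be compared for duplication
        if key in seen_keys:
            duplicates.append(index)
        else:
            seen_keys.add(key)
    return {"missing": missing, "duplicates": duplicates}


# Hypothetical sample rows, made up for illustration.
sample_rows = [
    {"id": "A1", "building": "Library", "kwh": 120.5},
    {"id": "A2", "building": "", "kwh": 98.0},
    {"id": "A1", "building": "Library", "kwh": 120.5},
    {"id": "A3", "building": "Gym", "kwh": None},
]

issues = find_quality_issues(sample_rows, "id", ["building", "kwh"])
print(issues)
```

Reporting row indices rather than silently repairing values keeps a person in the loop, which suits a setting where the records feed administrative decisions.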
Office of Investment Management
- Increasing efficiencies in information retrieval to support the investment decision process: Intern launched experimentation with varieties of language-based data scientific tools to search voluminous and multifaceted documents containing investment-relevant information. The results of this experimentation customized to the Brandeis Investment Office will be used by management to support workflow changes aimed at dramatically cutting down on the number of hours needed to identify key insights, trends, and relationships necessary for investment decision-making.
Office of the Provost
- Enabling strategic understanding of Leaves of Absence (LOA) taken by undergraduates: Intern engineered and integrated multiple sources of unstructured data into a machine-readable dataset relevant to LOA students, including reasons for the leaves, length, graduation rates, academic majors, and demographic characteristics of the LOA students. Creating a dashboard based on this dataset has now made it possible for administrators to begin analyzing patterns and implications of LOA trends for potential interventions to help improve student outcomes. Intern also automated the process for regularly integrating data and updating analytics based on information from new generations of students.
- Analyzing bias in student course evaluations: Intern created a machine-readable data analytic platform capable of identifying demographic bias driving student responses in course evaluations. This required exploration and application of appropriate statistical techniques for the complex social science analysis of biased behaviors. Preliminary results show surprisingly low to no discrimination from Brandeis students based on the gender of instructors. Results for potential race and ethnicity bias are inconclusive, so far, due to insufficient data.
- Identifying pathways of student success in undergraduate STEM majors: Intern is engineering, integrating, and analyzing data sets relevant to identifying trends in retention of undergraduates in introductory STEM courses over the past 8 years. Identifying trends and cluster analysis of multiple characteristics of undergraduates as they make their way through these courses will illuminate the most effective policies for improving STEM major retention rates.
- Visualizing trends in sponsored research and predicting likelihood that any one particular grant proposal will be funded: Thanks to the dashboard built by DSII Interns, experts in research funding at Brandeis are now able to more easily share trends with senior academic leaders about the percentage of grant proposals that are successfully funded, with break-outs by department, principal investigator (PI), faculty rank of PI, and individual granting agency. Interns are also working on predictive tools that will help make it possible for administrators to identify not only the likelihood of a proposal’s success, but also the specific characteristics of the proposal that might increase or decrease its likelihood of success.
- Identifying key features of federally funded grants that are likely to result in successful development of patents: Intern is engineering, parsing, and linking datasets of federally funded research projects together with US Patent Office data to help specify the profile of grants that result in patents. Once applied to the universe of federally funded projects at Brandeis, this exploration will help highlight those research projects on campus with translational potential.
- Creating interactive access to outcomes data and to analytics that answer administrators’ Brandeis-relevant questions from the National College Health Assessment (NCHA) survey: DSII Intern transformed NCHA data into format capable of supporting user friendly dashboard. This now makes it possible for administrators to better understand risks associated with student wellness and to design programs and interventions targeted to address these data-informed insights. DSII Intern automated the process for updating the dashboard with new results from the NCHA. For more detail on this DSII project, see section below on “Measurable Impact.”
Publications
The DSII grew out of Dr. Liebowitz’s and Professor Tim Hickey’s data analytic research on the higher education workforce.
- Exploring the Impact of Computer Science on the Future of Higher Ed (ScholarWorks) 2021.
- Details on the launch and operations of the DSII (Trusteeship Magazine) "Student Talent Helps Put Brandeis Data to Work," 2023.
- Fostering data-driven change (Brandeis Stories) 2023.
- The Internal Internship: Enabling Novel Opportunities for Undergraduate Data Science Experiential Education (Association for Computing Machinery) 2024.
Who We Are
Dr. Jessica Liebowitz, Co-Founder and Co-Director, DSII Lab, and Research Scientist in Computer Science.
Timothy J. Hickey, Co-Founder and Co-Director, DSII Lab, and Professor of Computer Science.
Nianwen Xue, DSII Lab Advisor, Professor of Computer Science, and Chair, Computer Science Department.
Antonella Di Lillo, DSII Lab Advisor, Associate Professor of Computer Science, and Chair, Undergraduate Curriculum Committee.