Graduate Studies in the Arts and Sciences

Geeking Out With… Jin Zhao

June 2, 2025

Abigail Arnold | Graduate School of Arts and Sciences

Geeking Out With…is a feature in which we talk to GSAS students about their passions. You can check out past installments here.

Jin Zhao is a sixth-year PhD student in Computer Science; she also earned an MS in Computational Linguistics from Brandeis in 2020. Her research uses Natural Language Processing to examine the framing of news stories and evaluate whether it is positive, negative, or neutral. She joined Geeking Out With… to talk about her research, how she came to the topic, and its interdisciplinary implications.

This interview has been edited for clarity.

How did you become interested in the framing of news stories?

Before starting my work in Computational Linguistics, I studied sociolinguistics, and I’ve stayed interested in that field. The sociolinguist William Labov was my idol. I speak English as a second language, and his papers always stood out to me because of their passion. He spoke out for people who spoke a vernacular or different kind of English, saying that these people are no less smart than others but are simply speaking a different kind of language. When I got into Computational Linguistics, I thought that Natural Language Processing (NLP) was a great tool for exploring related topics. I read both English and Chinese news all the time and am struck by how people inhabit totally different realities. My parents read Chinese news, and my classmates read US news. They are all good people, are equally informed, and are reading news based on facts, but they have totally different narratives of the same events in their minds. It’s hard to get people to see things from another perspective.

How do you approach your research? What kinds of news sources do you use?

I have data from over 200 media sources from all around the world. I am intentionally choosing very contentious topics for the news stories because more obvious signals of framing are easier for the NLP model to evaluate at the starting stage. I have also chosen very narrow, specific events so far, for the purposes of training the model. As the model gets better at evaluating the data, we will expand it.

When collecting data for my research, I assumed neutrality because I collected it from very reputable news sources. But when we sat together to annotate the articles and see if they were positive, neutral, or negative, it turned out that only five percent were actually neutral! No matter what the topic was, the rest were trying to frame it in some way. One interesting discovery was that framing techniques for discussing protests were very similar all around the world. For example, if a protest happens and eighty percent of the protestors leave in response to government action, one story will say “Eighty percent of the protestors left,” which is factual but is being used here to stress order; another will say “twenty percent of the protestors refused to leave” and try to stress rebelliousness. So the same protest can be framed as orderly demonstrations or as violence. It happens all the time.

How does using NLP help facilitate your research?

There is a lot of established research on journalism in sociolinguistics and sociology. NLP gives us a great opportunity to study this topic further because computers can read the articles in a split second. Many researchers spend a long time developing a code book for framing devices, but computers can identify the framing of articles from all over the world in a short period of time and continually pick up newly emerging frames.

Alongside my research, I am working on a website similar to the news aggregator Ground News, which tells you to what extent a news article is left-leaning, right-leaning, or neutral but is only US-focused and uses older processing tools. If my methods are successful in rapidly and accurately spotting framing devices, I will be able to create a website that shows articles from different languages and ideologies side by side. NLP and large language models create a great opportunity to show a range of perspectives, which is what I want to do.

What are some interesting things you have discovered during your research process? How has your own approach to the subject evolved?

Because I grew up speaking both Chinese and English, I was used to people saying different things in each language’s media. Still, I was surprised that only five percent of the articles were neutral! I always knew people were living in different bubbles with different languages and ideologies, but I didn’t know it was that severe.

Because my data comes from different languages, there is a translation step, since multilingual models do not yet handle this well, and translation poses its own barrier. For example, I am looking at the election of Putin, and some people in Chinese media call him “Putin the Emperor.” When literally translated to English, this seems like a negative description and has a dictatorial connotation, but in Chinese, it has a positive connotation of dignity. While I have found more problems we still need to solve, I am very optimistic about this method once we make our models work and deal with translation issues. If we can scale the model, we can help researchers in sociolinguistics and communication studies, and they can use it to get a better sense of what’s happening in the world and what people’s perspectives are. So what I’m looking forward to most is interdisciplinary cooperation between the fields of computer science and communication studies, because, to the best of my knowledge, this hasn’t really been done yet.

What comes next in your process?

My advisor, Bert Xue, and I are talking about working with people from the Journalism program. We have solved a lot of problems in the model – the first step was to extract the events from the articles, which turned out to be a difficult task on its own, and then we had to extract both explicit and implicit causal relations, which was also a very challenging problem for NLP. After all these steps, it’s now time to move into the question of framing. We need help from researchers with more expertise in this area, especially since framing can be very different for different types of news stories. The approach to problems like this is very different between fields – people in computer science look for an input, an output, and a way to evaluate the output, while people in social science do much more manual evaluation. So we need input from journalists and communications people. We have this powerful tool, and we want it to be accurate! This is too important to mess up.

You participated in this year’s Three Minute Thesis (3MT) competition and won the People’s Choice Award! What made you want to take part, and how did you find the experience?

People in Computer Science often like to see immediate results from a project, which my work does not have, so I sometimes feel defeated. But when I do talk to my old friends from sociolinguistics about my research, they are instantly very excited. It made me realize I need to get out more and talk to people outside my field! This year was also the perfect time for 3MT, as I am moving on to the next stage of my project and need to bring people from other fields in. I’m doing interdisciplinary work and need to talk to people from other disciplines.

3MT was a super rewarding experience in many respects. I got positive feedback, and people believed my research matters, so that was very important for my motivation. I also got a lot of useful advice, and getting outside expertise opened my eyes, because I’m also in my own bubble. When I was rehearsing with the organizers, they asked me very useful questions to determine what I’m really passionate about, and that’s the key to communication.

When you’re not working on your research, what do you like to do?

I really like to watch comedy news, especially John Oliver and Trevor Noah. I watch Chinese and Spanish shows too, but they don’t have anything like this – it’s a very American thing.

What advice do you have for other students exploring their passions?

Believe in your own passion. My research doesn’t fit with the current trends in the field, which can be discouraging. But if one place does not fit you, it doesn’t mean that what you do doesn’t have any merit, and I did find my people in a small group in academia – I just didn’t find them right away. But when I read two articles from two different worlds, I know we need this. I just need to believe in myself. Have perseverance, find something like 3MT, and you’ll have a chance to reach more people. People will appreciate what you do and help you, but you need something to show them.