Mining Twitter for big data gold

Submitted on Thursday, October 17, 2019
Jia Xue

Assistant Professor Jia Xue, shown above on a recent trip to China, calls Twitter a gold mine for real-time data about domestic violence and sexual assault.

The computer in Assistant Professor Jia Xue’s Big Data for Social Justice Lab at the Faculty of Information is set to automatically download one percent of tweets produced daily around the globe. This big data, as it’s known, will eventually be mined for what it can tell Jia about sexual and intimate partner violence (IPV).

Jia’s interest in domestic violence was sparked by an incident involving a family member while she was studying law in China. Then, during an internship at the Georgetown University Law Center in Washington, DC, she participated in a research project interviewing lawyers, judges and NGO officials working to help victims of domestic violence in her native Beijing. She soon discovered that, compared to the US, China lacked both social services and laws to protect victims living with domestic and sexual violence. She also learned that women lacked awareness of the limited services that were available.

While the women’s movement in many western countries put sexual assault and domestic violence on the political agenda in the 1970s, China had no women’s movement to heighten awareness of domestic violence let alone sociologists to carry out national surveys on the prevalence of violence against women.

The turning point didn’t come until 1995 when China hosted the United Nations’ Fourth World Conference on Women in Beijing. Since then, awareness has been slowly but steadily growing, and that evolution is tracked in Jia’s work.

Cross-appointed at the Faculty of Information and the Factor-Inwentash Faculty of Social Work, Jia completed her law degree at Tsinghua University and interned at China’s Supreme Court. She continued her studies with a PhD in Social Welfare at the University of Pennsylvania’s School of Social Policy & Practice and a master’s degree in Statistics from Penn’s Wharton School. She also completed a two-year pre-doctoral fellowship at Harvard’s Kennedy School of Government before moving to Toronto.

Jia’s current research focuses on applying computational and big data approaches to the examination of different facets of intimate violence and sexual assault. At a time when there is an ongoing public discussion about the responsible use of big data, she also aims to inspire students to reflect on how big data can be used to promote social justice in innovative ways.

As someone who grew up with the internet, Jia is especially interested in data mining social media to see if and how it could play a role in advancing policies to protect victims of sexual violence. “Young adults are my population. I care about them and they are at high risk for dating and sexual violence. It’s the ideal age to do prevention and intervention,” she says. “And if we don’t do prevention and intervention effectively when they are young and college age, it may have a long-term effect.”

To date, she has concentrated on Twitter, which she describes as “a gold mine for data and real time data that’s constantly renewing.” It’s also free and public, and she can share the data she collects with researchers working on other topics.

Her early research revealed that discussions around sexual violence focus on victims as opposed to abusers and advocates. As with traditional media, celebrity stories have a huge impact and famous names can quickly come to dominate the conversation. Compared to discussions about sports, entertainment and politics, there is less retweeting and more in-depth discussion. “People keep replying and keep talking about it,” says Jia. This suggests Twitter would be a good place to spread prevention information, but one potential downside is that it is not younger users’ social media platform of choice.

That honour currently goes to Instagram, which Jia is keen to include in her research. She’s also looking at apps like WeChat, which is especially popular in China. Until now the goal has been mining Twitter to understand how users talk about sexual violence, but as Jia ponders intervention strategies, she’s focusing on Instagram and how it can be data mined. “I use it every day,” she says. “It shows how young people think.”

Read more about the Faculty of Information’s Human-Centred Data Science concentration
