Detoxigram

Identifying Toxic Discourse in Telegram Chats

This survey aimed to gain a deeper understanding of how individuals perceive and react to toxic content on Telegram. By engaging a balanced sample of 300 Americans, we sought to capture a broad spectrum of opinions. The data collected serves as a foundation for refining our toxicity classifier, so that it aligns with real-world perceptions and can identify and explain toxic interactions in digital communication.

The sample of 300 participants was balanced in terms of gender and political orientation. Ages ranged from 18 to over 65, with most respondents falling into the 25-34 and 35-44 brackets. The sample included men, women, non-binary individuals, and a few respondents who preferred not to specify their gender. Political affiliations included Democrats, Republicans, Independents, and those identifying with other or unspecified political orientations.

The core of the survey focused on how participants classified online messages by perceived toxicity. Respondents were presented with a series of 30 messages and asked to rate each one on a predefined five-level toxicity scale: Non-toxic, Slightly Toxic, Moderately Toxic, Highly Toxic, and Extremely Toxic. The messages varied in content and tone and were selected to capture a range of potentially toxic behaviors, including disrespect, hostility, and aggression. The participants' ratings of each message gave us subjective evaluations of its toxicity.
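As an illustration of how ratings on this scale might be summarized per message, here is a minimal Python sketch. The five labels come from the survey. The 0-4 integer coding, the function name `summarize`, and the example ratings are assumptions made for illustration and do not describe the project's actual processing.

```python
from collections import Counter
from statistics import median_low

# Labels as listed in the survey; the 0-4 ordinal coding is an assumption.
SCALE = [
    "Non-toxic",
    "Slightly Toxic",
    "Moderately Toxic",
    "Highly Toxic",
    "Extremely Toxic",
]
SCORE = {label: i for i, label in enumerate(SCALE)}


def summarize(ratings):
    """Return (share of each label, median label) for one message's ratings."""
    counts = Counter(ratings)
    distribution = {label: counts.get(label, 0) / len(ratings) for label in SCALE}
    # median_low always returns a value present in the data, so it maps
    # back to a label even when the number of ratings is even.
    median_label = SCALE[median_low(SCORE[r] for r in ratings)]
    return distribution, median_label


# Hypothetical ratings for a single message (not survey data).
example = ["Highly Toxic", "Extremely Toxic", "Moderately Toxic", "Highly Toxic"]
distribution, median_label = summarize(example)
print(median_label)
print(distribution)
```

The median is used here rather than the mean because the scale is ordinal: the gaps between adjacent labels are not necessarily equal.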

This approach allowed us to collect nuanced data on how a diverse group of individuals perceives different types of toxic content. The collected ratings were then used to validate our toxicity classifier, checking that it reflects the varied perceptions of toxicity in online communication.
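One way such a validation could be carried out is to compare the classifier's label for each message with the survey's consensus label. The sketch below assumes that the classifier outputs a label on the same five-level scale and that each message has a single consensus label (for example, the median rating). Neither assumption is confirmed by this description, and the function name `agreement_rates` and the example pairs are hypothetical.

```python
# Labels as listed in the survey; the 0-4 ordinal coding is an assumption.
SCALE = [
    "Non-toxic",
    "Slightly Toxic",
    "Moderately Toxic",
    "Highly Toxic",
    "Extremely Toxic",
]
SCORE = {label: i for i, label in enumerate(SCALE)}


def agreement_rates(pairs):
    """pairs: list of (survey_label, classifier_label) tuples.

    Returns (exact, within_one): the share of messages where the two labels
    match exactly, and the share where they differ by at most one level.
    """
    diffs = [abs(SCORE[survey] - SCORE[model]) for survey, model in pairs]
    exact = sum(d == 0 for d in diffs) / len(diffs)
    within_one = sum(d <= 1 for d in diffs) / len(diffs)
    return exact, within_one


# Hypothetical (survey consensus, classifier output) pairs, not real results.
pairs = [
    ("Highly Toxic", "Highly Toxic"),
    ("Moderately Toxic", "Highly Toxic"),
    ("Non-toxic", "Moderately Toxic"),
    ("Extremely Toxic", "Extremely Toxic"),
]
exact, within_one = agreement_rates(pairs)
print(f"exact agreement: {exact:.2f}, within one level: {within_one:.2f}")
```

Reporting a within-one-level rate alongside exact agreement separates near misses on adjacent labels from larger disagreements.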

The following graphs summarize the survey responses. They show the distribution of toxicity ratings for individual messages, highlighting patterns in how respondents perceive different levels of toxicity.

Message: "Exactly the reason my 5.56 and 7.62 and 9mm will never be given up. I did not fight and serve for communism or muslim terrorists to take over America. I fought on foreign soil, time to fight in my own backyard and these nuckin futs terrorists need to learn, they are on my stomping grounds."

Message: "Trans is a mental disorder"

Message: "Her hearts not at fault here it’s the LACK of brain cells"
