Inspired by Perspective API and its real-time comment moderation tools, this Nifty Assignment is about improving online conversations by implementing a decision tree data type for text classification.1
Online abuse and harassment stop people from engaging in conversation. One area of focus is the study of negative online behaviors, such as toxic comments: user-written comments that are rude, disrespectful, or otherwise likely to make someone leave a discussion. Platforms struggle to effectively facilitate conversations, leading many communities to limit or completely shut down user comments. In 2018, the Conversation AI team, a research initiative founded by Jigsaw and Google (both part of Alphabet), organized a public competition called the Toxic Comment Classification Challenge to build better machine learning systems for detecting different types of toxicity like threats, obscenity, insults, and identity-based hate.2
Toxic comment classification is a special case of a more general problem in machine learning known as text classification. Discussion forums use text classification to determine whether comments should be flagged as inappropriate. Email software uses text classification to determine whether incoming mail is sent to the inbox or filtered into the spam folder.3
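The assignment asks students to implement a decision tree for this task. As a rough illustration only (the class and method names below are invented for this sketch, not the assignment's API), a binary decision tree over numeric word-count features can be modeled as a node that either tests one feature against a threshold or holds a class label:

```java
// Hypothetical sketch of a binary decision tree over numeric features.
// Names (Node, predict) are illustrative, not the assignment's API.
public class DecisionTreeSketch {
    // An internal node tests one feature against a threshold;
    // a leaf (left == right == null) stores a class label.
    static class Node {
        int feature;        // index into the feature vector
        double threshold;   // split point
        String label;       // class label (leaves only)
        Node left, right;

        static Node leaf(String label) {
            Node n = new Node();
            n.label = label;
            return n;
        }

        static Node split(int feature, double threshold, Node left, Node right) {
            Node n = new Node();
            n.feature = feature;
            n.threshold = threshold;
            n.left = left;
            n.right = right;
            return n;
        }
    }

    // Walk from the root to a leaf, choosing a branch at each split.
    static String predict(Node node, double[] features) {
        while (node.left != null) {
            node = features[node.feature] <= node.threshold ? node.left : node.right;
        }
        return node.label;
    }

    public static void main(String[] args) {
        // Tiny hand-built tree: feature 0 might be the count of a flagged word.
        Node root = Node.split(0, 0.5, Node.leaf("ok"), Node.leaf("toxic"));
        System.out.println(predict(root, new double[]{0.0})); // ok
        System.out.println(predict(root, new double[]{2.0})); // toxic
    }
}
```

A trained tree is just such a structure whose splits were chosen from data; prediction is a single root-to-leaf walk.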
JAR files need to be distributed alongside the source code.
The simplest way to use the text classifier is through the web app. The first visit may take 30–60 seconds for the server to wake up and train the model.
The assignment specification (spec.mhtml) contains all of the information that students need to get started. To run the project locally, download the code from GitHub and implement (or stub) each required method in the
Compile and run the Main class to compute the classifier's training accuracy.
javac -cp ".:lib/*" Main.java && java -cp ".:lib/*" Main
Compile and run the Server class to launch the Nifty Web App.
javac -cp ".:lib/*" Server.java && java -cp ".:lib/*" Server toxic.tsv
A JUnit 5 TextClassifierTest class is provided, though it requires a GoodTextClassifier reference solution with a modified
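Conceptually, a reference-solution test runs the student classifier and the reference classifier on the same inputs and checks that they agree. A minimal plain-Java sketch of that idea (the Classifier interface, agreesOnAll helper, and both stub lambdas are hypothetical stand-ins, not the provided JUnit test):

```java
// Hypothetical sketch of testing a student classifier against a reference.
public class ComparisonSketch {
    interface Classifier { String classify(String text); }

    // True when both classifiers agree on every example.
    static boolean agreesOnAll(Classifier student, Classifier reference, String[] examples) {
        for (String example : examples) {
            if (!reference.classify(example).equals(student.classify(example))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stubs standing in for the student solution and the reference solution.
        Classifier student = text -> text.contains("buy now") ? "spam" : "ok";
        Classifier reference = text -> text.contains("buy now") ? "spam" : "ok";
        String[] examples = {"hello friend", "buy now!!!"};
        System.out.println(agreesOnAll(student, reference, examples)); // true
    }
}
```

The provided JUnit test plays the role of agreesOnAll, with GoodTextClassifier as the reference.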
Machine learning models are trained on human-selected and human-generated datasets. Such models encode and reproduce the implicit bias inherent in the datasets. This model is not meant to generalize beyond the toy training datasets. Don’t use this in a real system!
The included vectorization algorithms also encode explicit bias. The vectorization ignores all grammar and syntax, treating each occurrence of a word as independent from all other words in the text. Any usage of a word, no matter the context, is considered equally toxic or spammy.
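This context-blindness is easy to see in a bag-of-words sketch (a hypothetical vectorizer written for illustration, not the assignment's code): because grammar and word order are discarded, two sentences with the same words produce identical count vectors.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bag-of-words vectorizer: each text becomes a map from
// word to occurrence count, ignoring all grammar, syntax, and word order.
public class BagOfWords {
    static Map<String, Integer> vectorize(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Word order (and thus negation, sarcasm, quotation) is invisible:
        // these two different sentences yield equal vectors.
        System.out.println(vectorize("this is not rude").equals(vectorize("not rude is this"))); // true
    }
}
```

Any model trained on such vectors can only weigh which words appear and how often, never how they are used.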
The provided datasets contain text that may be considered profane, vulgar, or offensive. ↩
When the Conversation AI team first built toxicity models, they found that the models incorrectly learned to associate the names of frequently attacked identities with toxicity. In 2019, the Conversation AI team ran another competition about Unintended Bias in Toxicity Classification, focusing on building models that detect toxicity across a range of diverse conversations. ↩
Google Developers. Oct 1, 2018. Text classification. In Machine Learning Guides. https://developers.google.com/machine-learning/guides/text-classification ↩