Bahar Hamzehei
Taleb Zarhesh
Abstract
The growing prevalence of bias in digital texts demands effective detection methods. In this study, we compare baseline transformer models against optimized, fine-tuned classifiers using paired examples from the Wiki Neutrality Corpus, which pairs biased sentences with their neutralized counterparts from Wikipedia. Our objective is to assess whether targeted fine-tuning yields measurable gains in detection accuracy and processing speed. By applying modern natural language processing techniques and a contrastive learning approach inspired by recent research, we aim to verify prior claims and provide a clear performance comparison, without delving into the finer details of model interpretability.
Outcomes