SIIA Público

Book title: WikiNLP 2024 - 1st Workshop on Advancing Natural Language Processing for Wikipedia, Proceedings of the Workshop
Chapter title: WikiBias as an Extrapolation Corpus for Bias Detection

UNAM authors:
KARLA DENIA SALAS JIMENEZ; GEMMA BEL ENGUIX;
External authors:

Language:

Year of publication:
2024
Keywords:

Adversarial machine learning; Computational linguistics; Logistic regression; Classical modeling; Classification tasks; Language model; Logistics regressions; Machine learning models; Probabilistics; Wikipedia; Contrastive Learning


Abstract:

This paper explores whether a machine learning model trained on Wikipedia data can detect subjectivity in sentences and generalize effectively to other domains. To this end, we ran experiments with the WikiBias corpus, the BABE corpus, and the CheckThat! dataset. We tested several classical ML models, including Logistic Regression, SVC, and SVR, using features such as Sentence Transformers similarity, probabilistic sentiment measures, and biased lexicons. Pre-trained models such as DistilRoBERTa and large language models such as Gemma and GPT-4 were also evaluated on the same classification task. © 2024 Association for Computational Linguistics.


UNAM entities cited: