ABSTRACT
Word vectorization models represent vocabulary in a vector space in a manner that captures semantic relationships between words. However, state-of-the-art word vectorization models have been shown to contain biases in their word embeddings due to ethnic prejudices and underrepresentation in the corpora they are trained on. This paper proposes a novel sentiment-sensitive, learning-based debiasing algorithm for multiclass bias mitigation. In this study, the algorithm is used for ethnic debiasing in CBOW Word2Vec models. Unlike other debiasing algorithms, this methodology accounts for the fact that not all ethnic correlations are biased and that proper debiasing should preserve unbiased ethnic information, such as cultural knowledge. Furthermore, it does not require a predefined, finite set of correlations to perform debiasing. Rather, models are penalized for making ethnic correlations towards non-neutral words and are allowed to make ethnic correlations towards neutral words, which yields a thorough debiasing without losing ethnic knowledge. This study also proposes a new metric to evaluate bias, SMAC (Sentiment-Aware Mean Average Cosine Similarity), which accounts for sentiment in bias measurement. We train both the baseline and debiased CBOW models on the WikiCorpus. The debiased model achieved a 39.48% reduction in bias, as measured by the SMAC metric, in comparison to the baseline model.
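The abstract names a sentiment-aware cosine-similarity metric but does not give its formula. The following is only a minimal sketch of one way such a score could be computed, not the authors' definition: it assumes each attribute word has a sentiment score in [-1, 1] (0 meaning neutral) and weights the absolute cosine similarity between an ethnic target vector and each attribute word by the magnitude of that sentiment, so that similarity to neutral words contributes nothing. The function names, the weighting scheme, and the toy vectors are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def sentiment_weighted_score(targets, attributes, sentiment):
    """Hypothetical sentiment-weighted mean average cosine similarity.

    targets:    list of target (e.g. ethnic term) vectors
    attributes: dict mapping attribute word -> vector
    sentiment:  dict mapping attribute word -> score in [-1, 1], 0 = neutral
                (assumed to come from some external sentiment lexicon)
    """
    per_target = []
    for t in targets:
        # Similarity to a neutral word (sentiment 0) adds nothing to the score.
        weighted = [
            abs(sentiment[word]) * abs(cosine_similarity(t, vec))
            for word, vec in attributes.items()
        ]
        per_target.append(sum(weighted) / len(weighted))
    return sum(per_target) / len(per_target)


# Toy usage with made-up 2-d vectors and made-up sentiment scores.
toy_targets = [[1.0, 0.0]]
toy_attributes = {"neutral_word": [1.0, 0.0], "negative_word": [1.0, 0.0]}
toy_sentiment = {"neutral_word": 0.0, "negative_word": -0.5}
toy_score = sentiment_weighted_score(toy_targets, toy_attributes, toy_sentiment)
```

Under this assumed weighting, a model that associates ethnic terms only with neutral words scores 0, which mirrors the abstract's statement that correlations towards neutral words are allowed while correlations towards non-neutral words are penalized.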
Full Text/Reference Website: https://www.cscjournals.org/library/manuscriptinfo.php?mc=IJCL-129

AUTHORS
Mr. Aditya Vasantharao – Thomas Jefferson High School for Science and Technology, Alexandria, 22312 – United States of America
Mr. Audhav N Durai – Thomas Jefferson High School for Science and Technology, Alexandria, 22312 – United States of America
Mr. Sauman Das – Thomas Jefferson High School for Science and Technology, Alexandria, 22312 – United States of America
KEYWORDS
Natural Language Processing, Bias Mitigation, Deep Learning, Word2Vec, Sentiment Analysis.
Indexing Keywords: Sentiment Sensitive Debiasing: A Learning-Based Approach to Remove Ethnic Stereotypes in Word Embeddings.
Pages: 26-35
Revised: 31-08-2022
Published: 01-10-2022
Published in: International Journal of Computational Linguistics (IJCL)
Volume: 13
Issue: 3
Publication Date: 01-10-2022