Publications
2019
1.
Merayo-Alba, Sergio; Fidalgo, Eduardo; González-Castro, Víctor; Alaiz-Rodríguez, Rocío; Velasco-Mata, Javier
Use of natural language processing to identify inappropriate content in text Journal article
In: Hybrid Artificial Intelligent Systems: 14th International Conference, HAIS 2019, León, Spain, September 4–6, 2019, Proceedings 14, pp. 254–263, 2019, (Publisher: Springer International Publishing).
Tags: deep learning, machine learning, Natural Language Processing, Text Encoders, Violent Content Detection
@article{merayo-alba_use_2019,
title = {Use of natural language processing to identify inappropriate content in text},
author = {Sergio Merayo-Alba and Eduardo Fidalgo and Víctor González-Castro and Rocío Alaiz-Rodríguez and Javier Velasco-Mata},
url = {https://link.springer.com/chapter/10.1007/978-3-030-29859-3_22},
year = {2019},
date = {2019-01-01},
journal = {Hybrid Artificial Intelligent Systems: 14th International Conference, HAIS 2019, León, Spain, September 4–6, 2019, Proceedings 14},
pages = {254--263},
abstract = {The quick development of communication through new technology media such as social networks and mobile phones has improved our lives. However, this also produces collateral problems such as the presence of insults and abusive comments. In this work, we address the problem of detecting violent content on text documents using Natural Language Processing techniques. Following an approach based on Machine Learning techniques, we have trained six models resulting from the combinations of two text encoders, Term Frequency-Inverse Document Frequency and Bag of Words, together with three classifiers: Logistic Regression, Support Vector Machines and Naïve Bayes. We have also assessed StarSpace, a Deep Learning approach proposed by Facebook and configured to use a Hit@1 accuracy. We evaluated these seven alternatives in two publicly available datasets from the Wikipedia Detox Project: Attack and Aggression. StarSpace achieved an accuracy of 0.938 and 0.937 in these datasets, respectively, being the algorithm recommended to detect violent content on text documents among the alternatives evaluated.},
note = {Publisher: Springer International Publishing},
keywords = {deep learning, machine learning, Natural Language Processing, Text Encoders, Violent Content Detection},
pubstate = {published},
tppubtype = {article}
}