Publications
2020
1.
Sánchez-Paniagua, Manuel; Fidalgo, Eduardo; González-Castro, Víctor; Alegre, Enrique
Impact of current phishing strategies in machine learning models for phishing detection Journal article
In: 13th International Conference on Computational Intelligence in Security for Information Systems (CISIS), pp. 87–96, 2020.
Tags: machine learning, NLP, phishing detection, URL
@article{sanchez-paniagua_impact_2020,
title = {Impact of current phishing strategies in machine learning models for phishing detection},
author = {Manuel Sánchez-Paniagua and Eduardo Fidalgo and Víctor González-Castro and Enrique Alegre},
url = {https://link.springer.com/chapter/10.1007/978-3-030-57805-3_9},
year = {2020},
date = {2020-01-01},
journal = {13th International Conference on Computational Intelligence in Security for Information Systems (CISIS)},
pages = {87--96},
abstract = {Phishing is one of the most widespread attacks based on social engineering. The detection of Phishing using Machine Learning approaches is more robust than the blacklist-based ones, which need regular reports and updates. However, the datasets currently used for training the Supervised Learning approaches have some drawbacks. These datasets only have the landing page of legitimate domains and they do not include the login forms from the websites, which is the most common situation in a real case of Phishing. This makes the performance of Machine Learning-based models to drop, especially when they are tested using login pages.},
keywords = {machine learning, NLP, phishing detection, URL},
pubstate = {published},
tppubtype = {article}
}