Publications
2022
1.
Sánchez-Paniagua, Manuel; Fidalgo, Eduardo; Alegre, Enrique; Alaiz-Rodríguez, Rocío
Phishing websites detection using a novel multipurpose dataset and web technologies features (Journal article)
In: Expert Systems with Applications, vol. 207, art. 118010, 2022 (Publisher: Pergamon).
Tags: Dataset Creation, LightGBM Classifier, phishing detection, Web Technology Features
@article{sanchez-paniagua_phishing_2022,
title = {Phishing websites detection using a novel multipurpose dataset and web technologies features},
author = {Manuel Sánchez-Paniagua and Eduardo Fidalgo and Enrique Alegre and Rocío Alaiz-Rodríguez},
url = {https://www.sciencedirect.com/science/article/pii/S0957417422012301},
year = {2022},
date = {2022-01-01},
journal = {Expert Systems with Applications},
volume = {207},
pages = {118010},
abstract = {Phishing attacks are a major challenge in cybersecurity, often involving the theft of sensitive data through fraudulent login forms. This paper proposes a new methodology for detecting phishing websites in real-world scenarios using URL, HTML, and web technology features. The authors introduce the Phishing Index Login Websites Dataset (PILWD), an offline dataset containing 134,000 verified samples, which enables researchers to test and compare detection approaches. Using the dataset, a LightGBM classifier with 54 features achieves 97.95% accuracy in detecting phishing websites. This methodology is independent of third-party services and utilizes new features for improved detection.},
note = {Publisher: Pergamon},
keywords = {Dataset Creation, LightGBM Classifier, phishing detection, Web Technology Features},
pubstate = {published},
tppubtype = {article}
}