A REVIEW ON DETECTING PHISHING EMAILS USING CNN AND BI-LSTM TECHNIQUES

Authors

  • Nicholas Muriuki Muriithi School of Pure and Applied Sciences, Kirinyaga University, Kerugoya, 143-10300, Kenya https://orcid.org/0009-0007-8328-7105
  • Dr. Ephantus Mwangi School of Pure and Applied Sciences, Kirinyaga University, Kerugoya, 143-10300, Kenya
  • Dr. Kennedy Malanga School of Pure and Applied Sciences, Kirinyaga University, Kerugoya, 143-10300, Kenya

DOI:

https://doi.org/10.64680/jisads.v3i2.55

Abstract

Phishing schemes have become more sophisticated, with attackers impersonating reputable businesses and altering URLs to attract the attention of consumers. Tactics such as URL shortening, obfuscation, and multimedia-based targeting exploit mechanisms that are more complex than those current detection methods are designed to handle. Existing detection methods often perform poorly on multilingual content and are mostly character-based, omitting the word- and context-level cues needed to distinguish effectively among formats and languages. Traditional machine learning models depend on manual feature extraction, which limits their adaptability and real-time capability. This work aims to assess current approaches to phishing detection, develop an optimal phishing detection model, develop an operational solution for phishing detection, and demonstrate through a set of experiments that the solution works. The proposed solution involves deploying countermeasures that address the time-sensitive nature of phishing attacks by enhancing real-time detection of fake URLs, particularly in email and instant messaging systems. In the assessment, the Convolutional Neural Network (CNN) ranked as the most effective algorithm with a score of 15%, followed by the Support Vector Machine (SVM) with 13% and the Long Short-Term Memory (LSTM) network with 10%. Natural Language Processing (NLP), Logistic Regression, and the CNN variant that takes both text and images as input ranked lowest, each with 2%. The review drew on 36 articles retrieved from Google Scholar, of which 28 were selected for analysis. The hybrid CNN and Bi-LSTM model is the most effective of the models examined, offering the best detection performance and making it a strong option for real-world phishing prevention systems.

Published

2026-01-11