Phishing Detection using Base Classifier and Ensemble Technique

Mithilesh Kumar Pandey; Rekha Pal; Saurabh Pal; Arvind Kumar Shukla; Manish Ranjan Pandey; Shantanu Shahi

doi:10.17762/ijritcc.v11i11s.8164


Published: Oct 7, 2023


Keywords:

Phishing, Machine learning, Ensemble, Meta-learning, Bagging, Confusion matrix

Mithilesh Kumar Pandey

Department of Computer Applications, VBS Purvanchal University, Jaunpur, India

Rekha Pal

Department of Computer Applications, VBS Purvanchal University, Jaunpur, India

Saurabh Pal

Department of Computer Applications, VBS Purvanchal University, Jaunpur, India

Arvind Kumar Shukla

School of Computer Science & Application, IFTM University, Moradabad, India

Manish Ranjan Pandey

School of Computer Science & Application, IFTM University, Moradabad, India

Shantanu Shahi

Department of Computer Science & Engineering, Ambalika Institute of Management & Technology, Lucknow, India

Abstract

Phishing attacks remain a significant threat in today's digital landscape, with both individuals and organizations regularly falling victim to them. One of the primary vehicles for these attacks is the phishing website, which is designed to look like a legitimate site in order to trick users into giving away personal information, including sensitive data such as credit card details and passwords. This research paper proposes a model that uses several benchmark classifiers, namely logistic regression (LR), Bagging, random forest (RF), k-nearest neighbors (K-NN), decision tree (DT), support vector machine (SVM), and AdaBoost, to classify websites as phishing or legitimate, with performance assessed by accuracy, precision, recall, F1-score, and the confusion matrix. Additionally, a stacking model was combined with a meta-learner to identify phishing websites in existing systems. The proposed ensemble learning approach using stack-based meta-learners proved highly effective at identifying both legitimate and phishing websites, achieving an accuracy of up to 97.19%, with precision, recall, and F1-score of 97%, 98%, and 98%, respectively. Ensemble learning, particularly stacking and its meta-learner variations, is therefore recommended for detecting and preventing phishing attacks and other cyber threats.
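As a rough illustration of the stacking approach described in the abstract, the following minimal sketch combines the seven named base classifiers under a meta-learner using scikit-learn. It is not the authors' implementation: the synthetic data from `make_classification`, the logistic regression meta-learner, the hyperparameters, and the helper names `build_stacking_model`, `evaluate`, and `demo` are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_stacking_model(seed=0):
    """Stack the seven base classifiers under a meta-learner.

    Hyperparameters and the choice of meta-learner are assumptions.
    """
    base_learners = [
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ("bagging", BaggingClassifier(n_estimators=20, random_state=seed)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=seed)),
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
        ("dt", DecisionTreeClassifier(random_state=seed)),
        ("svm", make_pipeline(StandardScaler(), SVC(random_state=seed))),
        ("adaboost", AdaBoostClassifier(n_estimators=50, random_state=seed)),
    ]
    # The meta-learner is trained on out-of-fold predictions of the base learners.
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )


def evaluate(model, X_train, X_test, y_train, y_test):
    """Fit the model and report the metrics named in the abstract."""
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }


def demo(seed=0):
    """Run the sketch on synthetic data (a stand-in for a real phishing dataset)."""
    X, y = make_classification(n_samples=600, n_features=30,
                               n_informative=10, random_state=seed)
    split = train_test_split(X, y, test_size=0.25, stratify=y, random_state=seed)
    return evaluate(build_stacking_model(seed), *split)


if __name__ == "__main__":
    for name, value in demo().items():
        print(name, value, sep=": ")
```

Because the data here is synthetic, the scores it prints say nothing about the 97.19% accuracy reported in the paper; the sketch only shows how base learners, a meta-learner, and the evaluation metrics fit together.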

How to Cite

Pandey, M. K., Pal, R., Pal, S., Shukla, A. K., Pandey, M. R., & Shahi, S. (2023). Phishing Detection using Base Classifier and Ensemble Technique. International Journal on Recent and Innovation Trends in Computing and Communication, 11(11s), 367–376. https://doi.org/10.17762/ijritcc.v11i11s.8164

Issue

Vol. 11 No. 11s (2023)

Section

Articles




Contact Us:

Auricle Global Society of Education and Research

Y-18-A, Near Sanskar Play School, Sudarshana Nagar,

Bikaner, Rajasthan (India). Pin 334003

Email: editor@ijritcc.org
