Phishing URL Detection Via Capsule-Based Neural Network | IEEE Conference Publication | IEEE Xplore


Abstract:

As a cyber attack that leverages social engineering and other sophisticated techniques to steal sensitive information from users, phishing has long been a critical threat to cyber security. Although researchers have proposed many countermeasures, phishing criminals eventually circumvent them, because such countermeasures require substantial manual feature engineering and do not detect newly emerging phishing attacks well enough. An efficient and effective phishing detection method is therefore urgently needed. In this work, we propose a novel phishing website detection approach that inspects the Uniform Resource Locator (URL) of a website, a strategy that has proven to be both effective and efficient. Specifically, our capsule-based neural network consists mainly of several parallel branches. In each branch, one convolutional layer extracts shallow features from URLs, and the two subsequent capsule layers generate accurate feature representations of URLs from the shallow features and discriminate the legitimacy of URLs. The final output of our approach is obtained by averaging the outputs of all branches. Extensive experiments on a validated dataset collected from the Internet demonstrate that our approach achieves competitive performance against other state-of-the-art detection methods while maintaining a tolerable time overhead.
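The final averaging step described in the abstract can be illustrated with a minimal sketch. The abstract does not give layer sizes, the number of branches, or the capsule nonlinearity, so everything below is an assumption: the squashing function is the one from Sabour et al. [8], each branch is assumed to emit one output capsule vector per class (legitimate, phishing), and the names `squash`, `capsule_length`, `average_branches`, and `predict` are hypothetical. The convolutional and routing stages are omitted.

```python
import math

LABELS = ("legitimate", "phishing")  # assumed class order, not from the paper


def squash(s):
    """Squashing nonlinearity from Sabour et al. [8] (assumed here):
    keeps the direction of s and maps its length into [0, 1)."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def capsule_length(v):
    """Euclidean length of a capsule vector, read as a class score."""
    return math.sqrt(sum(x * x for x in v))


def average_branches(branch_scores):
    """Average per-class scores over all parallel branches."""
    n = len(branch_scores)
    return [sum(col) / n for col in zip(*branch_scores)]


def predict(branches):
    """branches: one entry per branch, each holding one raw capsule
    vector per class. Returns the winning label and averaged scores."""
    scores = [[capsule_length(squash(v)) for v in caps] for caps in branches]
    avg = average_branches(scores)
    best = max(range(len(avg)), key=lambda i: avg[i])
    return LABELS[best], avg


# Made-up capsule vectors for two branches, for illustration only.
example_branches = [
    [[0.1, 0.2], [1.0, 2.0]],
    [[0.3, 0.1], [0.5, 1.5]],
]
label, scores = predict(example_branches)
```

Averaging the branch scores, rather than taking a vote, keeps each branch's confidence in the final decision; whether the paper averages capsule lengths or some other branch output is not stated in the abstract.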
Date of Conference: 25-27 October 2019
Date Added to IEEE Xplore: 08 December 2019
Conference Location: Xiamen, China

I. Introduction

According to the Anti-Phishing Working Group (APWG) [1], phishing is a cyber attack that employs both social engineering and sophisticated technical subterfuge to steal users' private information, such as financial data. Typically, criminals use spoofed e-mails or other messages to lead users to counterfeit websites designed to lure them into divulging that information. Recent decades have witnessed a dramatic growth in phishing attacks. As reported by APWG [2], 180,768 phishing websites were detected in the first quarter of 2019, up markedly from 138,328 in the fourth quarter of 2018 and 151,014 in the third quarter of 2018. Phishing has caused severe damage to many industries, e.g., Software-as-a-Service (SaaS) and webmail services, payment services, and financial institutions. According to the latest report from the Federal Bureau of Investigation (FBI) [3], identified exposed losses increased by 136% from December 2016 to May 2018, and losses due to phishing attacks have reached 12.5 billion dollars worldwide.

References

1. [online] Available: https://www.antiphishing.org/.
2. G. Aaron, Phishing activity trends report, 1st quarter 2019, [online] Available: http://docs.apwg.org/reports/apwgtrendsreportq12019.pdf.
3. Business e-mail compromise: the 12 billion dollar scam, [online] Available: https://www.ic3.gov/media/2018/180712.aspx.
4. Z. Dou, I. Khalil, A. Khreishah, A. Al-Fuqaha and M. Guizani, "Systematization of knowledge (SoK): A systematic review of software-based web phishing detection", IEEE Communications Surveys & Tutorials, vol. 19, no. 4, pp. 2797-2819, 2017.
5. A. C. Bahnsen, E. C. Bohorquez, S. Villegas, J. Vargas and F. A. Gonzalez, "Classifying phishing URLs using recurrent neural networks", Electronic Crime Research (eCrime) 2017 APWG Symposium on. IEEE, pp. 1-8, 2017.
6. H. Le, Q. Pham, D. Sahoo and S. C. Hoi, "URLNet: Learning a URL representation with deep learning for malicious URL detection", 2018.
7. P. Yang, G. Zhao and P. Zeng, "Phishing website detection based on multidimensional features driven by deep learning", IEEE Access, vol. 7, pp. 15196-15209, 2019.
8. S. Sabour, N. Frosst and G. E. Hinton, "Dynamic routing between capsules", Advances in Neural Information Processing Systems, pp. 3856-3866, 2017.
9. Y. Kim, "Convolutional neural networks for sentence classification", 2014.
10. X. Zhang, J. Zhao and Y. LeCun, "Character-level convolutional networks for text classification", Advances in Neural Information Processing Systems, pp. 649-657, 2015.
11. [online] Available: http://www.phishtank.com/.
12. [online] Available: https://openphish.com/.
13. [online] Available: https://www.alexa.com/.
14. J. A. Hanley and B. J. McNeil, "A method of comparing the areas under receiver operating characteristic curves derived from the same cases", Radiology, vol. 148, no. 3, pp. 839-843, 1983.
15. Y. Wang, R. Agrawal and B.-Y. Choi, "Light weight anti-phishing with user whitelisting in a web browser", Region 5 Conference 2008 IEEE, pp. 1-4, 2008.
16. [online] Available: https://developers.google.com/safe-browsing/.
17. P. Prakash, M. Kumar, R. R. Kompella and M. Gupta, "PhishNet: Predictive blacklisting to detect phishing attacks", INFOCOM 2010 Proceedings IEEE. Citeseer, pp. 1-5, 2010.
18. Y. Zhang, J. I. Hong and L. F. Cranor, "Cantina: A content-based approach to detecting phishing web sites", Proceedings of the 16th International Conference on World Wide Web, pp. 639-648, 2007.
19. G. Xiang, J. Hong, C. P. Rose and L. Cranor, "Cantina+: A feature-rich machine learning framework for detecting phishing web sites", ACM Transactions on Information and System Security (TISSEC), vol. 14, no. 2, p. 21, 2011.
20. Q. Cui, G.-V. Jourdan, G. V. Bochmann, R. Couturier and I.-V. Onut, "Tracking phishing attacks over time", Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, pp. 667-676, 2017.
21. W. Zhang, Y.-X. Ding, Y. Tang and B. Zhao, "Malicious web page detection based on on-line learning algorithm", Machine Learning and Cybernetics (ICMLC) 2011 International Conference on, vol. 4, pp. 1914-1919, 2011.
22. O. K. Sahingoz, E. Buber, O. Demir and B. Diri, "Machine learning based phishing detection from URLs", Expert Systems with Applications, vol. 117, pp. 345-357, 2019.
23. J. Saxe and K. Berlin, "eXpose: A character-level convolutional neural network with embeddings for detecting malicious URLs, file paths and registry keys", 2017.
24. B. Athiwaratkun and J. W. Stokes, "Malware classification with LSTM and GRU language models and a character-level CNN", Acoustics Speech and Signal Processing (ICASSP) 2017 IEEE International Conference on. IEEE, pp. 2482-2486, 2017.
25. M. Nguyen, T. Nguyen and T. H. Nguyen, "A deep learning model with hierarchical LSTMs and supervised attention for anti-phishing", 2018.
26. J. Saxe, R. Harang, C. Wild and H. Sanders, "A deep learning approach to fast format-agnostic detection of malicious web content", 2018.
27. [online] Available: https://www.virustotal.com/.
28. [online] Available: https://commoncrawl.org/.
29. J. Ma, L. K. Saul, S. Savage and G. M. Voelker, "Beyond blacklists: Learning to detect malicious web sites from suspicious URLs", Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1245-1254, 2009.
30. F. Pedregosa et al., "Scikit-learn: Machine learning in Python", Journal of Machine Learning Research, vol. 12, pp. 2825-2830, 2011.