Spam E-mail classification using Machine Learning techniques

  • Sharika Anjum Mondal, Amity University Kolkata
  • Koustav Pal, Amity University Kolkata
  • Kalyan Chatterjee, Amity University Kolkata
  • Sayanti Banerjee, Amity University Kolkata
Keywords: E-mail classification, Machine learning algorithms, Classifier, Naïve Bayes

Abstract

Today, every individual receives a large volume of email, many of which are fraudulent or spam messages. A good email service provider must therefore detect such messages automatically and route them to the spam folder. In this paper, the authors propose a novel technique by which this sorting of email can be done automatically. Using a machine learning method, the authors implement a classifier that successfully detects spam and fraudulent messages and sends them to the spam folder of the mailbox. The paper describes the algorithm and presents the test results.
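The abstract does not specify the algorithm, but the keywords name a Naïve Bayes classifier. As an illustration only, and not the authors' implementation, the following minimal sketch shows how a multinomial Naïve Bayes spam filter with Laplace smoothing might look. The function names (`tokenize`, `train`, `classify`), the whitespace tokenizer, and the toy training messages are all assumptions introduced for this example.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return text.lower().split()

def train(messages):
    """Count words per class from (text, label) pairs, label in {"spam", "ham"}."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def classify(text, model):
    """Return the label with the higher log-posterior score.

    Assumes both classes appear at least once in the training data.
    """
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # Log prior: fraction of training messages carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen during training
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy data, invented purely for illustration.
TOY_DATA = [
    ("win a free prize now", "spam"),
    ("claim your free money today", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]
model = train(TOY_DATA)
```

With this toy model, `classify("free prize money", model)` selects the spam class, and a mail client would then move that message to the spam folder. A practical filter would need a far larger labelled corpus and a more careful tokenizer.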

Author Biographies

Koustav Pal, Amity University Kolkata

B.Tech student in the Department of Electronics and Communication Engineering.

Kalyan Chatterjee, Amity University Kolkata

Assistant Professor in the Department of Electronics and Communication Engineering

Sayanti Banerjee, Amity University Kolkata

Assistant Professor in the Department of Electronics and Communication Engineering

Published
2020-02-02
How to Cite
Mondal, S. A., Pal, K., Chatterjee, K., & Banerjee, S. (2020). Spam E-mail classification using Machine Learning techniques. PREPARE@u® | General Preprint Services, 1(1). https://doi.org/10.36375/prepare_u.a66
Section
Seminar / Conference Paper