DEEP SEMANTIC INTELLIGENCE FOR TWITTER SPAM DETECTION USING LATENT SEMANTIC ANALYSIS

Authors

  • Muhammad Haroon, School of Computer Science and Technology, Xi’an University of Technology, Xi’an, 710048, China.
  • Shakeeb A. Khan, Department of Computer Science & IT, University of Southern Punjab, Multan, Pakistan.
  • Muhammad Umair, Department of Computer Science, National College of Business Administration & Economics (NCBA&E), Sub-Campus Multan, Pakistan.
  • Muhammad Abrar, Department of Computer Science & IT, University of Southern Punjab, Multan, Pakistan.
  • Shoaib Ali Qureshi, Department of Computer Science, Hameeda Rasheed Institute of Science and Technology, Multan, Pakistan.

DOI:

https://doi.org/10.71146/kjmr766

Keywords:

Spam Detection, Twitter, Social Media Security, Machine Learning, Latent Semantic Analysis, Text Classification, Cybersecurity

Abstract

Social media platforms, particularly Twitter, have become integral to global communication, enabling users to share information instantly with large audiences. However, Twitter’s growing popularity has attracted malicious actors who spread misinformation, phishing attempts, and other spam content. This paper introduces a novel hybrid approach that combines Latent Semantic Analysis (LSA) with traditional machine learning classifiers to effectively distinguish between legitimate and spam tweets. We collected and processed over 5.5 million tweets using Twitter’s API, extracted key features using a statistically validated LSA technique, and implemented four supervised learning algorithms: Naïve Bayes, Support Vector Machine, Decision Tree, and Logistic Regression. The experiments were conducted using rigorous 10-fold cross-validation, and models were evaluated based on accuracy, precision, recall, and F1-score. Our LSA-enhanced approach demonstrated significant performance improvements over traditional methods, with the Naïve Bayes classifier achieving 96.82% accuracy, representing a 5.49% improvement over baseline techniques. Additional error analysis revealed that our approach is particularly effective at identifying evolving spam patterns involving promotional content and malicious URLs.
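
The evaluation protocol named in the abstract (10-fold cross-validation scored by accuracy, precision, recall, and F1-score) can be illustrated with a minimal sketch. This is not the authors' code: the function names `k_fold_indices` and `binary_metrics`, the label convention (1 = spam, 0 = legitimate), and the toy labels are all assumptions made for illustration. A real experiment would typically shuffle or stratify the data before splitting it into folds.

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs covering range(n_samples).

    Hypothetical helper: folds are contiguous and unshuffled, and their
    sizes differ by at most one sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1, treating spam (1) as positive."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Toy labels, invented for illustration only (not data from the paper).
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(binary_metrics(y_true, y_pred))
    print([len(test) for _, test in k_fold_indices(25, k=10)])
```

In a full run, each fold's training indices would fit the LSA projection and the classifier, and the per-fold metrics would be averaged. Fitting the projection inside each fold, rather than on the whole corpus, keeps test tweets from influencing the learned features.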

Published

2025-12-03

Section

Engineering and Technology

How to Cite

DEEP SEMANTIC INTELLIGENCE FOR TWITTER SPAM DETECTION USING LATENT SEMANTIC ANALYSIS. (2025). Kashf Journal of Multidisciplinary Research, 2(12), 1-23. https://doi.org/10.71146/kjmr766