DEEP SEMANTIC INTELLIGENCE FOR TWITTER SPAM DETECTION USING LATENT SEMANTIC ANALYSIS
DOI:
https://doi.org/10.71146/kjmr766
Keywords:
Spam Detection, Twitter, Social Media Security, Machine Learning, Latent Semantic Analysis, Text Classification, Cyber security
Abstract
Social media platforms, particularly Twitter, have become integral to global communication, enabling users to share information instantly with large audiences. However, Twitter’s growing popularity has attracted malicious actors who spread misinformation, phishing attempts, and other spam content. This paper introduces a novel hybrid approach that combines Latent Semantic Analysis (LSA) with traditional machine learning classifiers to effectively distinguish between legitimate and spam tweets. We collected and processed over 5.5 million tweets using Twitter’s API, extracted key features using a statistically validated LSA technique, and implemented four supervised learning algorithms: Naïve Bayes, Support Vector Machine, Decision Tree, and Logistic Regression. The experiments were conducted using rigorous 10-fold cross-validation, and models were evaluated based on accuracy, precision, recall, and F1-score. Our LSA-enhanced approach demonstrated significant performance improvements over traditional methods, with the Naïve Bayes classifier achieving 96.82% accuracy, representing a 5.49% improvement over baseline techniques. Additional error analysis revealed that our approach is particularly effective at identifying evolving spam patterns involving promotional content and malicious URLs.
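As a rough illustration of the kind of pipeline the abstract describes, the sketch below chains TF-IDF weighting, LSA (truncated SVD), and a Naïve Bayes classifier, then scores it with 10-fold cross-validation on accuracy, precision, recall, and F1. It is not the authors' implementation. The toy tweets, the TF-IDF weighting, the number of LSA components, and the choice of scikit-learn's GaussianNB are all assumptions made for the example. The Gaussian variant is used because LSA components can be negative, which a multinomial model cannot accept.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus (NOT the paper's data): 1 = spam, 0 = legitimate.
spam_templates = [
    "win a free {} now click http://spam.example/{}",
    "cheap {} deal limited offer visit http://spam.example/{}",
]
ham_templates = [
    "had a great {} with friends today number {}",
    "reading about {} this morning thread {}",
]
topics = ["phone", "coffee", "movie", "football", "holiday"]

tweets, labels = [], []
for i in range(20):
    topic = topics[i % len(topics)]
    tweets.append(spam_templates[i % 2].format(topic, i))
    labels.append(1)
    tweets.append(ham_templates[i % 2].format(topic, i))
    labels.append(0)

METRICS = ["accuracy", "precision", "recall", "f1"]


def build_lsa_classifier(n_components=5):
    """TF-IDF term weighting -> LSA via truncated SVD -> Gaussian Naive Bayes.

    n_components is an assumed value chosen for the toy corpus.
    """
    return make_pipeline(
        TfidfVectorizer(lowercase=True),
        TruncatedSVD(n_components=n_components, random_state=0),
        GaussianNB(),
    )


def evaluate(texts, y, n_splits=10):
    """Return the mean of each metric over stratified k-fold cross-validation."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    scores = cross_validate(
        build_lsa_classifier(), texts, y, cv=cv, scoring=METRICS
    )
    return {name: float(scores[f"test_{name}"].mean()) for name in METRICS}


results = evaluate(tweets, labels)
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

Because the vectorizer and the SVD sit inside the pipeline, they are refitted on the training portion of each fold, so no information from the held-out tweets leaks into the LSA space. Swapping GaussianNB for a support vector machine, decision tree, or logistic regression estimator would give an analogous comparison across the four classifier families named in the abstract.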
License
Copyright (c) 2025 Muhammad Haroon, Shakeeb A. Khan, Muhammad Umair, Muhammad Abrar, Shoaib Ali Qureshi (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.
