A Study on Spam Detection in Twitter Based on Machine Learning

June 9, 2019
Posted by: RSIS
Category: Computer Science and Engineering

International Journal of Research and Scientific Innovation (IJRSI) | Volume VI, Issue V, May 2019 | ISSN 2321–2705

A Study on Spam Detection in Twitter Based on Machine Learning

Nazia Nusrath Ul Ain¹, Meena Kumari K S²

¹Dept. of Information Science & Engineering, ²Dept. of Computer Science & Engineering
Brindavan College of Engineering

Abstract- Spam has continued to grow at a disturbing rate despite ongoing reduction efforts. It is considerably more pervasive on microblogging websites, given their increased popularity and ease of access. One of the most prominent microblogging websites is Twitter. Every second, on average, around 6,000 tweets are posted on Twitter, which corresponds to over 500 million tweets per day. Spammers exploit the platform's popularity to trap users in malicious activities by posting spam tweets. Tools exist to stop spammers, but they can only block malicious links; they cannot protect the user in real time, as early as possible. Researchers have applied different approaches to detect spam. In this paper, we study these approaches, which rely on user-based features, tweet-based features, or the tweet-text feature. Using the tweet-text feature helps us to identify spam tweets even if the spammer creates a new account, which was not possible with the user-based and tweet-based features alone. The existing system that used the tweet-text feature evaluated four machine learning algorithms: Support Vector Machine, Neural Network, Random Forest and Gradient Boosting [1]. In our proposed system, using cross-validation techniques, the best performance was obtained with the Naïve Bayes model. With the Naïve Bayes model, we are able to achieve accuracy surpassing the existing solution.

Keywords-Naïve Bayes, Random Forest, Spam, ham

I. INTRODUCTION

The Internet and social media have become increasingly popular in recent years. Internet users often spend a lot of time on social media to follow events of interest, post messages, share ideas and make friends around the world. These platforms have become an integral part of people’s daily lives. One such platform is Twitter, which is rated as the most popular social network [2].
But with great possibilities come great challenges. The exponential growth of Twitter also invites unwanted activity on the platform. Every second, on average, around 6,000 tweets are posted on Twitter, which corresponds to over 500 million tweets per day. Spammers exploit the platform's popularity to trap users in malicious activities by posting spam tweets.
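
The tweet-text approach summarized in the abstract (a Naïve Bayes classifier evaluated with cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation: the excerpt does not give their feature extraction, dataset, or cross-validation settings, so the toy tweets, the names `tokenize`, `NaiveBayesText` and `cross_val_accuracy`, the add-one smoothing, and the choice of three folds are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into word-like tokens."""
    return re.findall(r"[a-z0-9#@']+", text.lower())


class NaiveBayesText:
    """Multinomial Naive Bayes over tweet text with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # number of tweets per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore tokens never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def cross_val_accuracy(texts, labels, k=3):
    """Mean accuracy over k folds; fold i holds out items whose index % k == i."""
    scores = []
    for i in range(k):
        train = [j for j in range(len(texts)) if j % k != i]
        test = [j for j in range(len(texts)) if j % k == i]
        model = NaiveBayesText().fit(
            [texts[j] for j in train], [labels[j] for j in train]
        )
        hits = sum(model.predict(texts[j]) == labels[j] for j in test)
        scores.append(hits / len(test))
    return sum(scores) / k


# Hypothetical toy data, invented for this example (not from the paper).
TWEETS = [
    ("win a free prize now click here", "spam"),
    ("meeting friends for lunch today", "ham"),
    ("free followers click this link now", "spam"),
    ("watching the match with my family", "ham"),
    ("claim your free gift card now", "spam"),
    ("great talk at the conference today", "ham"),
]
texts = [t for t, _ in TWEETS]
labels = [l for _, l in TWEETS]

model = NaiveBayesText().fit(texts, labels)
print(model.predict("free prize click now"))
print(model.predict("lunch with family today"))
print(cross_val_accuracy(texts, labels, k=3))
```

Because the classifier looks only at the words of a tweet, it needs no account history, which is the property the abstract highlights for catching spam from newly created accounts. A real evaluation would need a labeled tweet corpus far larger than this toy list.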
