Unsupervised Machine Learning for Bot Detection on Twitter: Generating and Selecting Features for Accurate Clustering

Authors

  • Raad Al-azawi University of Babylon
  • Safaa O. AL-mamory College of Business Informatics, University of Information Technology and Communications, Bagdad, Iraq

DOI:

https://doi.org/10.4114/intartif.vol27iss73pp142-158

Keywords:

Twitter Bot, Feature selection, Feature extraction, Unsupervised machine learning, Clustering algorithms

Abstract

Twitter is a popular social media platform widely used by individuals and businesses. However, it is vulnerable to bot attacks, which can have negative effects on society. Supervised machine learning techniques can detect bots, but they require labeled data to differentiate between human and bot users. Twitter generates a large volume of unlabeled data, which is expensive to label. Unsupervised machine learning techniques, specifically clustering algorithms, are therefore well suited to handling this data and reducing computational complexity. Effective feature selection is necessary for clustering because some features are more informative than others. This study aims to improve feature reliability, introduce new features, and reduce the feature set in order to improve bot identification accuracy with clustering algorithms. The study achieved an accuracy of 0.99 with four clustering algorithms: agglomerative hierarchical clustering, k-medoids, DBSCAN, and k-means. This was accomplished by reducing the dimensionality of the dataset and selecting the essential features. By employing unsupervised machine learning techniques, bot attacks on Twitter can be detected and mitigated more efficiently, which can positively impact society.
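The general idea of selecting a few features before clustering can be illustrated with a minimal sketch. It is not the authors' pipeline: the data is synthetic, the variance ranking is a simple stand-in for whatever selection method the study uses, and every function name (`min_max_scale`, `top_variance_features`, `k_means_2`, `make_synthetic_accounts`) is hypothetical. Ground-truth labels are used only to score the clusters afterwards, never to form them.

```python
import random

def min_max_scale(rows):
    """Scale every column to the [0, 1] range so variances are comparable."""
    scaled_cols = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0  # guard against constant columns
        scaled_cols.append([(v - lo) / span for v in col])
    return [list(r) for r in zip(*scaled_cols)]

def top_variance_features(rows, keep):
    """Return the indices of the `keep` highest-variance columns (assumed proxy for usefulness)."""
    n = len(rows)
    scored = []
    for j, col in enumerate(zip(*rows)):
        mean = sum(col) / n
        scored.append((sum((v - mean) ** 2 for v in col) / n, j))
    return sorted(j for _, j in sorted(scored, reverse=True)[:keep])

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means_2(rows, iterations=20):
    """Plain two-cluster k-means with a deterministic start: first row and the row farthest from it."""
    centroids = [list(rows[0]), list(max(rows, key=lambda r: sq_dist(r, rows[0])))]
    labels = [0] * len(rows)
    for _ in range(iterations):
        labels = [0 if sq_dist(r, centroids[0]) <= sq_dist(r, centroids[1]) else 1
                  for r in rows]
        for c in (0, 1):
            members = [r for r, l in zip(rows, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def clustering_accuracy(labels, truth):
    """Cluster ids are arbitrary, so score the better of the two possible label mappings."""
    agree = sum(l == t for l, t in zip(labels, truth)) / len(truth)
    return max(agree, 1 - agree)

def make_synthetic_accounts(n_per_class=50, seed=0):
    """Synthetic stand-in data: columns 0-1 separate the classes, columns 2-3 are noise."""
    rng = random.Random(seed)
    rows, truth = [], []
    for label, centre in ((0, 0.2), (1, 0.8)):  # 0 = human, 1 = bot (assumed encoding)
        for _ in range(n_per_class):
            informative = [rng.gauss(centre, 0.05), rng.gauss(centre, 0.05)]
            noise = [rng.gauss(0.5, 0.15), rng.gauss(0.5, 0.15)]
            rows.append(informative + noise)
            truth.append(label)
    return rows, truth

rows, truth = make_synthetic_accounts()
scaled = min_max_scale(rows)
selected = top_variance_features(scaled, keep=2)       # reduce four features to two
reduced = [[r[j] for j in selected] for r in scaled]
labels = k_means_2(reduced)
accuracy = clustering_accuracy(labels, truth)
print("selected feature indices:", selected)
print("clustering accuracy:", accuracy)
```

On real account data the separation would be far less clean than in this toy set, and a library implementation of k-means, k-medoids, DBSCAN, or agglomerative clustering would normally replace the hand-written loop.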

 


Author Biography

Safaa O. AL-mamory, College of Business Informatics, University of Information Technology and Communications, Bagdad, Iraq

 

 

 

Published

2024-02-25

How to Cite

Al-azawi, R., & O. AL-mamory, S. (2024). Unsupervised Machine Learning for Bot Detection on Twitter: Generating and Selecting Features for Accurate Clustering. Inteligencia Artificial, 27(73), 142–158. https://doi.org/10.4114/intartif.vol27iss73pp142-158