Hate Speech Detection using Deep Learning and Hybrid Features

Authors

  • Shruthi P, JSS Science and Technology University, Mysuru, Karnataka, India
  • Dr. Anil Kumar K M, JSS Science and Technology University, Mysuru, Karnataka, India

Keywords:

Hybrid Features, Deep Learning, Hate Speech, Cyber-bullying, Deep Learning Features

Abstract

Automated detection of hate speech and other inappropriate text on social media and other internet platforms has attracted considerable interest in recent years and has become a valuable research topic for both industry and academia. Applications increasingly need to identify disruptive content, analyze sentiment, and detect cyberbullying, flames, threats, and hatred directed at individuals or at particular communities or groups. Text classification is a very challenging task because of the nature and complexity of language, especially its context, micro words, emojis, typographical errors, and the hidden sarcasm present in the text.

We collected tweets and classified them into three categories: sexism, racism, and none. In the proposed work, we combine features learned by deep learning methods with basic features, such as word n-grams and tweet-specific syntactic features, to form a hybrid feature set. We also focus on improving the preprocessing steps to reduce the number of missing embeddings and to increase the vocabulary for efficient feature learning. We experimented with different neural networks for feature learning. Our work delivers the hybrid features and the appropriate preprocessing techniques required for efficient classification of the standard dataset of 16k annotated tweets related to hate speech. The combination of an LSTM (Long Short-Term Memory) network trained on random embeddings for deep learning feature extraction and Logistic Regression as the classifier over the hybrid features is found to be the best model; it outperforms the state-of-the-art methods reported in the literature, with a substantial improvement in F1 score.
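
The idea of a hybrid feature set can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the exact syntactic features, the n-gram range, or the dimensionality of the learned features, so the function names, the four syntactic counts, and the two-value learned vector below are all assumptions made for illustration. The sketch only shows how word n-gram counts, tweet-specific counts, and a vector taken from a trained network might be concatenated into one feature vector for a downstream classifier such as Logistic Regression.

```python
import re
from collections import Counter


def tokenize(tweet):
    """Lowercase a tweet and split it into word tokens, keeping # and @ prefixes."""
    return re.findall(r"[#@]?\w+", tweet.lower())


def word_ngrams(tokens, n_max=2):
    """Count word n-grams for n = 1..n_max (n_max=2 is an assumed setting)."""
    grams = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams[" ".join(tokens[i:i + n])] += 1
    return grams


def syntactic_features(tweet):
    """Hypothetical tweet-specific counts: hashtags, mentions, URLs, exclamation marks."""
    return [
        len(re.findall(r"#\w+", tweet)),
        len(re.findall(r"@\w+", tweet)),
        len(re.findall(r"https?://\S+", tweet)),
        tweet.count("!"),
    ]


def hybrid_vector(tweet, vocabulary, learned_features):
    """Concatenate n-gram counts, syntactic counts, and learned features.

    `learned_features` stands in for a vector extracted from a trained
    network (for example, an LSTM); here it is supplied by the caller.
    """
    grams = word_ngrams(tokenize(tweet))
    ngram_part = [grams.get(g, 0) for g in vocabulary]
    return ngram_part + syntactic_features(tweet) + list(learned_features)


# Usage: a three-entry n-gram vocabulary and a placeholder learned vector.
vector = hybrid_vector(
    "@user this is #notok !!",
    vocabulary=["this", "this is", "#notok"],
    learned_features=[0.12, -0.4],
)
print(len(vector))  # 3 n-gram counts + 4 syntactic counts + 2 learned values
```

In a full pipeline, one such vector per tweet would be stacked into a matrix and passed to the classifier; the placeholder learned vector would be replaced by the output of the trained feature-learning network.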

Author Biographies

Shruthi P, JSS Science and Technology University, Mysuru, Karnataka, India

Shruthi P is currently pursuing her M.Tech (Master of Technology) in Computer Engineering in the Department of Computer Science & Engineering, JSS Science and Technology University, Mysuru, Karnataka, India. Prior to this, she worked for 5+ years as a senior software developer at Schneider Electric R&D, Bengaluru, India, where she mainly developed web applications using technologies such as ASP.NET MVC, Angular, Node.js, and HTML5. Her research interests include Web Mining, Text Mining, and Sentiment Analysis. During her M.Tech, she has worked on Machine Learning and Deep Learning techniques using Python for Sentiment (hate or abusive) Analysis and Text Classification under the guidance of her professor, Dr. Anil Kumar K M.

Dr. Anil Kumar K M, JSS Science and Technology University, Mysuru, Karnataka, India

Dr. Anil Kumar K M is currently working as an Associate Professor in the Department of Computer Science & Engineering, JSS Science and Technology University, Mysuru, Karnataka, India. He completed his postdoctoral research at Deakin University under Professor Jemal Abawajy and received his Ph.D. from the University of Mysore under the supervision of Prof. Suresha, Chairman, DOS in Computer Science. He has 20 years of teaching experience and 12 years of research experience. His research interests include Text Mining, Sentiment Analysis, Data Mining, Opinion Mining, Web Mining, Data Analytics, Computer Networks, and Cyber Security. He has received 5 grants from different government and private funding agencies for research and development. He has published nearly 39 research papers in national and international proceedings.

Published

2021-01-08

How to Cite

Shruthi P, & Dr. Anil Kumar K M. (2021). Hate Speech Detection using Deep Learning and Hybrid Features. Inteligencia Artificial, 23(66), 97-111. Retrieved from https://journal.iberamia.org/index.php/intartif/article/view/523