Weekly Quantum Group Webinar: Martin Canaan Mafunda, Comparison of machine learning algorithms for an automated tweet identification system
2022-10-21 @ 14:00 - 15:00
Comparison of machine learning algorithms for an automated tweet identification system
Martin Canaan Mafunda (UKZN)
Abstract: In this study, six machine learning (ML) algorithms are trained on 3100 tweets expressing pro-Zuma, anti-Zuma or neutral sentiments. The aim is to compare the algorithms’ performance on tweet identification and to use the findings to recommend an optimal way of solving the natural language processing task of automatically assigning labels to unseen tweets in particular, or texts in general. The classification algorithms considered are Support Vector Machines (SVM), K-Nearest Neighbors (KNN), Multi-Layer Perceptron (MLP), Decision Trees (DT), Naive Bayes (NB) and Discriminant Analysis (DA). The highest F-score of 79% is reported for the SVM, MLP and DT algorithms, while the lowest F-score of 63% is reported for the NB algorithm. Analysis of model misclassifications revealed considerable scope for developing state-of-the-art tweet classification systems by combining these six classification algorithms into an ensemble tweet classification system. Using an ensemble model could potentially stabilize the proportion of misclassified tweets at 39% (anti-Zuma), 39% (pro-Zuma) and 22% (neutral). The findings are significant in that they form a basis for creating innovative data mining techniques in resource-limited settings, e.g. with limited time and computational resources.
Keywords: Support Vector Machines, K-Nearest Neighbors, Multi-Layer Perceptron, Discriminant Analysis, Naive Bayes, tweet classification system, ensemble model.
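The two ideas the abstract relies on, scoring each classifier by F-score and combining several classifiers into an ensemble, can be illustrated with a minimal pure-Python sketch. The abstract does not say how the F-score was averaged or how the ensemble would combine its members, so macro averaging and hard majority voting are assumptions here. The label names and the toy predictions are hypothetical and are not taken from the study's data.

```python
from collections import Counter

# Hypothetical label names for the three sentiment classes.
LABELS = ("pro", "anti", "neutral")


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def majority_vote(predictions):
    """Hard-voting ensemble over per-model label lists.

    Ties are broken in favour of the tied label that appears first
    in model order (an arbitrary choice made for this sketch).
    """
    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        best = max(counts.values())
        combined.append(next(v for v in votes if counts[v] == best))
    return combined


# Toy, made-up predictions from three hypothetical models on six tweets.
y_true = ["pro", "anti", "neutral", "pro", "anti", "neutral"]
model_a = ["pro", "anti", "neutral", "anti", "anti", "pro"]
model_b = ["pro", "pro", "neutral", "pro", "anti", "neutral"]
model_c = ["anti", "anti", "neutral", "pro", "neutral", "neutral"]

ensemble = majority_vote([model_a, model_b, model_c])

for name, preds in [("model_a", model_a), ("model_b", model_b),
                    ("model_c", model_c), ("ensemble", ensemble)]:
    print(f"{name}: macro F1 = {macro_f1(y_true, preds):.3f}")
```

In this toy case each model errs on different tweets, so the vote corrects the individual mistakes. Real classifiers often make correlated errors, in which case voting helps much less.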