Bus drivers’ behaviors play an important role in the quality and safety of bus transportation. By collecting and analyzing these behaviors, service operators can define the policies and actions needed to handle driver misbehavior properly and so guarantee safe, good-quality service. Complaint web boards are direct and transparent, which makes them a popular channel for passengers to lodge service complaints. Filtering and categorizing these complaints manually is a complicated and resource-intensive task. This work applies text extraction and machine learning techniques to automatically classify complaints about bus driver behavior and service quality into four categories: 1) bus stops, 2) bus driver behavior, 3) timetable and 4) service quality. The text extraction employed a longest-matching algorithm with a domain-specific augmented dictionary. The extracted words are represented as word vectors with TF-IDF weightings. Five prominent techniques, the J48 decision tree, Naïve Bayes, K-Nearest Neighbour, Support Vector Machine and Artificial Neural Network (ANN), were used to create classification models. The results suggested that the ANN model is superior to the other models, with accuracy as high as 90.23%, 91.9% precision, 90.2% recall and a 90.5% F-measure. The text processing practice and the model could be used to implement a real-world system that automatically classifies complaints about bus drivers’ behaviors.
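
The two text-processing steps named above, longest-matching segmentation against a dictionary and TF-IDF weighting, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy dictionary, the unspaced English sample strings and the particular TF-IDF variant (relative term frequency times the natural log of inverse document frequency) are all assumptions made for illustration.

```python
import math
from collections import Counter


def longest_match_segment(text, dictionary):
    """Greedy left-to-right segmentation: at each position take the longest
    dictionary entry that matches; fall back to a single character."""
    max_len = max((len(word) for word in dictionary), default=1)
    tokens, i = [], 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink the window.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in dictionary:
                match = text[i:j]
                break
        if match is None:
            match = text[i]  # unknown character becomes its own token
        tokens.append(match)
        i += len(match)
    return tokens


def tfidf(docs):
    """docs is a list of token lists; returns one {term: weight} dict per doc.
    Assumed variant: (count / doc length) * ln(N / document frequency)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


# Hypothetical domain dictionary and complaints written without spaces.
dictionary = {"bus", "busdriver", "driver", "is", "late", "rude"}
complaints = ["busdriverisrude", "busislate"]

docs = [longest_match_segment(c, dictionary) for c in complaints]
vectors = tfidf(docs)
print(docs)
print(vectors)
```

Adding the domain term "busdriver" to the dictionary makes the segmenter prefer it over the shorter "bus", which is the effect an augmented dictionary is meant to have. A term that occurs in every document, such as "is" here, receives a weight of zero, so the resulting vectors emphasize the words that distinguish one complaint from another before they are passed to a classifier.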