This paper proposes an advanced tree-based ensemble algorithm for public transportation usage rate prediction.
Authors
Gorkem Sariyer, Department of Business Administration, Yasar University, Izmir, Turkey.
Sachin Kumar Mangla, Professor, Jindal Global Business School, O.P. Jindal Global University, Sonipat, Haryana, India; Knowledge Management & Business Decision Making, University of Plymouth, UK.
Mert Erkan Sozen, Business Development Chief, Izmir Metro Company, Izmir, Turkey.
Guo Li, School of Management, Beijing Institute of Technology, Beijing, China; Center for Energy and Environmental Policy Research, Beijing Institute of Technology, Beijing, China.
Yigit Kazancoglu, Department of Logistics Management, Yasar University, Izmir, Turkey.
Summary
Public transportation usage prediction is valuable for the sustainable development of transportation systems, particularly in crowded megacities. Machine learning technologies are of great interest for predicting public transportation usage. While these technologies outperform many other techniques, they suffer from limited interpretability. Explainable artificial intelligence (XAI) tools and techniques that offer post-hoc explanations of the obtained predictions are gaining popularity.
This paper proposes an advanced tree-based ensemble algorithm for public transportation usage rate prediction. We aim to explain the predictions both with SHapley Additive exPlanations (SHAP), the most widely used XAI technique, and in light of the rules presented.
To predict total public transportation usage, the proposed model combines all types of public transportation, categorized as ferry, railway, and bus, unlike most existing studies, which focus on a single type of public transport. Besides the type of transportation, the day of the week, whether the day is special, and the daily ratio of passenger types were identified as model features for predicting the daily usage of each type of public transportation.
We tested the proposed model using an open data set from Izmir City, Turkey. While the model had superior prediction performance, the explanations showed that the type of public transportation, weekday, and the ratio of full-fare passengers have the highest SHAP values, and the model features have many interactions. We also validated our results using an online data set showing Google search trends.
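To make the idea of SHAP-style attribution and feature interactions concrete, the following is a minimal sketch that computes exact Shapley values for a toy prediction function. It is not the authors' model or data: the function `toy_predict`, its coefficients, the feature names (`mode`, `weekday`, `full_fare_ratio`), and the single-point baseline are all hypothetical, chosen only for illustration. Practical SHAP tooling typically averages over a background data set rather than one baseline point.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over every possible ordering in which features are switched from
    their baseline value to their value in `instance`."""
    features = list(instance)
    totals = {name: 0.0 for name in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = instance[name]
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_predict(x):
    """Hypothetical usage-rate model (made-up numbers, not from the paper)."""
    mode_effect = {"bus": 0.6, "railway": 0.3, "ferry": 0.1}[x["mode"]]
    weekday_effect = 0.2 if x["weekday"] else 0.0
    # Interaction: the full-fare share matters more on weekdays.
    fare_effect = x["full_fare_ratio"] * (0.4 if x["weekday"] else 0.1)
    return mode_effect + weekday_effect + fare_effect


# Hypothetical day to explain, and a hypothetical reference day.
instance = {"mode": "bus", "weekday": True, "full_fare_ratio": 0.5}
baseline = {"mode": "ferry", "weekday": False, "full_fare_ratio": 0.0}

phi = shapley_values(toy_predict, instance, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the prediction for the explained day and the prediction for the reference day, which is the additivity property that SHAP relies on. Because the toy model ties the fare effect to the weekday flag, the credit for that interaction is split between the two features instead of being assigned to only one of them.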
Published in: Omega (United Kingdom)