dc.description.abstract |
Weather patterns are now changing dramatically around the world. To preserve natural ecosystems and support sustainable development, countries are expected to adopt energy-saving, low-carbon vehicles as green transportation within urban transit. Public bike sharing arose from this idea. Although the bus and MRT systems in Taiwan have become mature and complete, a gap often remains between a public transportation station and the traveler's destination. To meet this last-mile demand, the Taipei City Government began implementing a public bike-sharing system called “YouBike” in March 2009, hoping it would replace private vehicles and relieve parking shortages and traffic congestion. The government also expects YouBike to extend the service coverage and operating hours of public transportation. However, the most important factor affecting citizens' intention to use public bikes is the shortage of bikes at individual stations.
This dissertation focuses on YouBike shortages. It applies a supervised machine learning algorithm to related open data, e.g., YouBike availability information, weather statistics, and holiday and vacation information from the government schedule, to build a model that predicts bike shortages. It also uses Tableau, a BI tool, to visualize the analysis results so that staff can make bike adjustment decisions based on comprehensive dashboard information.
The experiment loads the source data into HDFS and uses Apache Spark for preprocessing. It then compares the results of three classification algorithms provided by Apache Spark MLlib, namely Naïve Bayes, SVM, and Random Forest, to find the best predictive model for YouBike adjustment. To make the information easy to access, the study uses Tableau to build a dashboard that presents the prediction results alongside past and current YouBike availability.
In experiments on YouBike availability data from 2015/11 to 2016/01, Random Forest performed best, with an average AUC of 0.87 when the training data set had sufficient data attributes. The dissertation therefore suggests that operators use Random Forest to predict bike shortages and improve YouBike dispatch operations. | en_US |