
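A MapReduce-based Method for Parallel Calculation of Bus Passenger Origin and Destination from Massive Transit Data
Cite as:WU Qunyong, SU Keyun, ZOU Zhijie. A MapReduce-based Method for Parallel Calculation of Bus Passenger Origin and Destination from Massive Transit Data[J]. Geo-information Science, 2018, 20(5): 647-655.
Authors:WU Qunyong  SU Keyun  ZOU Zhijie
Affiliations:1. National & Local Joint Engineering Research Center of Geo-spatial Information Technology, Fuzhou University, Fuzhou 350002, China; 2. Key Laboratory of Spatial Data Mining & Information Sharing of Ministry of Education, Fuzhou 350002, China
Funding:National Natural Science Foundation of China (41471333); Central Government Guided Local Science and Technology Development Special Project (2017L3012)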
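Abstract:Bus passenger origin-destination (OD) data reflect residents' travel characteristics and travel demand. They are essential basic data for evaluating, scheduling, and optimizing bus systems, and have practical value for urban planning. Existing bus OD estimation methods are mostly suited to small volumes of bus data and cannot directly and quickly estimate OD for massive numbers of bus passengers, so this paper proposes a MapReduce-based parallel method for estimating massive bus passenger OD. First, the bus data are migrated from a relational database to HBase. Next, using the MapReduce parallel computing framework, the work is split into multiple map tasks according to the number of Regions holding the IC card data in HBase; in each map task the Map function computes the boarding stop, and the Reduce function merges the boarding stops by user and writes them to HDFS. Then, on the basis of the boarding records, the work is split into multiple map tasks according to the number of blocks stored in HDFS, and for each passenger's travel records the alighting stop is estimated fairly accurately by combining the trip-chain method with patterns of historically similar travel behavior. Finally, a case study uses IC card data and bus GPS data from Xiamen for June 13 to 26, 2015. In total, 295 bus lines, 16 879 661 boarding records, and 14 410 058 complete OD records were computed, the OD records accounting for 78.9% of the IC card data, and computational efficiency improved considerably over the traditional method. The results show that the method estimates passengers' boarding and alighting stops fairly accurately and with high computational efficiency.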

Keywords:massive bus data  bus OD  MapReduce  bus trip chain  travel patterns  
Received:2017-08-10
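
The first stage described in the abstract (Map computes each boarding stop, Reduce merges boarding stops by user) can be illustrated with a small in-memory sketch. This is not the authors' Hadoop/HBase code. The abstract does not state how a card swipe is matched to a stop, so the rule used here (take the stop whose arrival time on the same vehicle is nearest the swipe time) is an assumption, and `ARRIVALS`, `map_boarding`, `reduce_by_user`, and `run_stage_one` are hypothetical names.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical GPS-derived arrivals: vehicle_id -> [(arrival_time_s, stop_id)], sorted by time.
ARRIVALS = {
    "bus1": [(100, "A"), (400, "B"), (700, "C")],
}


def map_boarding(swipe):
    """Map step: emit (card_id, (swipe_time, vehicle_id, boarding_stop)) for one swipe.

    Assumption: the boarding stop is the stop whose arrival time on the same
    vehicle is nearest the swipe time.
    """
    card_id, vehicle_id, swipe_time = swipe
    arrivals = ARRIVALS.get(vehicle_id, [])
    if not arrivals:
        return []  # no GPS data for this vehicle, so the swipe cannot be matched
    times = [t for t, _ in arrivals]
    i = bisect_left(times, swipe_time)
    # Only the arrivals just before and just after the swipe can be nearest.
    candidates = arrivals[max(i - 1, 0): i + 1]
    _, stop_id = min(candidates, key=lambda a: abs(a[0] - swipe_time))
    return [(card_id, (swipe_time, vehicle_id, stop_id))]


def reduce_by_user(pairs):
    """Reduce step: merge boarding records by card ID, ordered by swipe time."""
    grouped = defaultdict(list)
    for card_id, record in pairs:
        grouped[card_id].append(record)
    return {card_id: sorted(records) for card_id, records in grouped.items()}


def run_stage_one(swipes):
    """Run map then reduce sequentially; a real job would distribute both steps."""
    pairs = [kv for swipe in swipes for kv in map_boarding(swipe)]
    return reduce_by_user(pairs)
```

In a real MapReduce job the framework performs the grouping between the two steps, and each map task would read one HBase Region rather than a Python list.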

A MapReduce-based Method for Parallel Calculation of Bus Passenger Origin and Destination from Massive Transit Data
WU Qunyong, SU Keyun, ZOU Zhijie. A MapReduce-based Method for Parallel Calculation of Bus Passenger Origin and Destination from Massive Transit Data[J]. Geo-information Science, 2018, 20(5): 647-655.
Authors:WU Qunyong  SU Keyun  ZOU Zhijie
Institution:1. National & Local Joint Engineering Research Center of Geo-spatial Information Technology, Fuzhou University, Fuzhou 350002, China; 2. Key Laboratory of Spatial Data Mining & Information Sharing of MOE, Fuzhou 350002, China
Abstract:Bus passengers' origins and destinations (OD) reflect residents' travel characteristics and demand. They are important basic data for bus system evaluation, scheduling, and route optimization, and have significant practical value in urban planning. Existing OD estimation methods are mostly suited to small amounts of bus data and cannot directly and rapidly calculate OD for massive numbers of transit passengers. To address this, a MapReduce-based parallel method for calculating the origins and destinations of massive numbers of transit passengers is proposed. First, a database migration tool is used to transfer massive bus data stored in a relational database to HBase. Second, the MapReduce parallel computing framework divides the IC card data into multiple map tasks according to the number of regions in HBase to calculate origins; the Reduce function then groups the origins by user and stores them in HDFS. Third, the destinations are estimated from the origins in parallel, with the origin records divided into multiple map tasks according to the number of blocks stored in HDFS. For each passenger's travel records, destinations are estimated fairly accurately by combining the public transit trip chain method with historical travel similarity. Finally, IC card data and bus GPS data from Xiamen for June 13 to 26, 2015 are taken as an example: 295 bus lines, 16 879 661 boarding records, and 14 410 058 complete OD pairs are calculated, the OD pairs accounting for 78.9% of the IC card data. Compared with the traditional method, computational efficiency is substantially improved. The results show that the parallel method not only calculates bus passenger OD fairly accurately but also achieves higher computational efficiency.
Keywords:massive transit data  public transit origin and destination  MapReduce  public transit trip chain  travel rule  
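
The destination step combines the trip chain method with historical travel similarity, but the abstract does not say how the two are combined. The sketch below shows one common reading under stated assumptions: the alighting stop of a trip is the downstream stop nearest the passenger's next boarding stop (the last trip of the day is chained to the first boarding stop), and when no stop lies within a walking threshold the most frequent historical alighting stop for the same route and boarding stop is used. `ROUTES`, `MAX_WALK_KM`, the planar coordinates, and all function names are hypothetical.

```python
from collections import Counter
from math import hypot

# Hypothetical routes: ordered stops as (stop_id, x_km, y_km) in planar coordinates.
ROUTES = {
    "r1": [("A", 0.0, 0.0), ("B", 1.0, 0.0), ("C", 2.0, 0.0)],
    "r2": [("C2", 2.0, 0.1), ("D", 2.0, 1.0), ("E", 2.0, 2.0)],
}
MAX_WALK_KM = 0.5  # assumed transfer walking threshold, not taken from the paper


def stop_xy(route, stop):
    """Return the (x, y) position of a stop on a route."""
    return next((x, y) for name, x, y in ROUTES[route] if name == stop)


def chain_alighting(trip, next_trip):
    """Trip chain rule: the downstream stop nearest the next boarding stop."""
    route, board = trip
    stops = ROUTES[route]
    board_index = [name for name, _, _ in stops].index(board)
    downstream = stops[board_index + 1:]
    if not downstream:
        return None
    nx, ny = stop_xy(*next_trip)
    name, x, y = min(downstream, key=lambda s: hypot(s[1] - nx, s[2] - ny))
    return name if hypot(x - nx, y - ny) <= MAX_WALK_KM else None


def historical_alighting(trip, history):
    """Fallback: most frequent past alighting stop for this route and boarding stop."""
    counts = {k[2]: c for k, c in history.items() if k[:2] == trip}
    return max(counts, key=counts.get) if counts else None


def infer_destinations(day_trips, history):
    """day_trips: one passenger's (route, boarding_stop) pairs for a day, in time order.
    history: Counter keyed by (route, boarding_stop, alighting_stop)."""
    results = []
    n = len(day_trips)
    for i, trip in enumerate(day_trips):
        alight = None
        if n > 1:
            # The last trip is chained back to the day's first boarding stop.
            alight = chain_alighting(trip, day_trips[(i + 1) % n])
        if alight is None:
            alight = historical_alighting(trip, history)
        results.append((trip[0], trip[1], alight))
    return results
```

Trips for which both rules return `None` stay without a destination, which is consistent with the abstract reporting fewer complete OD pairs than boarding records.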
This article is indexed in CNKI and other databases.