首页 | 本学科首页   官方微博 | 高级检索  
     检索      

一种面向CPU/GPU异构环境的协同并行空间插值算法
引用本文:王鸿琰,关雪峰,吴华意.一种面向CPU/GPU异构环境的协同并行空间插值算法[J].武汉大学学报(信息科学版),2017,42(12):1688-1695.
作者姓名:王鸿琰  关雪峰  吴华意
作者单位:1.武汉大学测绘遥感信息工程国家重点实验室, 湖北 武汉, 430079
基金项目:国家自然科学基金41301411湖北省自然科学基金2015CFB399
摘    要:CPU/GPU异构混合系统是一种新型高性能计算平台,但现有并行空间插值算法仅依赖CPU或GPU进行加速,迫切需要研究协同并行空间插值算法以充分利用异构计算资源,进一步提升插值效率。以薄板样条函数插值为例,提出一种CPU/GPU协同并行插值算法以加速海量激光雷达(light detector & ranger,LiDAR)点云生成数字高程模型(DEM)。通过插值任务的分解与抽象封装以屏蔽底层硬件执行模式的差异性,同时在多级协同并行框架基础上设计了Greedy-SET动态调度策略,策略顾及底层硬件能力的差异性,以实现异构并行资源的充分利用和良好负载均衡。实验表明,协同并行插值算法在高性能工作站上取得19.6倍的加速比,相比单一CPU或GPU并行算法,其效率提升分别达到54%和44%,实现了高效的协同并行处理。

关 键 词:CPU/GPU异构环境    协同并行算法    空间插值    薄板样条插值函数    LiDAR点云
收稿时间:2016-03-21

A Collaborative Parallel Spatial Interpolation Algorithmon Oriented Towards the Heterogeneous CPU/GPU System
Institution:1.State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China2.Collaborative Innovation Center of Geospatial Technology, Wuhan 430079, China
Abstract:Nowadays the heterogeneous CPU/GPU systems become ubiquitous, but most of current parallel spatial interpolation algorithms exploit only one type of computation units to speedup the calculation and thus it results in parallel resources wasted. To address this problem, a collaborative parallel thin plate spline interpolation algorithm is proposed in this paper to accelerate DEM generation from massive LiDAR point clouds. In this collaborative parallel algorithm, the input point clouds are firstly decomposed into a collection of discrete blocks and encapsulated as general task objects to shield the heterogeneous execution models of different processing units. And then a special scheduling algorithm, named Greedy-SET, is also proposed to achieve better load balance based on the computing capabilities of CPU and GPU. Experimental results demonstrate that the proposed collaborative parallel algorithm can achieve the highest speedup times of approximately 19.6. The performance improvement ratios compared with pure CPU and GPU parallel algorithms are 54% and 44% respectively.
Keywords:
本文献已被 CNKI 等数据库收录!
点击此处可从《武汉大学学报(信息科学版)》浏览原始摘要信息
点击此处可从《武汉大学学报(信息科学版)》下载免费的PDF全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号