Fine-Grained Parallel Optimization of Large-Scale Data for PMVS Algorithm

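Cite this article:	Liu Jinshuo, Li Yangmei, Jiang Zhuangyi, Deng Juan, Sui Haigang, Pan Jeff. Fine-Grained Parallel Optimization of Large-Scale Data for PMVS Algorithm[J]. Geomatics and Information Science of Wuhan University, 2019, 44(4): 608-616.

Authors:	Liu Jinshuo (刘金硕), Li Yangmei (李扬眉), Jiang Zhuangyi (江庄毅), Deng Juan (邓娟), Sui Haigang (眭海刚), Pan Jeff

Affiliation:	1. School of Cyber Science and Engineering, Wuhan University, Wuhan, Hubei 430072

Funding:	National Natural Science Foundation of China (61672393); National Natural Science Foundation of China (U1536204)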

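Summary:	The patch-based multi-view stereo (PMVS) algorithm is widely used in fields such as digital cities because of its good three-dimensional reconstruction results, but it executes inefficiently on large-scale computations. To address this, a fine-grained parallel optimization method is proposed that optimizes three aspects: task partitioning and load balancing, main system memory and GPU memory, and communication overhead. In addition, GPU and multi-threaded parallel versions of the patch-based PMVS feature extraction are designed, achieving CPUs_GPUs multi-granularity cooperative parallelism. Experimental results show that the CPU multi-threading strategy achieves a 4x speedup, the compute unified device architecture (CUDA) parallel strategy achieves a speedup of up to 34x, and the proposed strategy delivers a further 30% performance improvement over the CUDA parallel strategy. The method can be used for fast scheduling of computing resources in big-data processing in other fields.
Keywords:	CPUs_GPUs multi-granularity parallelism; GPU parallel optimization; CUDA; load balancing; storage and communication optimization; image processing
Received:	2017-04-19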
Fine-Grained Parallel Optimization of Large-Scale Data for PMVS Algorithm

Affiliation:	1. School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China; 2. School of Computer Science, Technical University of Munich, Munich 85748, Germany; 3. School of Computer Science, Wuhan University, Wuhan 430072, China; 4. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China; 5. Department of Computing Science, University of Aberdeen, Aberdeen AB24 3FX, UK

Abstract:	This paper addresses fine-grained parallel optimization for large-scale data. The patch-based multi-view stereo (PMVS) algorithm is widely applied in digital cities and other fields because of its good three-dimensional reconstruction results; however, it executes inefficiently on large-scale computations. To address this limitation, we propose a fine-grained parallel optimization method covering three aspects: task allocation and load balancing, strategies for main system memory and GPU memory, and optimization of communication. On the CPU side, we implement multi-threading with the pthreads library to take full advantage of the computing power of multi-core CPUs. On the GPU side, we use the CUDA framework and optimize thread organization and memory access. In addition, we adapt a memory pool model and a pipelining model to improve bandwidth utilization. The memory pool model reduces the impact of bus data transfers on CPUs_GPUs while they wait for resources; the pipelining model hides the communication time the CPU spends reading data from memory. We use the Harris-DoG feature extraction of the PMVS algorithm on image sequences as the example to verify the optimization strategies. The experiments show that the CPU multi-threading strategy achieves a 4x speedup, the CUDA-based parallel strategy achieves a speedup of up to 34x, and our strategy improves performance by a further 30% over the CUDA-based parallel strategy. In the future, the optimization strategy can be applied to fast scheduling of computing resources in big-data processing in other domains.
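
The memory pool and pipelining ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses Python threads as a stand-in for pthreads and CUDA transfers, and the names `run_pipeline`, `load`, `compute`, and `pool_size` are hypothetical. A loader thread fills buffers drawn from a small fixed pool while the main thread processes the previous buffer, so loading overlaps with computation and no buffer is allocated per item.

```python
import queue
import threading


def run_pipeline(items, load, compute, pool_size=2):
    """Overlap load(item) with compute(buffer) using a fixed pool of buffers.

    Assumptions (not from the paper): load returns a bytes-like object,
    and compute does not keep a reference to the buffer after returning.
    """
    free = queue.Queue()   # memory pool: buffers available for reuse
    ready = queue.Queue()  # filled buffers waiting for the compute stage
    for _ in range(pool_size):
        free.put(bytearray())  # allocated once, reused for every item

    def loader():
        for item in items:
            buf = free.get()     # blocks until a buffer has been returned
            buf[:] = load(item)  # refill in place instead of allocating
            ready.put(buf)
        ready.put(None)          # sentinel: no more work

    worker = threading.Thread(target=loader, daemon=True)
    worker.start()

    results = []
    while True:
        buf = ready.get()
        if buf is None:
            break
        results.append(compute(buf))  # runs while the loader fills the next buffer
        free.put(buf)                 # hand the buffer back to the pool
    worker.join()
    return results


if __name__ == "__main__":
    # Toy stand-ins: "loading" copies the bytes, "computing" measures their length.
    print(run_pipeline([b"ab", b"cde"], load=bytes, compute=len))
```

With `pool_size=2` the sketch behaves like double buffering: one buffer is being filled while the other is being processed. A real CPU and GPU version would presumably replace the buffers with pinned host memory and the loader with asynchronous copies, which this sketch does not attempt.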

This article is indexed in CNKI and other databases.