基于最优路径策略方法快速计算字符串编辑距离 Fast Computing String Edit Distance based on the Strategy of Optimal Path期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

按检索

基于最优路径策略方法快速计算字符串编辑距离

引用本文：	王远超,安俊秀,程芃森,王鹏.基于最优路径策略方法快速计算字符串编辑距离[J].成都信息工程学院学报,2014(6):616-624.

作者姓名：	王远超安俊秀程芃森王鹏

作者单位：	成都信息工程学院软件工程学院,四川成都610225

基金项目：	国家社会科学基金资助项目（12XSH019）; 四川网络文化研究中心课题资助项目（WLWH13-01）

摘要：	传统编辑距离算法采用动态规划方法用一个维度大小分别为源字符串长度和目标字符串长度的二维数组保存计算过程中求得编辑距离值。这种传统求解方式在时间效率和空间效率上开销较大,限制了编辑距离算法在长字符串中地应用。针对传统方法存在的问题,经深入研究编辑距离的求解过程,发现在某个关键区域内存在一条最优路径,通过确定最优路径所在关键区域可以快速地求解两字符串之间的编辑距离值。实验表明,方法在计算两字符串之间的编辑距离与传统方法相比可以降低问题的求解规模,提高算法的时间效率和空间效率。所描述的方法同样适用于图论中使用动态规划方法求解一般问题地应用,比如最优分配问题和背包问题等。
关键词：	计算机软件与理论大数据技术编辑距离相似度最优路径关键区域动态规划
Fast Computing String Edit Distance based on the Strategy of Optimal Path

WANG Yuan-chao,AN Jun-xiu,CHENG Peng-sen,WANG Peng.Fast Computing String Edit Distance based on the Strategy of Optimal Path[J].Journal of Chengdu University of Information Technology,2014(6):616-624.

Authors:	WANG Yuan-chao AN Jun-xiu CHENG Peng-sen WANG Peng

Institution:	(College of Software Engineering,Chengdu University of Information Technology, Chengdu 610225, China)

Abstract:	The traditional edit distance algorithm use a two-dimensional array to buffer the edit distance values computed in each iterated process. The computing process of the traditional method is traversing the whole array,whose dimension sizes are the length of the source string and target string respectively,in a dynamic programming style. However,as described above,the traditional way of solving problems is processing time and space consuming,which limits the application of edit distance algorithm in long strings. For the above performance problems in conventional method,and based on the characteristics of the solving process in the traditional way,an improved algorithm to compute the edit distance between two strings is proposed. That is,there exists critical area which contains an optimal path by which the ultimate edit distance can be obtained. By determining the critical area it can be faster than the traditional method to compute the edit distance between two strings. The experiment shows that the critical area method based on the strategy of optimal path way can lower the problem size,improve the efficiency of the computing process and reduce the time spent as well as save the space usage in contrast with the traditional method. Furthermore,the idea of this paper is also suitable for the general applications to solve the specific problems,such as Optimum Allocation problems and Knapsack problems etc.

Keywords:	computer software and theory big data technology edit distance similarity optimal path critical area dynamic programming
本文献已被 CNKI 维普等数据库收录！

设为首页 | 免责声明 | 关于勤云 | 加入收藏