首页 | 本学科首页   官方微博 | 高级检索  
     检索      

基于最优路径策略方法快速计算字符串编辑距离
引用本文:王远超,安俊秀,程芃森,王鹏.基于最优路径策略方法快速计算字符串编辑距离[J].成都信息工程学院学报,2014(6):616-624.
作者姓名:王远超  安俊秀  程芃森  王鹏
作者单位:成都信息工程学院软件工程学院,四川成都610225
基金项目:国家社会科学基金资助项目(12XSH019); 四川网络文化研究中心课题资助项目(WLWH13-01)
摘    要:传统编辑距离算法采用动态规划方法用一个维度大小分别为源字符串长度和目标字符串长度的二维数组保存计算过程中求得编辑距离值。这种传统求解方式在时间效率和空间效率上开销较大,限制了编辑距离算法在长字符串中地应用。针对传统方法存在的问题,经深入研究编辑距离的求解过程,发现在某个关键区域内存在一条最优路径,通过确定最优路径所在关键区域可以快速地求解两字符串之间的编辑距离值。实验表明,方法在计算两字符串之间的编辑距离与传统方法相比可以降低问题的求解规模,提高算法的时间效率和空间效率。所描述的方法同样适用于图论中使用动态规划方法求解一般问题地应用,比如最优分配问题和背包问题等。

关 键 词:计算机软件与理论  大数据技术  编辑距离  相似度  最优路径  关键区域  动态规划

Fast Computing String Edit Distance based on the Strategy of Optimal Path
WANG Yuan-chao,AN Jun-xiu,CHENG Peng-sen,WANG Peng.Fast Computing String Edit Distance based on the Strategy of Optimal Path[J].Journal of Chengdu University of Information Technology,2014(6):616-624.
Authors:WANG Yuan-chao  AN Jun-xiu  CHENG Peng-sen  WANG Peng
Institution:(College of Software Engineering,Chengdu University of Information Technology, Chengdu 610225, China)
Abstract:The traditional edit distance algorithm use a two-dimensional array to buffer the edit distance values computed in each iterated process. The computing process of the traditional method is traversing the whole array,whose dimension sizes are the length of the source string and target string respectively,in a dynamic programming style. However,as described above,the traditional way of solving problems is processing time and space consuming,which limits the application of edit distance algorithm in long strings. For the above performance problems in conventional method,and based on the characteristics of the solving process in the traditional way,an improved algorithm to compute the edit distance between two strings is proposed. That is,there exists critical area which contains an optimal path by which the ultimate edit distance can be obtained. By determining the critical area it can be faster than the traditional method to compute the edit distance between two strings. The experiment shows that the critical area method based on the strategy of optimal path way can lower the problem size,improve the efficiency of the computing process and reduce the time spent as well as save the space usage in contrast with the traditional method. Furthermore,the idea of this paper is also suitable for the general applications to solve the specific problems,such as Optimum Allocation problems and Knapsack problems etc.
Keywords:computer software and theory  big data technology  edit distance  similarity  optimal path  critical area  dynamic programming
本文献已被 CNKI 维普 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号