
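A Chinese address matching algorithm based on natural language understanding
Cite this article: SONG Zihui. A Chinese address matching algorithm based on natural language understanding [J]. 遥感学报 (Journal of Remote Sensing), 2013, 17(4): 788-801.
Author: SONG Zihui
Affiliation: Institute of Remote Sensing Applications, Chinese Academy of Sciences
Funding: National High-Tech Research and Development Program of China (863 Program), grant no. 2012AA12A401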
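Abstract: Address matching is a core technology for location-based services and has broad application prospects. After analyzing the three main existing approaches to Chinese address matching (element-hierarchy matching, full-text retrieval, and regular expressions), this paper proposes a Chinese address matching algorithm based on natural language understanding. The new algorithm introduces a spatial-relation address model to handle the abstraction of Chinese addresses, and a logical model of the address database to represent the spatial knowledge carried by address information. Its complete workflow comprises five stages: preprocessing, address parsing, address element standardization, reasoning-based matching, and match registration. The paper concentrates on the two key stages, address parsing and reasoning-based matching, which draw respectively on Chinese word segmentation and semantic reasoning from natural language understanding to process Chinese addresses written as unstructured natural language. This combines natural language understanding with address matching and yields a complete matching algorithm. To validate the algorithm, a "Chinese Address Intelligent Matching Experimental System" was developed and used to match 1000 resident address records from the population database of Puyang City, Henan Province; the matching rate reached 95% and the accuracy exceeded 93%.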

Keywords: natural language understanding, address matching, address element, address parsing, hidden Markov model
Received: 2012/5/16
Revised: 2012/6/18

Address matching algorithm based on Chinese natural language understanding
SONG Zihui. Address matching algorithm based on Chinese natural language understanding [J]. Journal of Remote Sensing, 2013, 17(4): 788-801.
Authors: SONG Zihui
Institution: State Key Laboratory of Remote Sensing Science, Jointly Sponsored by the Institute of Remote Sensing and Digital Earth of Chinese Academy of Sciences and Beijing Normal University, Beijing 100101, China
Abstract: Address matching is a core technology for location-based services and has broad application prospects. This paper reviews the three main existing address matching approaches: the level-based matching algorithm, the full-text search algorithm, and the regular expression algorithm. It then proposes an address matching algorithm based on Chinese natural language understanding. The complete process of the new algorithm has five parts: preprocessing, address parsing, address element standardization, reasoning-based address matching, and matching registration. The paper focuses on the two most important parts, address parsing and reasoning-based matching. Applying the principles of Chinese word segmentation and semantic reasoning from natural language understanding, the new algorithm processes Chinese addresses written in unstructured form and so combines natural language understanding with address matching. To validate the algorithm, an address matching experimental system was developed. A matching experiment on 1000 resident addresses from Puyang City, Henan Province, shows a matching rate of 95% or more and an accuracy rate above 93%.
Keywords: natural language understanding, address matching, address element, address parsing, Hidden Markov Model
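
To make the five-stage workflow named in the abstract concrete, the following is a minimal Python sketch, not the paper's implementation. Everything in it is an assumption made for illustration: the gazetteer, alias table, reference address and its ID are invented toy data, and the function names are hypothetical. Address parsing is done here with simple forward maximum matching rather than the hidden Markov model listed in the keywords, and the "reasoning" step is reduced to one rule: a reference address matches when the parsed elements appear in it in order and the finest-level element agrees, so omitted upper levels such as the province are tolerated.

```python
import re
from typing import Optional

# Hypothetical toy data for illustration only; none of it comes from the paper.
GAZETTEER = {
    "河南省": "province", "河南": "province",
    "濮阳市": "city", "濮阳": "city",
    "华龙区": "district",
    "中原路": "road",
}
ALIASES = {"河南": "河南省", "濮阳": "濮阳市"}  # short form -> standard name
ADDRESS_DB = {
    ("河南省", "濮阳市", "华龙区", "中原路", "12号"): "ADDR-0001",
}
HOUSE_NUMBER = re.compile(r"\d+号")
MAX_LEN = max(len(name) for name in GAZETTEER)


def preprocess(raw: str) -> str:
    """Stage 1: drop whitespace and convert full-width digits to ASCII."""
    text = re.sub(r"\s+", "", raw)
    return text.translate(str.maketrans("０１２３４５６７８９", "0123456789"))


def parse(text: str) -> list[tuple[str, str]]:
    """Stage 2: forward maximum matching against the gazetteer."""
    elements, i = [], 0
    while i < len(text):
        number = HOUSE_NUMBER.match(text, i)
        if number:
            elements.append((number.group(), "number"))
            i = number.end()
            continue
        # Try the longest candidate first, then shorter ones.
        for size in range(min(MAX_LEN, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in GAZETTEER:
                elements.append((piece, GAZETTEER[piece]))
                i += size
                break
        else:
            i += 1  # no entry covers this character; skip it
    return elements


def standardize(elements: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Stage 3: replace short forms with their standard names."""
    return [(ALIASES.get(name, name), kind) for name, kind in elements]


def match(elements: list[tuple[str, str]]) -> Optional[str]:
    """Stage 4: simplified rule-based matching against the reference table."""
    names = [name for name, _ in elements]
    if not names:
        return None
    for reference, address_id in ADDRESS_DB.items():
        if names[-1] != reference[-1]:
            continue  # the finest-level element must agree
        levels = iter(reference)
        # True when names is an in-order subsequence of the reference.
        if all(name in levels for name in names):
            return address_id
    return None


def match_address(raw: str, log: list[dict]) -> Optional[str]:
    """Run all stages; stage 5 records the outcome in the log."""
    address_id = match(standardize(parse(preprocess(raw))))
    log.append({"input": raw, "matched": address_id is not None, "id": address_id})
    return address_id


log: list[dict] = []
print(match_address("河南濮阳市 华龙区中原路１２号", log))  # ADDR-0001
print(match_address("濮阳华龙区中原路12号", log))          # ADDR-0001 (province omitted)
print(match_address("濮阳市中原路99号", log))              # None (no such house number)
```

Keeping each stage as a separate function mirrors the staged workflow in the abstract and would let a statistical segmenter or a richer inference step replace `parse` or `match` without touching the other stages.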