DCAD: a Dual Clustering Algorithm for Distributed Spatial Databases
Authors: ZHOU Jiaogen, GUAN Jihong, LI Pingxiang
Affiliation: State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China.
Funding: Funded by the National 973 Program of China (No. 2003CB415205), the National Natural Science Foundation of China (No. 40523005, No. 60573183, No. 60373019), and the Open Research Fund Program of LIESMARS (No. WKL(04)0303).
Abstract: Spatial objects have two types of attributes, geometrical and non-geometrical, which belong to two different attribute domains (the geometrical and non-geometrical domains). Although scattered in the geometrical domain, spatial objects may be similar to each other in the non-geometrical domain. Most existing clustering algorithms group spatial datasets into compact regions in the geometrical domain without considering the non-geometrical domain. However, many application scenarios require clustering results in which a cluster has not only high proximity in the geometrical domain, but also high similarity in the non-geometrical domain. This means that constraints from both domains are imposed on the clustering goal simultaneously. Such a clustering problem is called dual clustering. As distributed clustering applications become increasingly popular, it is necessary to tackle the dual clustering problem in distributed databases. The DCAD algorithm is proposed to solve this problem. DCAD consists of two levels of clustering: local clustering and global clustering. First, clustering is conducted at each local site with a local clustering algorithm, and the features of the local clusters are extracted. Second, the local features from each site are sent to a central site, where the global clustering is obtained based on those features. Experiments on both artificial and real spatial datasets show that DCAD is effective and efficient.
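The two-level scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' DCAD implementation: the abstract does not name the local clustering algorithm or the features that are extracted, so the sketch assumes a simple linkage rule (two objects are linked when they are within `eps_geo` in the geometrical domain and within `eps_attr` in a single numeric non-geometrical attribute) and assumes that each local cluster is summarized by its centroid, mean attribute value, and size. All names (`Feature`, `local_clustering`, `global_clustering`, `eps_geo`, `eps_attr`) are hypothetical.

```python
from dataclasses import dataclass
from math import dist  # Euclidean distance, Python 3.8+


@dataclass
class Feature:
    """Assumed summary of one local cluster (not specified in the abstract)."""
    site: int
    centroid: tuple
    attr: float
    size: int


def connected_groups(n, linked):
    """Group indices 0..n-1 into connected components under linked(i, j)."""
    labels = [-1] * n
    groups = []
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = len(groups)
        stack, members = [start], []
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(n):
                if labels[j] == -1 and linked(i, j):
                    labels[j] = len(groups)
                    stack.append(j)
        groups.append(sorted(members))
    return groups


def local_clustering(site, points, eps_geo, eps_attr):
    """Level 1: cluster one site's objects, given as ((x, y), attr) pairs.

    Assumed dual constraint: objects are linked only if they are close in
    BOTH the geometrical and the non-geometrical domain.
    """
    def linked(i, j):
        (p, a), (q, b) = points[i], points[j]
        return dist(p, q) <= eps_geo and abs(a - b) <= eps_attr

    features = []
    for members in connected_groups(len(points), linked):
        k = len(members)
        cx = sum(points[i][0][0] for i in members) / k
        cy = sum(points[i][0][1] for i in members) / k
        attr = sum(points[i][1] for i in members) / k
        features.append(Feature(site, (cx, cy), attr, k))
    return features


def global_clustering(features, eps_geo, eps_attr):
    """Level 2: at the central site, merge local clusters using features only."""
    def linked(i, j):
        f, g = features[i], features[j]
        return (dist(f.centroid, g.centroid) <= eps_geo
                and abs(f.attr - g.attr) <= eps_attr)

    groups = connected_groups(len(features), linked)
    return [[features[i] for i in group] for group in groups]


# Two hypothetical sites; only the extracted features reach the central site.
site_a = [((0.0, 0.0), 1.0), ((1.0, 0.0), 1.0), ((0.0, 1.0), 9.0)]
site_b = [((2.0, 0.0), 1.0), ((3.0, 0.0), 1.0)]
features = (local_clustering(0, site_a, eps_geo=1.5, eps_attr=0.5)
            + local_clustering(1, site_b, eps_geo=1.5, eps_attr=0.5))
global_clusters = global_clustering(features, eps_geo=2.5, eps_attr=0.5)
```

In this toy run, the object at (0.0, 1.0) is geometrically close to its neighbours on site 0 but differs strongly in the non-geometrical attribute, so it stays in its own cluster, while the two similar local clusters from different sites are merged at the central site. Only the compact features, not the raw objects, are transmitted, which is the communication pattern the abstract describes.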

Keywords (Chinese): distributed spatial information database; dual clustering algorithm; DCAD; knowledge discovery; data mining
Article ID: 1009-5020(2007)02-137-08
Received: 19 March 2007
Revised: 19 March 2007

ZHOU Jiaogen, GUAN Jihong, LI Pingxiang. DCAD: a dual clustering algorithm for distributed spatial databases[J]. Geo-Spatial Information Science, 2007, 10(2): 137-144.
Keywords: distributed clustering; dual clustering; distributed spatial database
This article is indexed in CNKI, VIP, SpringerLink, and other databases.