A brave new (virtual) world: distributed searches, relevance scoring and facets

Authors:	Todd King, Tom Narock, Raymond Walker, Jan Merka, Steven Joy

Institution:	(1) Institute of Geophysics and Planetary Physics, University of California, Los Angeles, CA, USA; (2) Goddard Earth Science and Technology Center, University of Maryland Baltimore County, Baltimore, MD, USA; (3) NASA/Goddard Space Flight Center, Heliospheric Physics Laboratory, Greenbelt, MD, USA

Abstract:	Our ability to deal with complex systems has improved through information system research, which includes improved modeling (of both data and systems), the use of semantics and advances in distributed computing. The past decade has seen an explosion in the amount and variety of geosciences data and the emergence of true open data repositories through which scientists can freely access these data. The data are held in thousands of repositories located around the world. Virtual observatories have been created to address the challenge of helping scientists search those repositories to find and access the required data. This challenge is being addressed by using technologies such as the Internet (with ample connectivity and bandwidth), the Web, cheap computing power, cheap storage and standards for critical components. Many scientific disciplines are developing virtual observatories, yet some of the most compelling science questions cross multiple domains. While semantics can provide cross-domain reasoning, the first step in answering a question is often determining which available resources may be relevant to a topic. The topic can be expressed as simple phrases or word sequences. Using a common relevance scoring method at all locations can enable a federated search across loosely coupled providers. The results can then be organized into facets to aid the user in selecting the most promising resources with which to pursue the scientific investigation. We describe an approach to developing and deploying relevance scoring methods and faceted results in this brave new (virtual) world. We have found that a scoring method which considers both the presence of terms and the proximity of these terms relative to the order of the terms in the query improves the assessment of relevance. We call this Term Presence-Proximity (TPP) scoring and describe a method for calculating a normalized score. TPP scoring compares favorably with other scoring approaches.
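The abstract does not give the TPP formula, so the following is only a minimal sketch of the general idea it describes: a normalized score that rewards both the presence of query terms and their proximity in query order, applied identically at every provider so that results can be merged and grouped into facets. The equal weighting, the 1/gap proximity measure, the function names (`tpp_score`, `federated_search`) and the record fields (`text`, `facet`) are all assumptions made for illustration, not the authors' method.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tpp_score(query, document, w_presence=0.5):
    """Illustrative presence-plus-proximity score in [0, 1].

    Assumption: this is NOT the published TPP formula, only a sketch
    of a score that combines term presence with ordered proximity.
    """
    q = tokenize(query)
    d = tokenize(document)
    if not q or not d:
        return 0.0

    # Record every position at which each document token occurs.
    positions = {}
    for i, tok in enumerate(d):
        positions.setdefault(tok, []).append(i)

    # Presence: fraction of distinct query terms found in the document.
    distinct = set(q)
    presence = sum(t in positions for t in distinct) / len(distinct)

    # Proximity: for each adjacent query pair (a, b), find the smallest
    # gap where b follows a in the document; 1/gap is 1.0 when adjacent.
    pairs = list(zip(q, q[1:]))
    if not pairs:
        proximity = presence  # single-term query: presence only
    else:
        total = 0.0
        for a, b in pairs:
            gaps = [pb - pa
                    for pa in positions.get(a, [])
                    for pb in positions.get(b, [])
                    if pb > pa]
            if gaps:
                total += 1.0 / min(gaps)
        proximity = total / len(pairs)

    return w_presence * presence + (1.0 - w_presence) * proximity


def federated_search(query, providers):
    """Score records from every provider with the same method, then
    merge them into one ranked list and count hits per facet.

    `providers` maps a provider name to a list of records, each a dict
    with hypothetical "text" and "facet" keys.
    """
    hits = []
    for name, records in providers.items():
        for rec in records:
            score = tpp_score(query, rec["text"])
            if score > 0:
                hits.append({"provider": name, "score": score, **rec})
    hits.sort(key=lambda h: h["score"], reverse=True)
    facets = Counter(h["facet"] for h in hits)
    return hits, facets
```

Because every provider computes the score the same way, the merged list can be sorted directly without rescaling; with differing scoring methods the scores would not be comparable across providers.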

Keywords:	Relevance scoring, facets, virtual observatory, search
This article is indexed in SpringerLink and other databases.
