Similar Articles

20 similar articles found (search time: 968 ms)
1.
ABSTRACT

The analysis of geographically referenced data, specifically point data, is predicated on the accurate geocoding of those data. Geocoding refers to the process in which geographically referenced data (addresses, for example) are placed on a map. This process may lead to issues with positional accuracy or the inability to geocode an address. In this paper, we conduct an international investigation into the impact of the (in)ability to geocode an address on the resulting spatial pattern. We use a variety of point data sets of crime events (varying numbers of events and types of crime), a variety of areal units of analysis (varying the number and size of areal units), from a variety of countries (varying underlying administrative systems), and a locally-based spatial point pattern test to find the geocoding match rates needed to maintain the spatial patterns of the original data when addresses are missing at random. We find that the level of geocoding success depends on the number of points and the number of areal units under analysis, but we generally show that the necessary levels of geocoding success are lower than found in previous research. This finding is consistent across different national contexts.

2.
Geocoding is an uncertain process that associates an address or a place name with geographic coordinates. Traditionally, geocoding is performed locally on a stand-alone computer, with the geocoding tools usually bundled in GIS software packages. The use of such tools requires skillful operators who know about the issues of geocoding, that is, reference databases and complicated geocoding interpolation techniques. These days, with the advancement of Internet and Web services technologies, online geocoding makes this functionality easily available to Internet users, who are thus often unaware of such issues. With an increasing number of online geocoding services, which differ in their reference databases, their geocoding algorithms, and their strategies for dealing with inputs and outputs, it is crucial for service requestors to assess the quality of the geocoded results of each service before choosing one for their applications. This is primarily because any errors associated with the geocoded addresses will be propagated to subsequent decisions, activities, modeling, and analysis. This article examines the quality of five online geocoding services: Geocoder.us, Google, MapPoint, MapQuest, and Yahoo!. The quality of each geocoding service is evaluated with three metrics: match rate, positional accuracy, and similarity. A set of addresses from the US Environmental Protection Agency (EPA) database was used as a baseline. The results were statistically analyzed with respect to different location characteristics. The outcome of this study reveals the differences among the online geocoding services in the quality of their geocoding results, and it can be used as a general guideline for selecting a suitable service that matches an application's needs.
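Two of the metrics above, match rate and positional accuracy, are straightforward to compute once a baseline is available. A minimal sketch, assuming results are keyed by address string and unmatched addresses are recorded as None (the data layout and function names are illustrative, not from the article):

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def geocoding_quality(results, baseline):
    """results: address -> (lat, lon) or None if the service failed to match.
    baseline: address -> (lat, lon) reference coordinates.
    Returns (match rate, mean positional error in metres over matched addresses)."""
    matched = {a: p for a, p in results.items() if p is not None}
    match_rate = len(matched) / len(baseline)
    errors = [haversine_m(*p, *baseline[a]) for a, p in matched.items()]
    mean_err = sum(errors) / len(errors) if errors else float("nan")
    return match_rate, mean_err
```

The similarity metric used in the article is a separate, service-specific comparison and is not sketched here.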

3.
Online geocoding APIs are used to solve the problem of rapidly geocoding massive numbers of Chinese addresses. On this basis, simple rules are used to clean and label the geocoded results, and finally a classification optimization model based on hierarchical clustering and random forests classifies and optimizes the results returned by multiple platforms. The model was trained and validated on theft-case addresses from Guangzhou. The results show that, compared with the raw results, the model-optimized geocodes have a smaller overall positional error. The 高德 (Amap) geocoding service offered the best quality, yet the mean error of its training-sample geocodes was still as high as 590.43 m; after model optimization this fell to 173.73 m, and the mean error of the validation sample dropped from 554.88 m (Amap) to 180.04 m, a reduction of 67.49%, with 90.08% of Amap's anomalous geocodes cleaned and optimized. The optimization effect was similar for the training and validation samples, for cases with different address types, and for cases in urban versus suburban areas, indicating that the model is reasonably general. The model can conveniently and quickly convert massive socioeconomic information into spatial data and improve geocoding accuracy, providing better data support for research on geographic big data.
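The paper's model combines hierarchical clustering with a random-forest classifier; as a loose illustration of the cleaning idea only, the sketch below substitutes a much simpler rule, flagging a platform's result as anomalous when it falls too far from the coordinate-wise median of all platforms' results for the same address (the function name and the 500 m tolerance are invented for the example, not taken from the paper):

```python
def clean_multiplatform(points, tol_m=500.0):
    """points: (x, y) projected coordinates (metres) returned for one address
    by several geocoding platforms. Results farther than tol_m from the
    coordinate-wise median are flagged as anomalous and dropped; the median
    of the remaining results is returned as the cleaned location.
    NOTE: a stand-in for the paper's clustering + random-forest model."""
    def med(v):
        v = sorted(v)
        m = len(v) // 2
        return v[m] if len(v) % 2 else (v[m - 1] + v[m]) / 2
    cx, cy = med([p[0] for p in points]), med([p[1] for p in points])
    kept = [p for p in points
            if ((p[0] - cx) ** 2 + (p[1] - cy) ** 2) ** 0.5 <= tol_m]
    return med([p[0] for p in kept]), med([p[1] for p in kept])
```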

4.
Abstract

Recent developments in theory and computer software mean that it is now relatively straightforward to evaluate how attribute errors are propagated through quantitative spatial models in GIS. A major problem, however, is to estimate the errors associated with the inputs to these spatial models. A first approach is to use the root mean square error, but in many cases it is better to estimate the errors from the degree of spatial variation and the method used for mapping. It is essential to decide at an early stage whether one should use a discrete model of spatial variation (DMSV—homogeneous areas, abrupt boundaries), a continuous model (CMSV—a continuously varying regionalized variable field) or a mixture of both (MMSV—mixed model of spatial variation). Maps of predictions and prediction error standard deviations are different in all three cases, and which model of spatial variation is used is crucial for error estimation. The choice of model has received insufficient study, but it can be based on prior information about the kinds of spatial processes and patterns that are present, or on validation results. When the choice is undetermined it is sensible to adopt the MMSV in order to bypass the rigidity of the DMSV and CMSV. These issues are explored and illustrated using data on the mean highest groundwater level in a polder area in the Netherlands.

5.
In many applications of Geographical Information Systems (GIS) a common task is the conversion of addresses into grid coordinates. In many countries this is usually accomplished using address range TIGER-type files in conjunction with geocoding packages within a GIS. Improvements in GIS functionality and the storage capacity of large databases mean that the spatial investigation of data at the individual address level is now commonly performed. This process relies on the accuracy of the geocoding mechanism and this paper examines this accuracy in relation to cadastral records and census tracts. Results from a study of over 20 000 addresses in Sydney, Australia, using a TIGER-type geocoding process suggest that 5-7.5% (depending on geocoding method) of addresses may be misallocated to census tracts, and more than 50% may be given coordinates within the land parcel of a different property.

6.
ABSTRACT

Six routing algorithms, describing how flow (and water-borne material) will be routed over Digital Elevation Models, are described and compared. The performance of these algorithms is determined based on both the calculation of the contributing area and the prediction of ephemeral gullies. Three groups of routing algorithms could be identified. Both from a statistical and a spatial viewpoint these groups produce significantly different results, with a major distinction between single flow and multiple flow algorithms. Single flow algorithms cannot accommodate divergent flow and are very sensitive to small errors. Therefore, they are not acceptable for hillslopes. The flux decomposition algorithm, proposed here, seems to be preferable to other multiple flow algorithms as it is mathematically straightforward, needs only up to two neighbours and yields more realistic results for drainage lines. The implications of the routing algorithms for the prediction of ephemeral gullies seem to be somewhat counterintuitive: the single flow algorithms that, at first sight, seem to mimic the process of overland flow do not yield optimal prediction results.
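For contrast with the multiple-flow flux decomposition algorithm the abstract proposes, a minimal sketch of the classic single-flow D8 rule (each cell drains entirely to its steepest-descent neighbour) shows how contributing area is accumulated over a DEM. The grid layout and function name are illustrative; this is the baseline the abstract criticizes, not the authors' algorithm:

```python
import math

def d8_contributing_area(dem):
    """dem: 2-D list of elevations. Returns the contributing area of each cell
    (in cell counts, including the cell itself) under the single-flow D8 rule:
    a cell sends all of its accumulated flow to the steepest-descent neighbour.
    Cells are visited from highest to lowest so upslope totals are final."""
    rows, cols = len(dem), len(dem[0])
    acc = [[1.0] * cols for _ in range(rows)]
    order = sorted(((dem[r][c], r, c) for r in range(rows) for c in range(cols)),
                   reverse=True)
    for z, r, c in order:
        best, target = 0.0, None
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # slope = drop / horizontal distance (diagonals are sqrt(2))
                    slope = (z - dem[nr][nc]) / math.hypot(dr, dc)
                    if slope > best:
                        best, target = slope, (nr, nc)
        if target:  # pits and flats keep their flow
            acc[target[0]][target[1]] += acc[r][c]
    return acc
```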

7.
Years of operation of urban grid-based management systems have accumulated large volumes of historical event data, and these events show pronounced spatial clustering. Determining the spatial distribution of events and measuring the degree of clustering can provide important decision support for the rational allocation and partitioning of urban management resources. This paper applies spatial point pattern analysis to the two main categories of events (road-occupation by street vendors and garbage disposal) recorded from January to August 2011 in the grid-based urban management system of Jianghan District, Wuhan. We find that the "hotspot" areas of road-occupation events generally declined over January-August, while those of garbage-disposal events generally increased; both event types show clear spatial clustering, with a characteristic spatial scale of about 1000 m. The study shows that spatial point pattern analysis can offer urban managers an intuitive visual tool for examining the spatial clustering of urban events and a quantitative measure of clustering intensity, and can lay the foundation for further statistical modeling.
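A characteristic clustering scale such as the roughly 1000 m reported above is typically read off a distance-based statistic like Ripley's K, which under complete spatial randomness equals pi*r^2. A naive O(n^2) sketch without edge correction (not the authors' implementation; point layout and names are illustrative):

```python
def ripley_k(points, r, area):
    """Naive Ripley's K at distance r for points observed in a region of the
    given area: the mean number of further points within r of a point,
    divided by the point density. No edge correction is applied."""
    n = len(points)
    lam = n / area  # estimated intensity (points per unit area)
    count = 0
    for i in range(n):
        for j in range(n):
            if i != j:
                dx = points[i][0] - points[j][0]
                dy = points[i][1] - points[j][1]
                if (dx * dx + dy * dy) ** 0.5 <= r:
                    count += 1
    return count / (lam * n)
```

Values well above pi*r^2 at a given r indicate clustering at that scale.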

8.
A Review of Spatial Indexing Methods in GIS
One of the main tasks of a geographic information system is to retrieve spatial data efficiently and to respond quickly to online queries from different users. Traditional indexing methods can only handle one-dimensional queries and cannot meet the requirements of GIS. This paper introduces, analyzes, and compares three representative classes of spatial indexing methods in GIS: indexes based on point-region partitioning, indexes based on areal-region partitioning, and address-coding indexes for spatial entities.

9.
Abstract

This paper offers a teaching strategy for incorporating TIGER/Line files into introductory GIS courses where IDRISI and OSUMAP are the primary software packages. TIGER/Line files present a valuable database for teaching GIS. The TIGER data structure aids in teaching concepts related to topological data structures, geocoding and address matching, and the files themselves provide an excellent database for laboratory exercises that incorporate census information along with environmental and natural resource data. Lack of support by commonly-used educational software packages to import TIGER/Line files directly has been a serious impediment in an instructional context. This paper presents software developed to convert TIGER/Line files into simple polygon vector files acceptable by IDRISI and OSUMAP for three alternative census geographic units (tracts, block groups and blocks). The resulting vector files are plotted for visual examination and graphical output. The vector files generated can also be imported into other GIS or computer mapping software packages.

10.
Uncertainties and errors associated with aggregation have long been recognized in the study of spatial problems. In facility location modeling, while much has been done to examine the aggregation of large datasets of discrete points, errors and uncertainties involved in aggregating continuous spatial units are not well understood. This study focuses on the effects of aggregating continuous spatial units into discrete points within the context of the location set covering problem. We propose new measures to understand and quantify errors associated with a continuous aggregation scheme. In a real-world application, the proposed methods can be used to suggest an appropriate aggregation scheme before the application of the location model. We demonstrate the concepts developed here with an empirical study of siting emergency warning sirens in the city of Dublin, OH.

11.
Gambling using electronic gaming machines (EGMs) has emerged as a significant public health issue. While social impact assessments are required prior to the granting of new gaming machine licenses in Australia, there are few established techniques for estimating the spatial distribution of a venue’s clientele. To this end, we calibrated a Huff model of gambling venue catchments based on a geocoded postal survey (n = 7040). We investigated the impact of different venue attractiveness measures, distance measures, distance decay functions, levels of spatial aggregation and venue types on model fit and results. We then compared model estimates for different behavioural subgroups. Our calibrated spatial model is a significant improvement on previously published models, increasing R2 from 0.23 to 0.64. Venue catchments differ radically in size and intensity. As different population subgroups are attracted to different venues, there is no single best index of venue attractiveness applicable to all subpopulations. The calibrated Huff model represents a useful regulatory tool for predicting the extent and composition of gambling venue catchments. It may assist in decision-making with regard to new license applications and evaluating the impact of health interventions such as mandated reductions in EGM numbers. Our calibrated parameters may be used to improve model accuracy in other jurisdictions.
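The Huff model at the core of this calibration assigns each origin a probability of patronizing venue j proportional to the venue's attractiveness discounted by distance. A minimal sketch using a power decay function (the paper also tests other decay functions, distance measures, and attractiveness indices; names here are illustrative):

```python
import math

def huff_probabilities(origin, venues, beta=2.0):
    """Huff model: P(origin -> venue j) = A_j * d_ij**-beta / sum_k A_k * d_ik**-beta.
    origin: (x, y); venues: list of (x, y, attractiveness).
    Assumes the origin does not coincide exactly with any venue."""
    utils = []
    for (x, y, a) in venues:
        d = math.hypot(x - origin[0], y - origin[1])
        utils.append(a * d ** -beta)  # power distance decay
    total = sum(utils)
    return [u / total for u in utils]
```

Calibration then amounts to choosing beta (and the attractiveness measure) that best reproduces observed venue choices, e.g. from the geocoded survey.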

12.
Spatial aggregations of raster data based on the majority rule have been typically used in landscape ecological studies. It is generally acknowledged that (1) dominant classes increase in abundance while minor classes decrease in abundance or even disappear through aggregation processes; and (2) spatial patterns also change with aggregations. In this paper, we examined an alternative, random rule-based aggregation and its effects on cover type abundance and landscape patterns, in comparison with the majority rule-based aggregation. We aggregated classified TM imagery (about 1.5 million ha) from 30 m resolution (4231 × 3717 pixels) incrementally to 990 m resolution (132 × 116 pixels). Cover type proportion, mean patch size ratio, aggregation index (AI), and fractal dimension (FD) were used to assess the effects of aggregation. To compare landscapes under different resolutions, we assumed that the landscapes were least distorted if (1) the cover type proportions and mean patch size ratios among classes were maintained, and (2) all cover types responded in the same way for a given index as aggregation levels increased. For example, distortion is introduced by aggregation if some cover types increase their AI values with increasing aggregation levels while other cover types decrease. Our study indicated that the two spatial aggregation techniques led to different results in cover type proportions and altered spatial pattern in opposite ways. The majority rule-based aggregations caused distortions of cover type proportions and spatial patterns. Generally, the majority rule-based aggregation filtered out minor classes and produced clumped landscapes. Such landscape representations are relatively easy to interpret and are therefore suitable for land managers conceptualizing the spatial patterns of a study region. By contrast, the random rule-based aggregations maintained cover type proportions accurately, but tended to make spatial patterns change toward disaggregation. Overall, the measurements of the landscape indices used in this study indicated that
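The two aggregation rules being compared can be sketched for a small categorical raster as follows. This is a pure-Python illustration of the rules themselves, not the study's processing chain; the function name and seeded generator are assumptions for the example:

```python
import random
from collections import Counter

def aggregate(grid, k, rule="majority", rng=None):
    """Aggregate a 2-D categorical grid by an integer factor k.
    rule='majority' keeps the modal class of each k*k block (minor classes
    can vanish); rule='random' draws one cell of the block at random, which
    preserves class proportions in expectation. Grid dimensions are assumed
    to be multiples of k."""
    rng = rng or random.Random(0)
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows, k):
        row = []
        for c in range(0, cols, k):
            block = [grid[i][j] for i in range(r, r + k) for j in range(c, c + k)]
            if rule == "majority":
                row.append(Counter(block).most_common(1)[0][0])
            else:
                row.append(rng.choice(block))
        out.append(row)
    return out
```

On a grid where class "b" never holds a block majority, the majority rule eliminates it entirely, which is exactly the minor-class loss the abstract describes.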

13.
Abstract

Results of a simulation study of map-image rectification accuracy are reported. Sample size, spatial distribution pattern and measurement errors in a set of ground control points, and the computational algorithm employed to derive the estimate of the parameters of a least-squares bivariate map-image transformation function, are varied in order to assess the sensitivity of the procedure. Standard errors and confidence limits are derived for each of 72 cases, and it is shown that the effects of all four factors are significant. Standard errors fall rapidly as sample size increases, and rise as the control point pattern becomes more linear. Measurement error is shown to have a significant effect on both accuracy and precision. The Gram-Schmidt orthogonal polynomial algorithm performs consistently better than the Gauss-Jordan matrix inversion procedure in all circumstances.
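The least-squares bivariate transformation at the heart of such rectification can be illustrated with a first-order (affine) fit from ground control points via the normal equations. This sketch uses plain Gaussian elimination and does not reproduce the Gram-Schmidt orthogonal polynomial variant the study favours; names are illustrative:

```python
def solve3(A, b):
    """Gaussian elimination with partial pivoting for a 3x3 linear system."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for i in range(3):
        p = max(range(i, 3), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, 3):
            f = M[r][i] / M[i][i]
            for c in range(i, 4):
                M[r][c] -= f * M[i][c]
    x = [0.0] * 3
    for i in range(2, -1, -1):
        x[i] = (M[i][3] - sum(M[i][j] * x[j] for j in range(i + 1, 3))) / M[i][i]
    return x

def fit_affine(src, dst):
    """Least-squares fit of x' = a0 + a1*x + a2*y (and likewise for y') from
    ground-control-point pairs src[i] -> dst[i], via the normal equations.
    Needs at least three non-collinear control points."""
    def normal(targets):
        G = [[0.0] * 3 for _ in range(3)]
        h = [0.0] * 3
        for (x, y), t in zip(src, targets):
            row = (1.0, x, y)
            for i in range(3):
                h[i] += row[i] * t
                for j in range(3):
                    G[i][j] += row[i] * row[j]
        return solve3(G, h)
    return normal([d[0] for d in dst]), normal([d[1] for d in dst])
```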

14.
Decreasing population density is a current trend in the European Union, and causes a lower environmental impact on the landscape. However, besides the desirable effect on the regeneration processes of semi-natural forest ecosystems, the lack of traditional management techniques can also lead to detrimental ecological processes. In this study we investigated the land use pattern changes in a micro-region (in North-Eastern Hungary) between 1952 and 2005, based on vectorised land use data from archive aerial photos. We also evaluated the methodology of comparisons using GIS methods, fuzzy sets and landscape metrics. We found that both GIS methods and statistical analysis of landscape metrics resulted in more or less the same findings. Differences were not as relevant as was expected considering the general tendencies of the past 60 years in Hungary. The change in the annual rate of forest recovery was 0.12%; settlements extended their area by an annual rate of 3.04%, while grasslands and arable lands had a net loss in their area within the studied period (0.60% and 0.89%, respectively). The kappa index showed a smaller similarity (~60%) between these dates but the fuzzy kappa and the aggregation index, taking into account both spatial and thematic errors, gave a more reliable result (~70–80% similarity). Landscape metrics on patch and class level ensured the possibility of a detailed analysis. We arrived at a similar outcome but were able to verify all the calculations through statistical tests. With this approach we were able to reveal significant (p < 0.05) changes; however, effect sizes did not show large magnitudes. Comparing the methods of revealing landscape change, landscape metrics were the most effective approach, as they were independent of spatial errors and ensured multiple ways of interpretation.

15.
A Monte Carlo approach is used to evaluate the uncertainty caused by incorporating Post Office Box (PO Box) addresses in point‐cluster detection for an environmental‐health study. Placing PO Box addresses at the centroids of postcode polygons in conventional geocoding can introduce significant error into a cluster analysis of the point data generated from them. In the restricted Monte Carlo method I presented in this paper, an address that cannot be matched to a precise location is assigned a random location within the smallest polygon believed to contain that address. These random locations are then combined with the locations of precisely matched addresses, and the resulting dataset is used for performing cluster analysis. After repeating this randomization‐and‐analysis process many times, one can use the variance in the calculated cluster evaluation statistics to estimate the uncertainty caused by the addresses that cannot be precisely matched. This method maximizes the use of the available spatial information, while also providing a quantitative estimate of the uncertainty in that utilization. The method is applied to lung‐cancer data from Grafton County, New Hampshire, USA, in which the PO Box addresses account for more than half of the address dataset. The results show that less than 50% of the detected cluster area can be considered to have high certainty.
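The core of the restricted Monte Carlo step is drawing a uniform random location inside the smallest polygon believed to contain an imprecisely matched address. A self-contained sketch using rejection sampling over the polygon's bounding box (function names are illustrative; the paper's cluster statistics are not reproduced here):

```python
import random

def point_in_polygon(pt, poly):
    """Even-odd ray-casting test: is pt inside the polygon given as a list of
    (x, y) vertices in order?"""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            xcross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < xcross:
                inside = not inside
    return inside

def random_point_in_polygon(poly, rng):
    """Rejection-sample a uniform random location inside poly, e.g. the
    postcode polygon assigned to a PO Box address."""
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    while True:
        pt = (rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
        if point_in_polygon(pt, poly):
            return pt
```

Repeating the draw for every imprecise address, re-running the cluster analysis, and recording the spread of the resulting statistics gives the uncertainty estimate the abstract describes.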

16.
Abstract

Kriging is an optimal method of spatial interpolation that produces an error for each interpolated value. Block kriging is a form of kriging that computes averaged estimates over blocks (areas or volumes) within the interpolation space. If this space is sampled sparsely, and divided into blocks of a constant size, a variable estimation error is obtained for each block, with blocks near to sample points having smaller errors than blocks farther away. An alternative strategy for sparsely sampled spaces is to vary the sizes of blocks in such a way that a block's interpolated value is just sufficiently different from that of an adjacent block given the errors on both blocks. This has the advantage of increasing spatial resolution in many regions, and conversely reducing it in others where maintaining a constant size of block is unjustified (hence achieving data compression). Such a variable subdivision of space can be achieved by regular recursive decomposition using a hierarchical data structure. An implementation of this alternative strategy employing a split-and-merge algorithm operating on a hierarchical data structure is discussed. The technique is illustrated using an oceanographic example involving the interpolation of satellite sea surface temperature data. Consideration is given to the problem of error propagation when combining variable resolution interpolated fields in GIS modelling operations.

17.
Highly technological in vitro fertilization (IVF) treatment is available at relatively few medical centers in rural United States. This research derives a spatial accessibility surface for IVF centers in a rural Midwestern state through the application of computational methods that consider spatial and non-spatial parameters to discover potentially underserved areas in the state. These methods include a modified gravity model and techniques from spatial interaction modeling. The approach develops an enhanced accessibility index that incorporates three key sociodemographic variables describing patients seeking infertility healthcare in Iowa that have been identified based on a survey of IVF care practitioners in the state. Self-organizing map techniques are used to reveal cluster locations based on the degree of match between census sociodemographic data and the expert-identified variables. The spatial accessibility surface is combined with the sociodemographic clusters to define an enhanced measure of spatial accessibility. The results suggest that while the state's IVF centers are located in tracts characterized by high spatial accessibility, at least 19% of patients travel from census tracts classed as moderate to low accessibility. This result reveals some opportunities for service improvements for these locations. Interestingly, for tracts that are characterized as having a lower patient sociodemographic match, high spatial accessibility does not appear to be a factor that improves the likelihood of patient care, at least for the variables investigated as part of this research.
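A gravity model accessibility index of the kind underlying such a surface discounts each facility's service capacity by distance to the demand location. A minimal sketch of the basic (unmodified) form only; the function name, Euclidean distance, and the decay exponent are assumptions for the example, and the paper's sociodemographic enhancement is not reproduced:

```python
import math

def gravity_accessibility(demand_pts, facilities, beta=1.5):
    """For each demand point i, A_i = sum_j S_j / d_ij**beta: every facility's
    capacity S_j contributes, discounted by distance raised to the decay
    exponent beta. Assumes no demand point coincides exactly with a facility."""
    out = []
    for (x, y) in demand_pts:
        out.append(sum(s / math.hypot(fx - x, fy - y) ** beta
                       for (fx, fy, s) in facilities))
    return out
```

Larger index values mark better-served locations; tracts with low values are the candidate underserved areas.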

18.
Municipal fire departments responded to approximately 53,000 intentionally-set fires annually from 2003 to 2007, according to National Fire Protection Association figures. A disproportionate number of these fires occur in spatio-temporal clusters, making them predictable and, perhaps, preventable. The objective of this research is to evaluate how the aggregation of data across space and target types (residential, non-residential, vehicle, outdoor and other) affects daily arson forecast accuracy for several target types of arson, and the ability to leverage information quantifying the autoregressive nature of intentional firesetting. To do this, we estimate, for the city of Detroit, Michigan, competing statistical models that differ in their ability to recognize potential temporal autoregressivity in the daily count of arson fires. Spatial units vary from Census tracts, to police precincts, to citywide. We find that (1) the out-of-sample performance of prospective hotspot models for arson cannot usefully exploit the autoregressive properties of arson at fine spatial scales, even though autoregression is significant in-sample, hinting at a possible bias-variance tradeoff; (2) aggregation of arson across reported targets can yield a model that differs from by-target models; (3) spatial aggregation of data tends to increase forecast accuracy of arson due partly to the ability to account for temporally dynamic firesetting; and (4) arson forecast models that recognize temporal autoregression can be used to forecast daily arson fire activity at the citywide scale in Detroit. These results suggest a tradeoff between the collection of high resolution spatial data and the use of more sophisticated modeling techniques that explicitly account for temporal correlation.

19.
Abstract

Growth in the available quantities of digital geographical data has led to major problems in maintaining and integrating data from multiple sources, required by users at differing levels of generalization. Existing GIS and associated database management systems provide few facilities specifically intended for handling spatial data at multiple scales and require time-consuming manual intervention to control update and retain consistency between representations. In this paper the GEODYSSEY conceptual design for a multi-scale, multiple representation spatial database is presented and the results of experimental implementation of several aspects of the design are described. Object-oriented, deductive and procedural programming techniques have been applied in several contexts: automated update software, using probabilistic reasoning; deductive query processing using explicit stored semantic and spatial relations combined with geometric data; multiresolution spatial data access methods combining point, line, area and surface geometry; and triangulation-based generalization software that detects and resolves topological inconsistency.

20.
Assessing spatial autocorrelation (SA) of statistical estimates such as means is a common practice in spatial analysis and statistics. Popular SA statistics implicitly assume that the reliability of the estimates is irrelevant. Users of these SA statistics also ignore the reliability of the estimates. Using empirical and simulated data, we demonstrate that current SA statistics tend to overestimate SA when errors of the estimates are not considered. We argue that when assessing SA of estimates with error, one is essentially comparing distributions in terms of their means and standard errors. Using the concept of the Bhattacharyya coefficient, we proposed the spatial Bhattacharyya coefficient (SBC) and suggested that it should be used to evaluate the SA of estimates together with their errors. A permutation test is proposed to evaluate its significance. We concluded that the SBC more accurately and robustly reflects the magnitude of SA than traditional SA measures by incorporating errors of estimates in the evaluation. Key Words: American Community Survey, Geary ratio, Moran’s I, permutation test, spatial Bhattacharyya coefficient.
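For two normal distributions, the natural model for an estimate reported with a standard error, the Bhattacharyya coefficient has a closed form. The sketch below computes that pairwise coefficient, which is the building block the SBC is constructed from, not the full spatial statistic or its permutation test:

```python
import math

def bhattacharyya_normal(mu1, sd1, mu2, sd2):
    """Bhattacharyya coefficient between N(mu1, sd1^2) and N(mu2, sd2^2):
    BC = sqrt(2*sd1*sd2 / (sd1^2 + sd2^2)) * exp(-(mu1-mu2)^2 / (4*(sd1^2+sd2^2))).
    Equals 1.0 for identical distributions and tends to 0 as they separate."""
    v = sd1 ** 2 + sd2 ** 2
    return math.sqrt(2 * sd1 * sd2 / v) * math.exp(-((mu1 - mu2) ** 2) / (4 * v))
```

Two estimates with overlapping error bars thus score close to 1 even when their point values differ, which is how incorporating standard errors tempers the overestimation of SA described above.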
