Similar literature
 Found 20 similar articles (search time: 109 ms)
1.
2.
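An online geocoding API is used to geocode massive volumes of Chinese addresses quickly. On this basis, simple rules are applied to clean and flag the geocoding results, and a classification-and-optimization model based on hierarchical clustering and random forests is then used to classify and optimize the results from multiple platforms. The model was trained and validated on the addresses of theft cases in Guangzhou. The results show that, compared with the unprocessed geocoding results, the results optimized by the model have a smaller overall positional error distance. Gaode's (AMap's) geocoding service has the best geocoding quality, yet the mean Gaode geocoding error for the training sample is still as high as 590.43 m. After model optimization, the mean error of the training sample falls to 173.73 m, and the mean error of the validation sample falls from 554.88 m (Gaode) to 180.04 m, a reduction of 67.49%, with 90.08% of Gaode's anomalous geocoding results cleaned and optimized. The optimization effect is similar for the training and validation samples, and also similar for cases with different address types and for cases located in urban and suburban areas, which indicates that the model has a certain degree of generality. The model can conveniently and quickly convert massive socio-economic information into spatial data and improve geocoding accuracy, providing better data support for geographic big-data research.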
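The positional error distance reported in the abstract above can be illustrated with a small sketch. It assumes the error is the great-circle distance between a geocoded point and a reference point, both given as longitude/latitude in degrees; the abstract does not say how the distance was computed, and the function names `haversine_m` and `mean_error_m` are hypothetical.

```python
import math

# Mean Earth radius in metres; the spherical Earth model is an assumption of this sketch.
EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def mean_error_m(geocoded, reference):
    """Mean positional error over paired lists of (lon, lat) tuples."""
    distances = [haversine_m(*g, *r) for g, r in zip(geocoded, reference)]
    return sum(distances) / len(distances)
```

Comparing `mean_error_m` before and after a cleaning step gives the kind of before/after error figures the abstract quotes, provided both point sets use the same coordinate reference system.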

3.
In many applications of Geographical Information Systems (GIS) a common task is the conversion of addresses into grid coordinates. In many countries this is usually accomplished using address range TIGER-type files in conjunction with geocoding packages within a GIS. Improvements in GIS functionality and the storage capacity of large databases mean that the spatial investigation of data at the individual address level is now commonly performed. This process relies on the accuracy of the geocoding mechanism and this paper examines this accuracy in relation to cadastral records and census tracts. Results from a study of over 20 000 addresses in Sydney, Australia, using a TIGER-type geocoding process suggest that 5-7.5% (depending on geocoding method) of addresses may be misallocated to census tracts, and more than 50% may be given coordinates within the land parcel of a different property.

4.
A Monte Carlo approach is used to evaluate the uncertainty caused by incorporating Post Office Box (PO Box) addresses in point‐cluster detection for an environmental‐health study. Placing PO Box addresses at the centroids of postcode polygons in conventional geocoding can introduce significant error into a cluster analysis of the point data generated from them. In the restricted Monte Carlo method presented in this paper, an address that cannot be matched to a precise location is assigned a random location within the smallest polygon believed to contain that address. These random locations are then combined with the locations of precisely matched addresses, and the resulting dataset is used for performing cluster analysis. After repeating this randomization‐and‐analysis process many times, one can use the variance in the calculated cluster evaluation statistics to estimate the uncertainty caused by the addresses that cannot be precisely matched. This method maximizes the use of the available spatial information, while also providing a quantitative estimate of the uncertainty in that utilization. The method is applied to lung‐cancer data from Grafton County, New Hampshire, USA, in which the PO Box addresses account for more than half of the address dataset. The results show that less than 50% of the detected cluster area can be considered to have high certainty.
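The randomization-and-analysis loop described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: polygons are simple lists of (x, y) vertices, random locations are drawn by rejection sampling from the bounding box, and `statistic` is a hypothetical placeholder for whatever cluster evaluation statistic is used (the abstract does not specify one).

```python
import random
import statistics


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def random_point_in_polygon(polygon, rng):
    """Uniform random point inside the polygon, by rejection from its bounding box."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    while True:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, polygon):
            return (x, y)


def restricted_monte_carlo(exact_points, unmatched_polygons, statistic, runs=200, seed=0):
    """Repeat: place each unmatched address at random in its polygon, then evaluate.

    Returns the mean and sample variance of the statistic over all runs;
    the variance is the uncertainty estimate.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(runs):
        simulated = [random_point_in_polygon(poly, rng) for poly in unmatched_polygons]
        values.append(statistic(list(exact_points) + simulated))
    return statistics.mean(values), statistics.variance(values)
```

A larger variance across runs indicates that the result depends heavily on where the imprecisely located addresses happen to fall.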

5.
Geocoding is an uncertain process that associates an address or a place name with geographic coordinates. Traditionally, geocoding is performed locally on a stand-alone computer with the geocoding tools usually bundled in GIS software packages. The use of such tools requires skillful operators who know about the issues of geocoding, that is, reference databases and complicated geocoding interpolation techniques. These days, with the advancement in the Internet and Web services technologies, online geocoding provides its functionality to the Internet users with ease; thus, they are often unaware of such issues. With an increasing number of online geocoding services, which differ in their reference databases, the geocoding algorithms, and the strategy for dealing with inputs and outputs, it is crucial for the service requestors to realize the quality of the geocoded results of each service before choosing one for their applications. This is primarily because any errors associated with the geocoded addresses will be propagated to subsequent decisions, activities, modeling, and analysis. This article examines the quality of five online geocoding services: Geocoder.us, Google, MapPoint, MapQuest, and Yahoo!. The quality of each geocoding service is evaluated with three metrics: match rate, positional accuracy, and similarity. A set of addresses from the US Environmental Protection Agency (EPA) database were used as a baseline. The results were statistically analyzed with respect to different location characteristics. The outcome of this study reveals the differences among the online geocoding services on the quality of their geocoding results and it can be used as a general guideline for selecting a suitable service that matches an application's needs.

6.
7.
Summary: This paper addresses the digital dissemination of geographically referenced census information in the UK. A number of important weaknesses in the 1991 model of data access are identified, and the possibility of future access to census information via the World Wide Web is then addressed in detail. Two case studies demonstrate the potential to overcome some fundamental weaknesses in earlier access models, including the provision of integrated data and metadata, graphical interfaces to geographical datasets, and an integrated interface and analysis environment.

8.
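张青年. 《地理研究》 (Geographical Research), 2001, 20(5): 629-636
In geographic information systems (GIS), basic graphic elements such as points, lines and polygons not only represent geographic phenomena directly but also implicitly represent their spatial structure, for example patch structures and checkerboard structures. Many GIS applications need to generalize graphic data in order to derive smaller-scale datasets or maps. Because the spatial structure of geographic phenomena is an important reflection of geographic regularities and landscape patterns, the spatial structures implicit in the database need to be recognized and deliberately preserved in the generalized database.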

9.
Quantitative research in urban geography has benefited greatly from the rapid development of big geo-data. Spatial assembly is an essential analytical step for summarizing and perceiving the geographical environment from individual behaviours. Most research focuses on the methodology of how to utilize the big data, while the spatial units adopted for data aggregation remain areal in nature. This article conceptually proposes the idea of sensing cities from a street perspective and emphasizes the significance of street units in quantitative urban studies. Using a three-month taxi trajectory dataset and the major streets in Beijing, we explore the spatio-temporal patterns of urban mobility on streets and cluster streets into nine types based on their dynamic functions and capacities. Additionally, we discuss the differences and connections between the linear street unit and traditional areal units, investigate the possibility of uncovering urban communities using streets, and point out the complexity of streets. We conclude that the street unit, as a supplement to areal units, can effectively reduce the modifiable areal unit problem (MAUP), sense urban dynamics, depict urban functions, and help understand urban structures.

10.
The proliferation of geographic information systems and point data has made the analysis of spatial point patterns of increasing interest in a variety of disciplines. Though early forms of spatial point pattern analysis were limited in their scope, current forms have been developed that provide significant insight into underlying data generating processes. This paper builds on the spatial point pattern analysis literature through the development of a nonparametric Monte Carlo spatial point pattern test (and corresponding index) to measure the degree of similarity between two spatial point patterns. The applicability of this new test is then shown using crime data.

11.
Police databases hold a large amount of crime data that could be used to inform us about current and future crime trends and patterns. Predictive analysis aims to optimize the use of these data to anticipate criminal events. It utilizes specific statistical methods to predict the likelihood of new crime events at small spatiotemporal units of analysis. The aim of this study is to investigate the potential of applying predictive analysis in an urban context. To this end, the available crime data for three types of crime (home burglary, street robbery, and battery) are spatially aggregated to grids of 200 by 200 m and retrospectively analyzed. An ensemble model is applied, synthesizing the results of a logistic regression and neural network model, resulting in bi-weekly predictions for 2014, based on crime data from the previous three years. Temporally disaggregated (day versus night predictions) monthly predictions are also made. The quality of the predictions is evaluated based on the following criteria: direct hit rate (proportion of incidents correctly predicted), precision (proportion of correct predictions versus the total number of predictions), and prediction index (ratio of direct hit rate versus proportion of total area predicted as high risk). Results indicate that it is possible to attain functional predictions by applying predictive analysis to grid-level crime data. The monthly predictions with a distinction between day and night produce better results overall than the bi-weekly predictions, indicating that the temporal resolution can have an important impact on the prediction performance.
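The three evaluation criteria named in the abstract above can be written out as a short sketch. It follows the parenthetical definitions in the abstract, with two assumptions of its own: a predicted grid cell counts as a correct prediction when at least one incident falls in it, and all grid cells have equal area. The function name and argument layout are hypothetical.

```python
def evaluate_predictions(predicted_cells, incident_cells, total_cells):
    """Direct hit rate, precision and prediction index for one prediction period.

    predicted_cells: ids of grid cells flagged as high risk
    incident_cells:  one cell id per recorded incident (repeats allowed)
    total_cells:     number of grid cells in the study area
    """
    predicted = set(predicted_cells)
    # Direct hit rate: share of incidents that fell inside a predicted cell.
    hits = sum(1 for cell in incident_cells if cell in predicted)
    hit_rate = hits / len(incident_cells)
    # Precision: share of predicted cells that saw at least one incident (assumption).
    precision = len(predicted & set(incident_cells)) / len(predicted)
    # Prediction index: hit rate relative to the share of area flagged as high risk.
    area_share = len(predicted) / total_cells
    return {
        "hit_rate": hit_rate,
        "precision": precision,
        "prediction_index": hit_rate / area_share,
    }
```

For example, flagging 4 of 100 cells and capturing 3 of 5 incidents gives a hit rate of 0.6 on 4% of the area, so the prediction index is about 15.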

12.
This paper examines the spatial and social distribution of the fear of crime and the relationships of such fear with aspects of the environment. Through an analysis of a questionnaire survey conducted in a variety of areas in Stoke-on-Trent in the English Midlands, it considers both the causes of the fear of crime and the associations that have been identified with other dimensions shaping vulnerability. It concludes by offering some guidance on how to address the differences between those populations who fear crime most and those who are most vulnerable.

13.
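张艳林, 李敏, 刘宇文, 李佳, 侯钰婧. 《地理科学》 (Scientia Geographica Sinica), 2022, 42(6): 993-1004
Based on the assumption that the home address in school enrolment records carries a pupil's spatial location, the home addresses of primary school pupils in Zhuzhou County, Hunan Province, were collected from enrolment records. Using the geocoding and POI search services of the Gaode (AMap) open platform, the spatial locations and distribution of the pupils were obtained, and the spatial accessibility of primary education resources in Zhuzhou County and its characteristics were analysed with shortest-path analysis and the Gaussian two-step floating catchment area method, in an attempt to provide a new data source and methodological reference for analysing the spatial balance of regional education resources and planning their allocation. The results show that: ① enrolment addresses combined with geocoding can capture the spatial distribution of primary school pupils in Zhuzhou County fairly accurately. ② The maximum, mean and median distances to the nearest school are 11.83 km, 2.10 km and 1.81 km respectively, and only 55.46% of pupils live within 2.0 km of their nearest school, which poses a challenge for allocating education resources in a way that balances equity and efficiency. ③ In the northern urban area, where schools are more numerous, the mean nearest-school distance is small and spatial accessibility is generally high, with small spatial differences and good balance; in the rural south-east, the mean nearest-school distance is large and spatial accessibility is generally low, with large spatial differences. ④ Scenario analysis shows that, without causing local enrolment stability problems, adding three schools would greatly improve the mean nearest-school distance and spatial accessibility in the south-east: the mean distances in Longtan Town and Longmen Town would fall from 3784 m and 3520 m to 3116 m and 2636 m, and spatial accessibility would rise from 0.0492 and 0.0982 to 0.0762 and 0.1496 respectively.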
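The Gaussian two-step floating catchment area method mentioned in the abstract above can be sketched as below. The abstract gives no formulas, so this uses one commonly cited form of the Gaussian decay weight and a single distance threshold `d0`; both are assumptions, as are the function names and the dictionary-based inputs.

```python
import math


def gaussian_weight(d, d0):
    """Gaussian distance decay: 1 at d = 0, falling to 0 at d >= d0 (assumed form)."""
    if d > d0:
        return 0.0
    return (math.exp(-0.5 * (d / d0) ** 2) - math.exp(-0.5)) / (1 - math.exp(-0.5))


def gaussian_2sfca(supply, demand, dist, d0):
    """Accessibility score for each demand location.

    supply: {school_id: capacity}
    demand: {location_id: number of pupils}
    dist:   {(location_id, school_id): travel distance}
    d0:     catchment threshold, in the same unit as the distances
    """
    # Step 1: for each school, capacity divided by the distance-weighted demand it serves.
    ratio = {}
    for j, capacity in supply.items():
        weighted_demand = sum(gaussian_weight(dist[(i, j)], d0) * p for i, p in demand.items())
        ratio[j] = capacity / weighted_demand if weighted_demand > 0 else 0.0
    # Step 2: for each location, sum the distance-weighted ratios of reachable schools.
    return {
        i: sum(gaussian_weight(dist[(i, j)], d0) * ratio[j] for j in supply)
        for i in demand
    }
```

In this sketch the distances would come from a shortest-path analysis on the road network, and a scenario such as adding schools amounts to adding entries to `supply` and `dist` and recomputing.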

14.
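Spatial non-stationarity analysis of the factors influencing provincial crime rates in China   Cited 4 times in total (self-citations: 2, citations by others: 2)
严小兵. 《地理科学进展》 (Progress in Geography), 2013, 32(7): 1159-1166
The income gap and the floating (migrant) population are two important factors affecting crime rates. Previous studies used OLS models and analysed the effect of these factors on crime rates under the assumption that geographic space is homogeneous. In reality, spatial units rarely satisfy this assumption and are mostly spatially heterogeneous. Estimating spatially heterogeneous relationships with OLS biases the estimates and hides how the effects differ between spatial units. The geographically weighted regression (GWR) model addresses this estimation problem by embedding spatial structure in the linear regression model. GWR is applied here to the factors influencing crime rates in the provincial units of mainland China in 2008. The results show that: ① the factors influencing crime rates exhibit spatial non-stationarity; the floating population is significantly correlated with the crime rate, but the strength of the correlation differs between provinces and the relationship varies with spatial location; ② the estimation accuracy and goodness of fit of the GWR model are substantially higher than those of the OLS model.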
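The idea behind geographically weighted regression in the abstract above, fitting a separate regression at each location with nearby observations weighted more heavily, can be illustrated with a deliberately reduced sketch. It assumes a single predictor, a Gaussian spatial kernel and a fixed bandwidth, none of which is stated in the abstract; `gwr_local_fit` is a hypothetical name.

```python
import math


def gwr_local_fit(target, sites, x, y, bandwidth):
    """Local weighted least squares y = a + b*x centred on `target`.

    target:    (x, y) coordinates of the location being estimated
    sites:     list of (x, y) coordinates, one per observation
    x, y:      predictor and response values, one per observation
    bandwidth: kernel bandwidth in the same unit as the coordinates (assumed fixed)
    Returns (intercept, slope) for this location.
    """
    # Gaussian kernel: observations far from the target get weights near zero.
    w = [math.exp(-0.5 * (math.dist(target, s) / bandwidth) ** 2) for s in sites]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return my - slope * mx, slope
```

Calling the function once per province centroid yields a coefficient per location, and a slope that changes from place to place is what the abstract calls spatial non-stationarity. A global OLS fit would return a single slope for all locations.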

15.
The use of geographically referenced point data, such as that obtained from global positioning systems (GPS), is rapidly increasing. However, due to error and uncertainty inherent in most geographic datasets, the ability to accurately associate these point locations with other layers of geographic data is still a challenge. One difficulty in particular is how to associate spatially and temporally referenced point-based observations of a network activity with a network topology such that a continuous network path can be best inferred. In this article, an optimization method for inferring a network path from a temporal sequence of point observations of location is presented. An application to GPS data is provided to highlight various characteristics of the proposed modeling approach relative to several other available techniques.

16.
ABSTRACT

Address matching is a crucial step in geocoding, which plays an important role in urban planning and management. To date, the unprecedented development of location-based services has generated a large amount of unstructured address data. Traditional address matching methods mainly focus on the literal similarity of address records and are therefore not applicable to the unstructured address data. In this study, we introduce an address matching method based on deep learning to identify the semantic similarity between address records. First, we train the word2vec model to transform the address records into their corresponding vector representations. Next, we apply the enhanced sequential inference model (ESIM), a deep text-matching model, to make local and global inferences to determine if two addresses match. To evaluate the accuracy of the proposed method, we fine-tune the model with real-world address data from the Shenzhen Address Database and compare the outputs with those of several popular address matching methods. The results indicate that the proposed method achieves a higher matching accuracy for unstructured address records, with its precision, recall, and F1 score (i.e., the harmonic mean of precision and recall) reaching 0.97 on the test set.
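The evaluation measures named at the end of the abstract above are straightforward to compute for a binary match/no-match task. The sketch below assumes predictions and ground-truth labels are given as parallel sequences of truthy/falsy values; the function name is hypothetical and the code says nothing about how the matching model itself works.

```python
def precision_recall_f1(predicted, actual):
    """Precision, recall and F1 for paired binary labels (truthy = addresses match)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)      # true positives
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)  # false positives
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Because F1 is a harmonic mean, it can only reach a high value when precision and recall are both high.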

17.
ABSTRACT

Sporting events attract high volumes of people, which in turn leads to increased use of social media. In addition, research shows that sporting events may trigger violent behavior that can lead to crime. This study analyses the spatial relationships between crime occurrences, demographic, socio-economic and environmental variables, together with geo-located Twitter messages and their ‘violent’ subsets. The analysis compares basketball and hockey game days and non-game days. Moreover, this research aims to analyze crime prediction models using historical crime data as a basis and then introducing tweets and additional variables in their role as covariates of crime. First, this study investigates the spatial distribution of and correlation between crime and tweets during the same temporal periods. Feature selection models are applied in order to identify the best explanatory variables. Then, we apply a localized kernel density estimation model for crime prediction during basketball and hockey games, and on non-game days. Findings from this study show that Twitter data, and a subset of violent tweets, are useful in building prediction models for the seven investigated crime types for home and away sporting events, and non-game days, with different levels of improvement.

18.
The original purpose of addresses was to enable the correct and unambiguous delivery of postal mail. The advent of computers and more specifically geographic information systems (GIS) opened up a whole new range of possibilities for the use of addresses, such as routing and vehicle navigation, spatial demographic analysis, geo‐marketing, and service placement and delivery. Such functionality requires a database which can store and access spatial data effectively. In this paper we present address databases and justify the need for national address databases. We describe models used for national address databases, and present our evaluation framework for an address database at a national level within the context of a spatial data infrastructure (SDI). The models of data harvesting, federated databases and data grids are analyzed and evaluated according to our novel framework, and we show that the data grid model has some unique features that make it attractive for a national address database in an environment where centralized control and/or coordination is difficult or undesirable.

19.
Chinese address segmentation is a serious challenge in geographic information system geocoding. Most previous studies have relied on predefined gazetteers without considering the information contained by a raw address corpus. In this paper, a hybrid method employing both rule-based and statistical methods is proposed for Chinese address segmentation without a predefined gazetteer. This approach utilizes statistical methods to extract address information from a raw address corpus and a rule-based method to segment Chinese addresses. Two typical statistical methods and their combinations with rule-based methods are compared with the hybrid method in an experiment involving approximately 460,000 address items in Shenzhen City, China. The experimental results indicate that the proposed method achieves an F-score of over 0.8, which is better than those of existing methods, thus validating the proposed method.

20.
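修文群. 《地理研究》 (Geographical Research), 2006, 25(5): 939-948
The structural contradiction between rapidly growing cybercrime on the one hand and limited police resources and manual monitoring on the other is becoming increasingly acute. Given that cybercrime is widespread, concealed and not bound by time or place, advanced technical means urgently need to be developed and applied to build a "cybercrime spatial management system", so that the fight against cybercrime moves from passive response to sudden incidents towards targeted monitoring and active prevention. Starting from the practical needs of public security network-monitoring departments, and with a geographic information system at its core, combined with web search and IP tracing technologies, a "cybercrime spatial database" is built and the related spatial data mining is carried out to explore the spatial structure and spatial behaviour of cybercrime elements and their interaction with the environment, in order to formulate countermeasures for combating and preventing cybercrime.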
