Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
ABSTRACT

The investigation of human activity patterns from location-based social networks like Twitter is an established approach to inferring relationships and latent information that characterize urban structures. Researchers from various disciplines have performed geospatial analysis on social media data despite the data’s high dimensionality, complexity and heterogeneity. However, user-generated datasets are of a multi-scale nature, which limits the applicability of commonly known geospatial analysis methods. Therefore, in this paper, we propose a geographic, hierarchical self-organizing map (Geo-H-SOM) to analyze geospatial, temporal and semantic characteristics of georeferenced tweets. The results of our method, which we validate in a case study, demonstrate the ability to explore, abstract and cluster high-dimensional geospatial and semantic information from crowdsourced data.

2.
The remarkable success of online social media sites marks a shift in the way people connect and share information. Much of this information now contains some form of geographical content because of the proliferation of location-aware devices, thus fostering the emergence of geosocial media – a new type of user-generated geospatial information. Through geosocial media we are able, for the first time, to observe human activities at scales and resolutions that were so far unavailable. Furthermore, the wide spectrum of social media data and service types provides a multitude of perspectives on real-world activities and happenings, thus opening new frontiers in geosocial knowledge discovery. However, gleaning knowledge from geosocial media is a challenging task, as they tend to be unstructured and thematically diverse. To address these challenges, this article presents a system prototype for harvesting, processing, modeling, and integrating heterogeneous social media feeds towards the generation of geosocial knowledge. Our article addresses primarily two key components of this system prototype: a novel data model for heterogeneous social media feeds and a corresponding general system architecture. We present these key components and demonstrate their implementation in our system prototype, GeoSocial Gauge.

3.
In machine learning, one often assumes the data are independent when evaluating model performance. However, this rarely holds in practice. Geographic information datasets are an example where the data points have stronger dependencies among each other the closer they are geographically. This phenomenon, known as spatial autocorrelation (SAC), causes the standard cross validation (CV) methods to produce optimistically biased prediction performance estimates for spatial models, which can result in increased costs and accidents in practical applications. To overcome this problem, we propose a modified version of the CV method called spatial k-fold cross validation (SKCV), which provides a useful estimate for model prediction performance without optimistic bias due to SAC. We test SKCV with three real-world cases involving open natural data, showing that the estimates produced by the ordinary CV are up to 40% more optimistic than those of SKCV. Both regression and classification cases are considered in our experiments. In addition, we show how the SKCV method can be applied as a criterion for selecting data sampling density for a new research area.
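The abstract does not spell out how SKCV splits the data, so the following is only a minimal sketch of one way to remove SAC-driven optimism from k-fold CV: training points that lie within a chosen radius of any test point are dropped from that fold's training set. The function name `skcv_folds`, the parameter `dead_zone_radius`, the interleaved fold assignment and the use of planar coordinates are all assumptions made for illustration, not details taken from the paper.

```python
import math


def skcv_folds(coords, k, dead_zone_radius):
    """Yield (train_idx, test_idx) pairs for a spatially buffered k-fold split.

    coords: list of (x, y) tuples in a planar coordinate system (assumption).
    k: number of folds.
    dead_zone_radius: training points at or within this distance of any
        test point are excluded from training (hypothetical parameter name).
    """
    n = len(coords)
    # Simple interleaved fold assignment; a real study might shuffle first.
    folds = [list(range(i, n, k)) for i in range(k)]
    for test_idx in folds:
        test_set = set(test_idx)
        train_idx = [
            j
            for j in range(n)
            if j not in test_set
            and all(
                math.dist(coords[j], coords[t]) > dead_zone_radius
                for t in test_idx
            )
        ]
        yield train_idx, test_idx
```

With `dead_zone_radius = 0` and distinct points, this reduces to ordinary k-fold CV; increasing the radius trades training-set size for less spatial leakage between training and test data.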

4.
Existing urban boundaries are usually defined by government agencies for administrative, economic, and political purposes. However, it is not clear whether the boundaries truly reflect human interactions with urban space in intra- and interregional activities. Defining urban boundaries that consider socioeconomic relationships and citizen commute patterns is important for many aspects of urban and regional planning. In this paper, we describe a method to delineate urban boundaries based upon human interactions with physical space inferred from social media. Specifically, we depicted the urban boundaries of Great Britain using a mobility network of Twitter user spatial interactions, which was inferred from over 69 million geo-located tweets. We define the non-administrative anthropographic boundaries in a hierarchical fashion based on different physical movement ranges of users derived from the collective mobility patterns of Twitter users in Great Britain. The results of strongly connected urban regions in the form of communities in the network space yield geographically cohesive, nonoverlapping urban areas, which provide a clear delineation of the non-administrative anthropographic urban boundaries of Great Britain. The method was applied to both national (Great Britain) and municipal scales (the London metropolis). While our results corresponded well with the administrative boundaries, many unexpected and interesting boundaries were identified. Importantly, as the depicted urban boundaries exhibited a strong instance of spatial proximity, we employed a gravity model to understand the distance decay effects in shaping the delineated urban boundaries. The model explains how geographical distances found in the mobility patterns affect the interaction intensity among different non-administrative anthropographic urban areas, which provides new insights into human spatial interactions with urban space.
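The abstract does not state which form of gravity model was fitted. As an illustration only, the sketch below assumes the classic power-law form T_ij = G · m_i · m_j / d_ij^β and estimates the distance-decay exponent β by ordinary least squares on the log-transformed equation. The function name `fit_distance_decay` and the input layout are hypothetical.

```python
import math


def fit_distance_decay(flows):
    """Fit T = G * m_i * m_j / d**beta by log-linear least squares.

    flows: list of (mass_i, mass_j, distance, observed_flow) tuples with
        strictly positive values (assumed input layout).
    Returns (G, beta).
    """
    # log(T / (m_i * m_j)) = log(G) - beta * log(d)
    xs = [math.log(d) for _, _, d, _ in flows]
    ys = [math.log(t / (mi * mj)) for mi, mj, _, t in flows]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope
```

A larger fitted β means interaction intensity falls off more steeply with distance, which is the distance-decay effect the abstract refers to.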

5.
As they increase in popularity, social media are regarded as important sources of information on geographical phenomena. Studies have also shown that people rely on social media to communicate during disasters and emergency situations, and that the exchanged messages can be used to get an insight into the situation. Spatial data mining techniques are one way to extract relevant information from social media. In this article, our aim is to contribute to this field by investigating how graph clustering can be applied to support the detection of geo-located communities in Twitter in disaster situations. For this purpose, we have enhanced the fast-greedy optimization of modularity (FGM) clustering algorithm with semantic similarity so that it can deal with the complex social graphs extracted from Twitter. Then, we have coupled the enhanced FGM with the varied density-based spatial clustering of applications with noise spatial clustering algorithm to obtain spatial clusters at different temporal snapshots. The method was tested in a case study on Typhoon Haiyan in the Philippines, and Twitter’s different interaction modes were compared to create the graph of users and to detect communities. The experiments show that communities that are relevant to identify areas where disaster-related incidents were reported can be extracted, and that the enhanced algorithm outperforms the generic one in this task.

6.
In this article we analyze a well-known and extensively researched problem: how to find all datasets, on the one hand, and on the other hand only those that are of value to the user when dealing with a specific spatially oriented task. In analogy with existing approaches to a similar problem from other fields of human endeavor, we call this software solution ‘a spatial data recommendation service.’ In its final version, this service should be capable of matching requests created in the user's mind with the content of the existing datasets, while taking into account the user's preferences obtained from the user's previous use of the service. As a result, the service should recommend a list of datasets best suited to the user's needs. In this regard, we consider metadata, particularly natural language definitions of spatial entities, a crucial piece of the solution. To be able to use this information in the process of matching the user's request with the dataset content, this information must be semantically preprocessed. To automate this task we have applied a machine learning approach. With inductive logic programming (ILP) our system learns rules that identify and extract values for the five most frequent relations/properties found in Slovene natural language definitions of spatial entities. The initially established quality criterion for identifying and extracting information was met in three out of five examples. Therefore we conclude that ILP offers a promising approach to developing an information extraction component of a spatial data recommendation service.

7.
Environmental simulation models need automated geographic data reduction methods to optimize the use of high-resolution data in complex environmental models. Advanced map generalization methods have been developed for multiscale geographic data representation. In the case of map generalization, positional, geometric and topological constraints are focused on to improve map legibility and communication of geographic semantics. In the context of environmental modelling, in addition to the spatial criteria, domain criteria and constraints also need to be considered. Currently, due to the absence of domain-specific generalization methods, modellers resort to ad hoc methods of manual digitization or use cartographic methods available in off-the-shelf software. Such manual methods are not feasible solutions when large data sets are to be processed, thus limiting modellers to the single-scale representations. Automated map generalization methods can rarely be used with confidence because simplified data sets may violate domain semantics and may also result in suboptimal model performance. For best modelling results, it is necessary to prioritize domain criteria and constraints during data generalization. Modellers should also be able to automate the generalization techniques and explore the trade-off between model efficiency and model simulation quality for alternative versions of input geographic data at different geographic scales. Based on our long-term research with experts in the analytic element method of groundwater modelling, we developed the multicriteria generalization (MCG) framework as a constraint-based approach to automated geographic data reduction. The MCG framework is based on the spatial multicriteria decision-making paradigm since multiscale data modelling is too complex to be fully automated and should be driven by modellers at each stage. 
Apart from a detailed discussion of the theoretical aspects of the MCG framework, we discuss two groundwater data modelling experiments that demonstrate how MCG is not just a framework for automated data reduction, but an approach for systematically exploring model performance at multiple geographic scales. Experimental results clearly indicate the benefits of MCG-based data reduction and encourage us to continue expanding the scope of MCG and implementing it for multiple application domains.

8.
The importance of urban growth processes and their spatial characteristics has led to a growing interest in monitoring these phenomena. Spatial metrics are widely employed for this purpose, appearing in an increasing number of studies where they are used to characterise growth patterns and their evolution over time. This paper presents an analysis of urban growth patterns using spatial metrics in the Algarve (southern Portugal), an area of considerable urban dynamics associated with tourism. Two datasets were used (CORINE 1:100,000 maps and COS 1:25,000 maps) and two time periods (1990 and 2006–2007) in order to compare the different urban land use patterns detected and their evolution over time. The results show differences in urban land use patterns and associated processes at each scale, with stable land use patterns predominating at the 1:100,000 scale whereas the 1:25,000 scale showed a move towards more dispersed patterns. These results have enabled an assessment of the principal differences in urban growth patterns observed at both scales, and the implications for planning these entail.

9.
The movements of ideas and content between locations and languages are unquestionably crucial concerns to researchers of the information age, and Twitter has emerged as a central, global platform on which hundreds of millions of people share knowledge and information. A variety of research has attempted to harvest locational and linguistic metadata from tweets to understand important questions related to the 300 million tweets that flow through the platform each day. Much of this work is carried out with only limited understandings of how best to work with the spatial and linguistic contexts in which the information was produced, however. Furthermore, standard, well-accepted practices have yet to emerge. As such, this article studies the reliability of key methods used to determine language and location of content in Twitter. It compares three automated language identification packages to Twitter's user interface language setting and to a human coding of languages to identify common sources of disagreement. The article also demonstrates that in many cases user-entered profile locations differ from the physical locations from which users are actually tweeting. As such, these open-ended, user-generated profile locations cannot be used as useful proxies for the physical locations from which information is published to Twitter.

10.
Global multi-layer network of human mobility   Total citations: 2, self-citations: 0, citations by others: 2
The recent availability of geo-localized data capturing individual human activity, together with the statistical data on international migration, has opened up unprecedented opportunities for the study of global mobility. In this paper, we consider it from the perspective of a multi-layer complex network, built using a combination of three datasets: Twitter, Flickr and official migration data. Those datasets provide different, but equally important insights on the global mobility – while the first two highlight short-term visits of people from one country to another, the last one – migration – shows the long-term mobility perspective, when people relocate for good. The main purpose of the paper is to emphasize the importance of this multi-layer approach capturing both aspects of human mobility at the same time. On the one hand, we show that although the general properties of different layers of the global mobility network are similar, there are important quantitative differences among them. On the other hand, we demonstrate that consideration of mobility from a multi-layer perspective can reveal important global spatial patterns in a way more consistent with those observed in other available relevant sources of international connections, in comparison to the spatial structure inferred from each network layer taken separately.

11.
Social Network Analysis offers powerful tools to analyze the structure of relationships between a set of people. However, the addition of spatial information poses new challenges, as nodes are embedded simultaneously in network space and Euclidean space. While nearby nodes may not form social ties, ties may exist at a distance, a configuration ill-suited for traditional spatial metrics that assume adjacent objects are related. As such, there are relatively few metrics to describe these nuanced situations. We advance the burgeoning field of spatial social network analysis by introducing a set of new metrics. Specifically, we introduce the spatial social network schema, tuning parameter and the flattening ratio, each of which leverages the notion of ‘distance’ to augment insights obtained by relying on topology alone. These methods are used to answer the questions: What is the social and spatial structure of the network? Who are the key individuals at different spatial scales? We use two synthetic networks with properties mimicking the ones reported in the literature as validation datasets and a case study of an employer–employee network. The methods characterize the employer–employee network as spatially loose with predominantly local connections and identify key individuals responsible for keeping the network connected at different spatial scales.

12.
The spatial hierarchy of part-whole relationships is an essential characteristic of the platial world. Constructing spatial hierarchies of places is valuable in association analysis and qualitative spatial reasoning. The emergence of large amounts of geotagged user-generated content provides strong support for modelling places. However, the vague nature of places and the complex spatial relationships among places make it intractable to understand and represent the hierarchies among places. In this paper, we introduce a fuzzy formal concept analysis-based approach to uncovering the spatial hierarchies among vague places. Each place is represented as a concept that consists of its extent and its intent. Based on the place concepts, the spatial hierarchies are generated and expressed as a graph that is easy to comprehend and contains abundant information on spatial relations. We also demonstrate the rationality of our result by comparing it with the result of a questionnaire survey.

13.
ABSTRACT

Crime often clusters in space and time. Near-repeat patterns improve understanding of crime communicability and their space–time interactions. Near-repeat analysis requires extensive computing resources for the assessment of statistical significance of space–time interactions. A computationally intensive Monte Carlo simulation-based approach is used to evaluate the statistical significance of the space–time patterns underlying near-repeat events. Currently available software for identifying near-repeat patterns is not scalable for large crime datasets. In this paper, we show how parallel spatial programming can help to leverage spatio-temporal simulation-based analysis in large datasets. A parallel near-repeat calculator was developed and a set of experiments were conducted to compare the newly developed software with an existing implementation, assess the performance gain due to parallel computation, test the scalability of the software to handle large crime datasets and assess the utility of the new software for real-world crime data analysis. Our experimental results suggest that efficiently designed parallel algorithms that leverage high-performance computing along with performance optimization techniques could be used to develop software that is scalable with large datasets and could provide solutions for computationally intensive statistical simulation-based approaches in crime analysis.
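To make the Monte Carlo step concrete, here is a minimal serial sketch of a permutation test for space–time clustering in the style of a Knox test. It is not the authors' parallel calculator: the function names, the single space and time band, and the choice to shuffle event times are assumptions made for illustration. Because each simulation is independent of the others, this loop is the part that a parallel implementation could distribute across workers.

```python
import math
import random


def count_close_pairs(points, times, space_band, time_band):
    """Count event pairs that are close in both space and time."""
    count = 0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            near_in_space = math.dist(points[i], points[j]) <= space_band
            near_in_time = abs(times[i] - times[j]) <= time_band
            if near_in_space and near_in_time:
                count += 1
    return count


def near_repeat_p_value(points, times, space_band, time_band, n_sim=999, seed=0):
    """Return (observed_count, pseudo_p_value) from a permutation test.

    Event times are shuffled across locations to break any space-time
    interaction while keeping the spatial and temporal margins fixed.
    """
    rng = random.Random(seed)
    observed = count_close_pairs(points, times, space_band, time_band)
    shuffled = list(times)
    at_least_as_extreme = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)
        if count_close_pairs(points, shuffled, space_band, time_band) >= observed:
            at_least_as_extreme += 1
    # The +1 terms count the observed arrangement as one of the simulations.
    return observed, (at_least_as_extreme + 1) / (n_sim + 1)
```

The pairwise count is quadratic in the number of events and is repeated for every simulation, which illustrates why the analysis becomes expensive on large crime datasets.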

14.
Tracking technologies are able to provide high-resolution movement data that can advance research in different fields, such as tourism management. In this specific field, developing methods to extract moving flock patterns from such data is particularly relevant to improving our knowledge of the nature of recreational use interactions, which is crucial for good management of attractions and for designing sustainable development policies. However, ‘flocking’ has usually been associated with the collective movement of large groups of birds, fish, insects and certain mammals. Very few research efforts have been devoted to finding flock patterns associated with pedestrian movement. In this work, we propose a moving flock pattern definition and a corresponding extraction algorithm based on the notion of collective coherence. We use the term collective coherence to refer to the spatial closeness over some time duration with a minimum number of members. Furthermore, we evaluate the proposed algorithm by applying it to two different pedestrian movement datasets, which have been gathered from visitors of two recreational parks. The results show that the algorithm is capable of extracting moving flock patterns, disqualifying the patterns with flock members that remain stationary in a common place during the considered time interval.

15.
This paper presents methods to evaluate the geometric quality of spatial data. First, a point‐based method is presented, adapting conventional assessment methods whereby common points between datasets are compared. In our approach, initial matches are established automatically and refined further through interactive editing. Second, a line‐based method which uses correspondences between line segments is proposed. Here, the geometry of line segments in vector form is transformed into a set of rasterized values so that their combination at each pixel can restore their original vector geometry. Matching is performed on rasterized line segments and their matching lengths and displacements are measured. Experimental results show that the proposed line‐based approach evaluates the geometric quality of spatial data efficiently without requiring topological relationships among line features.

16.
The 2015 Middle East respiratory syndrome (MERS) outbreak in South Korea gave rise to chaos caused by psychological anxiety, and it has been assumed that people shared rumors about hospital lists through social media. Sharing rumors is a common form of public perception and risk communication among individuals during an outbreak. Social media analysis offers an important window into the spatiotemporal patterns of public perception and risk communication about disease outbreaks. Such processes of socially mediated risk communication are a process of meme diffusion. This article aims to investigate the role of social media meme diffusion and its spatiotemporal patterns in public perception and risk communication. To do so, we applied analytical methods including the daily number of tweets for metropolitan cities and geovisualization with the weighted mean centers. The spatiotemporal patterns shown by Twitter users' interests in specific places, triggered by real space events, demonstrate the spatial interactions among places in public perception and risk communication. Public perception and risk communication about places are relevant to both social networks and spatial proximity to where Twitter users live and are interpreted in reference to both Zipf's law and Tobler's law.
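The weighted mean center mentioned above is the weight-averaged coordinate of a set of locations. The sketch below assumes planar (projected) city coordinates and uses daily tweet counts as weights; the function name and input layout are illustrative and not taken from the article.

```python
def weighted_mean_center(points, weights):
    """Return the weighted mean center (x, y) of planar points.

    points: list of (x, y) tuples, e.g. projected city coordinates.
    weights: non-negative weights, e.g. daily tweet counts per city.
    """
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    x = sum(w * px for (px, _), w in zip(points, weights)) / total
    y = sum(w * py for (_, py), w in zip(points, weights)) / total
    return x, y
```

Computing this center once per day and plotting the sequence is one way to visualize how the focus of attention shifts between places over time.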

17.
ABSTRACT

Individual activity patterns are influenced by a wide variety of factors. The more important ones include socioeconomic status (SES) and urban spatial structure. While most previous studies relied heavily on the expensive travel-diary type data, the feasibility of using social media data to support activity pattern analysis has not been evaluated. Despite the various appealing aspects of social media data, including low acquisition cost and relatively wide geographical and international coverage, these data also have many limitations, including the lack of background information of users, such as home locations and SES. A major objective of this study is to explore the extent to which Twitter data can be used to support activity pattern analysis. We introduce an approach to determine users’ home and work locations in order to examine the activity patterns of individuals. To infer the SES of individuals, we incorporate the American Community Survey (ACS) data. Using Twitter data for Washington, DC, we analyzed the activity patterns of Twitter users with different SESs. The study clearly demonstrates that while SES is highly important, the urban spatial structure, particularly where jobs are mainly found and the geographical layout of the region, plays a critical role in affecting the variation in activity patterns between users from different communities.

18.
In integration of road maps modeled as road vector data, the main task is matching pairs of objects that represent, in different maps, the same segment of a real-world road. In an ad hoc integration, the matching is done for a specific need and, thus, is performed in real time, where only a limited preprocessing is possible. Usually, ad hoc integration is performed as part of some interaction with a user and, hence, the matching algorithm is required to complete its task in time that is short enough for human users to provide feedback to the application, that is, in no more than a few seconds. Such interaction is typical of services on the World Wide Web and to applications in car-navigation systems or in handheld devices.

Several algorithms were proposed in the past for matching road vector data; however, these algorithms are not efficient enough for ad hoc integration. This article presents algorithms for ad hoc integration of maps in which roads are represented as polylines. The main novelty of these algorithms is in using only the locations of the endpoints of the polylines rather than trying to match whole lines. The efficiency of the algorithms is shown both analytically and experimentally. In particular, these algorithms do not require the existence of a spatial index, and they are more efficient than an alternative approach based on using a grid index. Extensive experiments using various maps of three different cities show that our approach to matching road networks is efficient and accurate (i.e., it provides high recall and precision).

General Terms: Algorithms, Experimentation
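The abstract says only that the algorithms use the endpoint locations of polylines and need no spatial index; it does not describe the matching rule. The sketch below therefore uses a stand-in rule, mutual nearest neighbours within a distance threshold, to show what endpoint-only matching can look like. The function name `match_endpoints`, the parameter `max_distance` and the brute-force search are assumptions, not the article's algorithms.

```python
import math


def match_endpoints(endpoints_a, endpoints_b, max_distance):
    """Pair endpoints from two maps that are mutual nearest neighbours.

    endpoints_a, endpoints_b: lists of (x, y) polyline endpoints, one list
        per map, in a shared planar coordinate system (assumption).
    max_distance: pairs farther apart than this are never matched
        (hypothetical threshold).
    Returns a list of (index_in_a, index_in_b) pairs.
    """
    if not endpoints_a or not endpoints_b:
        return []

    def nearest(point, candidates):
        # Brute-force scan: no spatial index is built.
        return min(
            range(len(candidates)),
            key=lambda k: math.dist(point, candidates[k]),
        )

    matches = []
    for i, point in enumerate(endpoints_a):
        j = nearest(point, endpoints_b)
        close_enough = math.dist(point, endpoints_b[j]) <= max_distance
        if close_enough and nearest(endpoints_b[j], endpoints_a) == i:
            matches.append((i, j))
    return matches
```

Working on endpoints alone keeps the input small compared with matching whole polylines, which is the property the abstract credits for the short response times needed in ad hoc integration.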

19.
As important forcing data for hydrologic models, precipitation has significant effects on model simulation. The China Meteorological Forcing Dataset (ITP) and Global Land Data Assimilation System (G...

20.
Mapping ecosystem services (ES) over large scales is important for environmental monitoring but is often prohibitively expensive and difficult. We test a hybrid, low-cost method of mapping ES indicators over large scales in Pará State, Brazil. Four ES indicators (vegetation carbon stocks, biodiversity index, soil chemical quality index and rates of water infiltration into soil) were measured in the field and then summarized spatially for regional land-cover classes derived from satellite imagery. The regionally mapped ES values correlated strongly with independent and local measures of ES. For example, regional estimates of the vegetation carbon stocks are strongly correlated with actual measures derived from field samples and validation data (significant ANOVA test, p-value = 4.51e-9) and differed on average by only 20 Mg/ha from the field data. Our spatially-nested approach provides reliable and accurate maps of ES at both local and regional scales. Local maps account for the specificities of an area while regional maps provide an accurate generalization of an ES’ state. Such up-scaling methods infuse large-scale ES maps with localized data and enable the estimation of uncertainty at regional scales. Our approach is a first step towards the spatial characterization of ES at large and potentially global scales.

Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司)  京ICP备09084417号