Similar Literature
20 similar documents found (search time: 31 ms)
1.
Online representations of places are becoming pivotal in informing our understanding of urban life. Content production on online platforms is grounded in the geography of their users and their digital infrastructure. These constraints shape place representation, that is, the amount, quality, and type of digital information available in a geographic area. In this article we study the place representation of user-generated content (UGC) in Los Angeles County, relating the spatial distribution of the data to its geo-demographic context. Adopting a comparative and multi-platform approach, this quantitative analysis investigates the spatial relationship between four diverse UGC datasets and their context at the census tract level (about 685,000 geo-located tweets, 9,700 Wikipedia pages, 4 million OpenStreetMap objects, and 180,000 Foursquare venues). The context includes the ethnicity, age, income, education, and deprivation of residents, as well as public infrastructure. An exploratory spatial analysis and regression-based models indicate that the four UGC platforms possess distinct geographies of place representation. To a moderate extent, the presence of Twitter, OpenStreetMap, and Foursquare data is influenced by population density, ethnicity, education, and income. However, each platform responds to different socio-economic factors and clusters emerge in disparate hotspots. Unexpectedly, Twitter data tend to be located in denser, more deprived areas, and the geography of Wikipedia appears peculiar and harder to explain. These trends are compared with previous findings for the area of Greater London.
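The abstract does not specify the regression form. As a rough illustration only, the following minimal sketch (Python, statsmodels) relates tract-level UGC counts to census covariates with a negative binomial GLM; the file name and column names are hypothetical.

```python
# Illustrative sketch (not the authors' code): relate tract-level UGC counts
# to socio-economic context with a count-regression model.
import pandas as pd
import statsmodels.api as sm

# Hypothetical tract-level table: one row per census tract.
tracts = pd.read_csv("la_tracts_ugc.csv")  # assumed columns below

covariates = ["pop_density", "median_income", "pct_bachelor",
              "pct_white", "median_age"]
X = sm.add_constant(tracts[covariates])
y = tracts["tweet_count"]  # could also be wiki_pages, osm_objects, venues

# A negative binomial GLM handles the over-dispersion typical of UGC counts.
model = sm.GLM(y, X, family=sm.families.NegativeBinomial()).fit()
print(model.summary())
```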

2.
Rapid flood mapping is critical for local authorities and emergency responders to identify areas in need of immediate attention. However, traditional data collection practices such as remote sensing and field surveying often fail to offer timely information during or right after a flooding event. Social media such as Twitter have emerged as a new data source for disaster management and flood mapping. Using the 2015 South Carolina floods as a case study, this paper introduces a novel approach to mapping the flood in near real time by leveraging Twitter data in geospatial processes. Specifically, in this study, we first analyzed the spatiotemporal patterns of flood-related tweets using quantitative methods to better understand how Twitter activity is related to flood phenomena. Then, a kernel-based flood mapping model was developed to map the flooding possibility for the study area based on the water height points derived from tweets and stream gauges. The identified patterns of Twitter activity were used to assign the weights of flood model parameters. The feasibility and accuracy of the model were evaluated by comparing the model output with official inundation maps. Results show that the proposed approach could provide a consistent and comparable estimation of the flood situation in near real time, which is essential for improving situational awareness during a flooding event to support decision-making.
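The abstract does not give the kernel or weighting details; the following is a minimal, hypothetical sketch of a kernel-based surface built from weighted water-height points, using a Gaussian kernel and an arbitrary bandwidth rather than the paper's calibrated parameters.

```python
# Illustrative sketch of a kernel-based flood-possibility surface built from
# point observations (e.g. water heights derived from tweets and gauges).
# The bandwidth and weights here are placeholders, not the paper's values.
import numpy as np

def flood_surface(points, weights, xs, ys, bandwidth=500.0):
    """points: (n, 2) array of x/y coordinates; weights: (n,) water heights;
    xs, ys: 1-D arrays defining the output grid; bandwidth in map units."""
    gx, gy = np.meshgrid(xs, ys)
    surface = np.zeros_like(gx, dtype=float)
    for (px, py), w in zip(points, weights):
        d2 = (gx - px) ** 2 + (gy - py) ** 2
        surface += w * np.exp(-d2 / (2.0 * bandwidth ** 2))  # Gaussian kernel
    return surface / surface.max()  # normalise to a 0-1 "possibility" score

# Example: three observation points on a 1 km x 1 km grid
pts = np.array([[200.0, 300.0], [600.0, 650.0], [800.0, 200.0]])
w = np.array([1.2, 0.8, 2.1])
grid = flood_surface(pts, w, np.arange(0, 1000, 10), np.arange(0, 1000, 10))
```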

3.
With the rapid growth and popularity of mobile devices and location-aware technologies, online social networks such as Twitter have become an important data source for scientists to conduct geo-social network research. Non-personal accounts, spam users and junk tweets, however, pose severe problems for the extraction of meaningful information and the validation of any research findings on tweets or Twitter users. Therefore, the detection of such users is a critical and fundamental step for Twitter-related geographic research. In this study, we develop a methodological framework to: (1) extract user characteristics based on geographic, graph-based and content-based features of tweets; (2) construct a training dataset by manually inspecting and labeling a large sample of Twitter users; and (3) derive reliable rules and knowledge for detecting non-personal users with supervised classification methods. The extracted geographic characteristics of a user include maximum speed, mean speed, the number of different counties that the user has been to, and others. Content-based characteristics for a user include the number of tweets per month, the percentage of tweets with URLs or hashtags, and the percentage of tweets with emotions, detected with sentiment analysis. The extracted rules are theoretically interesting and practically useful. Specifically, the results show that geographic features, such as the average speed and frequency of county changes, can serve as important indicators of non-personal users. For non-spatial characteristics, the percentage of tweets with a high human factor index, the percentage of tweets with URLs, and the percentage of tweets with mentioned/replied users are the top three features in detecting non-personal users.
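As a rough illustration of step (3), here is a supervised classifier trained on per-user features. The feature names and training file are hypothetical, and the decision-tree choice is an assumption; the paper's actual classifier and features may differ.

```python
# Illustrative sketch: train a supervised classifier on per-user features
# (geographic + content-based) to separate personal from non-personal accounts.
# The labelled sample would come from manual inspection, as described above.
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier, export_text

users = pd.read_csv("labeled_twitter_users.csv")  # hypothetical training table
features = ["max_speed", "mean_speed", "n_counties",
            "tweets_per_month", "pct_url", "pct_hashtag", "pct_mention"]
X, y = users[features], users["is_nonpersonal"]

clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)
print(cross_val_score(clf, X, y, cv=5).mean())   # rough accuracy estimate
print(export_text(clf, feature_names=features))  # human-readable rules
```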

4.
The use of social media data in geographic studies has become common, yet the question of social media's validity in such contexts is often overlooked. Social media data suffers from a variety of biases and limitations; nevertheless, with a proper understanding of the drawbacks, these data can be powerful. As cities seek to become "smarter," they can potentially use social media data to creatively address the needs of their most vulnerable groups, such as ethnic minorities. However, questions remain unanswered regarding who uses these social networking platforms, how people use these platforms, and how representative social media data is of users' everyday lives. Using several forms of regression, I explore the relationships between a conventional data source (the U.S. Census) and a subset of Twitter data potentially representative of minority groups: tweets created by users with an account language other than English. A considerable amount of non-stationarity is uncovered, which should serve as a warning against sweeping statements regarding the demographics of users and where people prefer to post. Further, I find that precisely located Twitter data informs us more about the digital status of places and less about users' day-to-day travel patterns.

5.
The implementation of social network applications on mobile platforms has significantly elevated the activity of mobile social networking. Mobile social networking offers a channel for recording an individual's spatiotemporal behaviors when location-detecting capabilities of devices are enabled. It also facilitates the study of time geography on an individual level, which has previously suffered from a scarcity of georeferenced movement data. In this paper, we report on the use of georeferenced tweets to display and analyze the spatiotemporal patterns of daily user trajectories. For georeferenced tweets that have both location information (longitude and latitude) and a recorded creation time, we apply a space-time cube approach for visualization. Compared to the traditional methodologies for time geography studies such as the travel diary-based approach, the analytics using social media data present challenges broadly associated with those of Big Data, including the characteristics of high velocity, large volume, and heterogeneity. For this study, a batch processing system has been developed for extracting spatiotemporal information from each tweet and then creating trajectories of each individual mobile Twitter user. Using social media data in time geographic research has the benefits of study area flexibility, continuous observation and non-involvement with contributors. For example, during every 30-minute cycle, we collected tweets created by about 50,000 Twitter users living in a geographic region covering New York City to Washington, DC. Each tweet can indicate the exact location of its creator when the tweet was posted. Thus, the linked tweets show a Twitter user's movement trajectory in space and time. This study explores using data intensive computing for processing Twitter data to generate spatiotemporal information that can recreate the space-time trajectories of their creators.
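A minimal sketch of the trajectory-building step described above, assuming a table of geotagged tweets with hypothetical user_id, lon, lat and created_at columns; the batch-processing infrastructure is omitted.

```python
# Illustrative sketch: turn geotagged tweets into per-user space-time
# trajectories suitable for a space-time cube (x, y, t). Column names assumed.
import pandas as pd

tweets = pd.read_csv("geotagged_tweets.csv",
                     parse_dates=["created_at"])  # user_id, lon, lat, created_at

trajectories = (
    tweets.sort_values("created_at")
          .groupby("user_id")[["lon", "lat", "created_at"]]
          .apply(lambda g: list(zip(g.lon, g.lat, g.created_at)))
)
# trajectories[user] is an ordered list of (lon, lat, time) vertices that can
# be rendered as a polyline in a space-time cube (time as the vertical axis).
```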

6.
User interaction in social networks, such as Twitter and Facebook, is increasingly becoming a source of useful information on daily events. The online monitoring of short messages posted in such networks often provides insight on the repercussions of events of several different natures, such as (in the recent past) the earthquake and tsunami in Japan, the royal wedding in Britain and the death of Osama bin Laden. Studying the origins and the propagation of messages regarding such topics helps social scientists in their quest for improving the current understanding of human relationships and interactions. However, the actual location associated with a tweet or a Facebook message can be rather uncertain. Some tweets are posted with an automatically determined location (from an IP address), or with a user-informed location, both in text form, usually the name of a city. We observe that most Twitter users opt not to publish their location, and many do so in a cryptic way, mentioning non-existing places or providing less specific place names (such as "Brazil"). In this article, we focus on the problem of enriching the location of tweets using alternative data, particularly the social relationships between Twitter users. Our strategy involves recursively expanding the network of locatable users using following-follower relationships. Verification is achieved using cross-validation techniques, in which the location of a fraction of the users with known locations is used to determine the location of the others, thus allowing us to compare the actual location to the inferred one and verify the quality of the estimation. With an estimate of the precision of the method, it can then be applied to locationless tweets. Our intention is to infer the location of as many users as possible, in order to increase the number of tweets that can be used in spatial analyses of social phenomena. The article demonstrates the feasibility of our approach using a dataset comprising tweets that mention keywords related to dengue fever, increasing by 45% the number of locatable tweets.
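A simplified, hypothetical sketch of the recursive-expansion idea: unlocated users inherit the most common location among their located connections, and repeated rounds expand the locatable network. The paper's actual inference and validation procedure is more elaborate.

```python
# Illustrative sketch of location inference from social ties; data structures
# are assumptions, not the paper's implementation.
def propagate_locations(friends, known, max_rounds=5):
    """friends: dict user -> set of connected users;
    known: dict user -> city string for users with a usable location."""
    locations = dict(known)
    for _ in range(max_rounds):
        newly_located = {}
        for user, links in friends.items():
            if user in locations:
                continue
            cities = [locations[f] for f in links if f in locations]
            if cities:
                newly_located[user] = max(set(cities), key=cities.count)
        if not newly_located:
            break
        locations.update(newly_located)
    return locations

# Cross-validation idea: hide the locations of a sample of known users,
# re-run propagate_locations, and compare inferred vs. true cities.
```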

7.
Social media messages, such as tweets, are frequently used by people during natural disasters to share real-time information and to report incidents. Within these messages, geographic locations are often described. Accurate recognition and geolocation of these locations are critical for reaching those in need. This article focuses on the first part of this process, namely recognizing locations from social media messages. While general named entity recognition tools are often used to recognize locations, their performance is limited due to the various language irregularities associated with social media text, such as informal sentence structures, inconsistent letter cases, name abbreviations, and misspellings. We present NeuroTPR, which is a Neuro-net ToPonym Recognition model designed specifically with these linguistic irregularities in mind. Our approach extends a general bidirectional recurrent neural network model with a number of features designed to address the task of location recognition in social media messages. We also propose an automatic workflow for generating annotated data sets from Wikipedia articles for training toponym recognition models. We demonstrate NeuroTPR by applying it to three test data sets, including a Twitter data set from Hurricane Harvey, and comparing its performance with those of six baseline models.

8.
Residential locations play an important role in understanding the form and function of urban systems. However, it is impossible to release this detailed information publicly, due to privacy concerns. The rapid development of location-based services and the prevalence of global positioning system (GPS)-equipped devices provide an unprecedented opportunity to infer residential locations from user-generated geographic information. This article compares different approaches for predicting Twitter users' home locations at a precise point level based on temporal and spatial features extracted from geo-tagged tweets. Among the three deterministic approaches, the one that estimates the home location for each user by finding the weighted most frequently visited (WMFV) cluster of that user always provides the best performance when compared with the other two methods. The results of a fourth approach, based on the support vector machine (SVM), are severely affected by the threshold value for a cluster to be identified as the home.
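A hypothetical sketch of a weighted most-frequently-visited (WMFV) style estimate: snap visited points to a grid and weight night-time observations more heavily. The paper's actual clustering and weighting scheme is not reproduced here.

```python
# Illustrative sketch of a WMFV-style home-location estimate; the night-time
# weighting and grid size are assumptions, not the paper's parameters.
from collections import defaultdict

def wmfv_home(points, cell=0.001):
    """points: iterable of (lon, lat, hour); cell: grid size in degrees."""
    scores = defaultdict(float)
    for lon, lat, hour in points:
        key = (round(lon / cell) * cell, round(lat / cell) * cell)
        weight = 2.0 if (hour >= 20 or hour < 7) else 1.0  # assumed weights
        scores[key] += weight
    return max(scores, key=scores.get)  # centre of the top-scoring cell

print(wmfv_home([(-118.25, 34.05, 23), (-118.25, 34.05, 22), (-118.40, 34.10, 13)]))
```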

9.
This study presents a method to model population densities by using image texture statistics of semi-variance. In a case study of the City of Austin, Texas, we first selected sample census blocks of the same land use to build population models by land use. Regression analyses were conducted to infer the relationship between block population densities and image texture statistics of the semi-variance. We then applied the population models to an area of 251 blocks to estimate populations for within-block land-use areas while maintaining census block populations. To assess the proposed method, the same analysis was performed while census block-group populations were maintained, and the aggregated block populations were compared with the original census block populations. We also tested a conventional land-use-based dasymetric mapping method with pre-calculated population densities for land uses. The results show that our approach, which is based on initial land-use stratification and further image-texture statistical modeling of population, has higher accuracy statistics than the conventional land-use-based dasymetric mapping method.
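A minimal sketch of the texture statistic used above: the empirical semi-variance of an image band at a given lag, gamma(h) = 0.5 * E[(z(x) - z(x+h))^2]. The row-wise computation and the lag values are illustrative only.

```python
# Illustrative sketch: empirical semi-variance of an image band at lag h,
# computed along image rows; such texture statistics can serve as predictors
# in a block-level population-density model.
import numpy as np

def semivariance(band, lag=1):
    """band: 2-D array of pixel values; lag: horizontal offset in pixels."""
    diffs = band[:, lag:] - band[:, :-lag]
    return 0.5 * np.mean(diffs.astype(float) ** 2)

band = np.random.randint(0, 255, size=(100, 100))
print([semivariance(band, lag) for lag in (1, 2, 4, 8)])  # a simple variogram
```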

10.
Social media networks allow users to post what they are involved in, along with location information, in real time. It is therefore possible to collect large amounts of information related to local events from existing social networks. Mining this abundant information can provide users and organizations with situational awareness to make responsive plans for ongoing events. Despite the fact that a number of studies have been conducted to detect local events using social media data, the event content is not efficiently summarized and/or the correlation between abnormal neighboring regions is not investigated. This article presents a spatial-temporal-semantic approach to local event detection using geo-social media data. Geographical regularities are first measured to extract spatio-temporal outliers, whose corresponding tweet content is automatically summarized using topic modeling. The correlation between outliers is subsequently examined by investigating their spatial adjacency and semantic similarity. A case study on the 2014 Toronto International Film Festival (TIFF) is conducted using Twitter data to evaluate our approach. This reveals that up to 87% of the events detected are correctly identified compared with the official TIFF schedule. This work helps authorities keep track of urban dynamics and supports the building of smart cities by providing new ways of detecting what is happening in them.
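A hypothetical sketch of the two steps described: a simple z-score test to flag spatio-temporal outlier cells, followed by LDA topic modelling to summarise the outliers' tweet content. Thresholds and parameters are assumptions, not the paper's values.

```python
# Illustrative sketch: flag grid cells whose current tweet volume deviates
# strongly from their historical mean, then summarise their tweets with LDA.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

def find_outlier_cells(counts_now, counts_history, z_threshold=3.0):
    """counts_now: dict cell -> current count; counts_history: dict cell -> list."""
    outliers = []
    for cell, history in counts_history.items():
        mu, sigma = np.mean(history), np.std(history) + 1e-9
        if (counts_now.get(cell, 0) - mu) / sigma > z_threshold:
            outliers.append(cell)
    return outliers

def summarise(texts, n_topics=3, n_words=5):
    """Return the top words of each LDA topic for the outlier cells' tweets."""
    vec = CountVectorizer(stop_words="english", max_features=2000)
    dtm = vec.fit_transform(texts)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0).fit(dtm)
    vocab = vec.get_feature_names_out()
    return [[vocab[i] for i in topic.argsort()[-n_words:]] for topic in lda.components_]
```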

11.
Blogs, micro-blogs and online forums underpin a more interconnected world. People communicate ever more and are increasingly keen to explain and illustrate their lives, showing where they are and what they are doing. Desktop, online and mobile mapping landscapes have never been as rich or diverse, yet this challenges cartography to adapt and remain relevant in the modern mapping world. We explore the spatial expression and potential value of micro-blogging and Twitter as a social networking tool. Examples of "Twitter maps" are reviewed that leverage the Twitter API and online map services to locate some component of the "tweet". Scope, function and design are illustrated through development of two proof-of-concept map mashups that support collaborative real-time mapping and the organisation and display of information for mass user events. Through the experiments in using and organising data in this way we demonstrate the value of "cartoblography", a framework for mapping the spatial context of micro-blogging.

12.
Although Twitter is used for emergency management activities, the relevance of tweets during a hazard event is still open to debate. In this study, six different computational (i.e. Natural Language Processing) and spatiotemporal analytical approaches were implemented to assess the relevance of risk information extracted from tweets obtained during the 2013 Colorado flood event. Primarily, tweets containing information about the flooding event and its impacts were analysed. Examination of the relationships of tweet volume and content with precipitation amount, damage extent, and official reports revealed that relevant tweets provided information about the event and its impacts rather than the other risk information that the public expects to receive via alert messages. However, only 14% of the geo-tagged tweets and only 0.06% of the total firehose tweets were found to be relevant to the event. By providing insight into the quality of social media data and its usefulness to emergency management activities, this study contributes to the literature on the quality of big data. Future research in this area would focus on assessing the reliability of relevant tweets for disaster-related situational awareness.

13.
SensePlace3 (SP3) is a geovisual analytics framework and web application that supports overview + detail analysis of social media, focusing on extracting meaningful information from the Twitterverse. SP3 leverages social media related to crisis events. It differs from most existing systems by enabling an analyst to obtain place-relevant information from tweets that have implicit as well as explicit geography. Specifically, SP3 includes not just the ability to utilize the explicit geography of geolocated tweets but also the ability to analyze implicit geography by recognizing and geolocating references both in tweet text, which indicate locations tweeted about, and in Twitter profiles, which indicate locations affiliated with users. Key features of SP3 reported here include flexible search and filtering capabilities to support information foraging; an ingest, processing, and indexing pipeline that produces near real-time access for big streaming data; and a novel strategy for implementing a web-based multi-view visual interface with dynamic linking of entities across views. The SP3 system architecture was designed to support crisis management applications, but its design flexibility makes it easily adaptable to other domains. We also report on a user study that provided input to SP3 interface design and suggests next steps for effective spatiotemporal analytics using social media sources.

14.
Mapping built land cover at unprecedented detail has been facilitated by the increasing availability of global high-resolution imagery and image processing methods. These advances in urban feature extraction and built-area detection can refine the mapping of human population densities, especially in lower income countries where rapid urbanization and changing population is accompanied by frequently out-of-date or inaccurate census data. However, in these contexts it is unclear how best to use built-area data to disaggregate areal, count-based census data. Here we tested two methods using remotely sensed, built-area land cover data to disaggregate population data. These included simple areal weighting and more complex statistical models with other ancillary information. Outcomes were assessed across eleven countries, representing different world regions varying in population densities, types of built infrastructure, and environmental characteristics. We found that for seven of the 11 countries a Random Forest-based machine learning approach outperforms simple binary dasymetric disaggregation into remotely sensed built areas. For these more complex models there was little evidence to support using any single built land cover input over the rest, and in most cases using more than one built-area data product resulted in higher predictive capacity. We discuss these results and implications for future population modeling approaches.
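A minimal sketch of the simpler of the two methods, binary dasymetric (areal-weighting) disaggregation into built-up pixels; the inputs are hypothetical arrays. The Random Forest alternative would instead predict per-pixel density from several ancillary layers.

```python
# Illustrative sketch of binary dasymetric disaggregation: distribute each
# census unit's population only across its built-up pixels.
import numpy as np

def binary_dasymetric(population, built_mask, unit_ids):
    """population: dict unit_id -> count; built_mask: 2-D boolean array;
    unit_ids: 2-D array assigning each pixel to a census unit."""
    density = np.zeros(built_mask.shape, dtype=float)
    for unit, pop in population.items():
        cells = (unit_ids == unit) & built_mask
        n = cells.sum()
        if n:                          # spread the unit's population evenly
            density[cells] = pop / n   # over its built-up pixels only
    return density
```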

15.
Massive social media data produced from microblog platforms provide a new data source for studying human dynamics at an unprecedented scale. Meanwhile, population bias in geotagged Twitter users is widely recognized. Understanding the demographic and socioeconomic biases of Twitter users is critical for making reliable inferences on the attitudes and behaviors of the population. However, the existing global models cannot capture the regional variations of the demographic and socioeconomic biases. To bridge the gap, we modeled the relationships between different demographic/socioeconomic factors and geotagged Twitter users for the whole contiguous United States, aiming to understand how the demographic and socioeconomic factors relate to the number of Twitter users at the county level. To effectively identify the local Twitter users for each county of the United States, we integrate three commonly used methods and develop a query approach in a high-performance computing environment. The results demonstrate that we can not only identify how the demographic and socioeconomic factors relate to the number of Twitter users, but can also measure and map how the influence of these factors varies across counties.

16.
Widespread use of social media during crises has become commonplace, as shown by the volume of messages during the Haiti earthquake of 2010 and Japan tsunami of 2011. Location mentions are particularly important in disaster messages as they can show emergency responders where problems have occurred. This article explores the sorts of locations that occur in disaster-related social messages, how well off-the-shelf software identifies those locations, and what is needed to improve automated location identification, called geo-parsing. To do this, we have sampled Twitter messages from the February 2011 earthquake in Christchurch, Canterbury, New Zealand. We annotated locations in messages manually to make a gold standard by which to measure locations identified by Named Entity Recognition software. The Stanford NER software found some locations that were proper nouns, but did not identify locations that were not capitalized, local streets and buildings, or non-standard place abbreviations and misspellings that are plentiful in microtext. We review how these problems might be solved in software research, and model a readable crisis map that shows crisis location clusters via enlarged place labels.
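A minimal sketch of the evaluation step against a gold standard: exact, case-insensitive matching of predicted and annotated location strings, reported as precision, recall and F1. The study's actual matching rules may be more lenient.

```python
# Illustrative sketch: score NER-detected locations against manually annotated
# (gold-standard) locations, one set of location strings per message.
def evaluate(predicted, gold):
    """predicted, gold: lists (one entry per tweet) of sets of location strings."""
    tp = fp = fn = 0
    for pred, true in zip(predicted, gold):
        pred = {p.lower() for p in pred}
        true = {t.lower() for t in true}
        tp += len(pred & true)
        fp += len(pred - true)
        fn += len(true - pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```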

17.
This paper describes techniques to compute and map dasymetric population densities and to areally interpolate census data using dasymetrically derived population weights. These techniques are demonstrated with 1980-2000 census data from the 13-county Atlanta metropolitan area. Land-use/land-cover data derived from remotely sensed satellite imagery were used to determine the areal extent of populated areas, which in turn served as the denominator for dasymetric population density computations at the census tract level. The dasymetric method accounts for the spatial distribution of population within administrative areas, yielding more precise population density estimates than the choroplethic method, while graphically representing the geographic distribution of populations. In order to areally interpolate census data from one set of census tract boundaries to another, the percentages of populated areas affected by boundary changes in each affected tract were used as adjustment weights for census data at the census tract level, where census tract boundary shifts made temporal data comparisons difficult. This method of areal interpolation made it possible to represent three years of census data (1980, 1990, and 2000) in one set of common census tracts (1990). Accuracy assessment of the dasymetrically derived adjustment weights indicated a satisfactory level of accuracy. Dasymetrically derived areal interpolation weights can be applied to any type of geographic boundary re-aggregation, such as from census tracts to zip code tabulation areas, from census tracts to local school districts, from zip code areas to telephone exchange prefix areas, and for electoral redistricting.
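A minimal sketch of the two computations described above: (1) dasymetric density as tract population over populated area only, and (2) areal-interpolation weights as each target tract's share of the source tract's populated area. All numbers are made up for illustration.

```python
# Illustrative sketch of dasymetric density and populated-area interpolation weights.
import pandas as pd

tracts = pd.DataFrame({
    "tract": ["A", "B"],
    "population": [4000, 2500],
    "total_area_km2": [10.0, 8.0],
    "populated_area_km2": [4.0, 2.0],   # from land-use/land-cover data
})
tracts["dasymetric_density"] = tracts["population"] / tracts["populated_area_km2"]

# Hypothetical split of tract A's populated area between two target tracts
overlap_populated_km2 = {"A-1": 3.0, "A-2": 1.0}
total = sum(overlap_populated_km2.values())
weights = {k: v / total for k, v in overlap_populated_km2.items()}
pop_a = int(tracts.loc[tracts.tract == "A", "population"].iloc[0])
reallocated = {k: w * pop_a for k, w in weights.items()}  # {'A-1': 3000.0, 'A-2': 1000.0}
```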

18.
China's social media platform, Sina Weibo, like Twitter, hosts a considerable amount of big data: messages, comments, and pictures. Collecting and analyzing information from this treasury of human behavior data is a challenge, although the message exchange on the network is readable by everyone through the web or app interface. The official Application Programming Interface (API) is the gateway to access and download public content from Sina Weibo and is used to collect messages for all of mainland China. The nearby_timeline() request is used to harvest only messages with associated location information. This technical note serves as a reference for researchers who do not speak Mandarin but want to collect data from this rich source of information. Options for data visualization are presented: as a point cloud, as density per areal unit, or as clusters produced with Density-Based Spatial Clustering of Applications with Noise (DBSCAN). The relationship of messages to census information is also discussed.
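A minimal sketch of the DBSCAN step using scikit-learn with a haversine metric; the coordinates and the roughly 500 m eps are hypothetical, and the data collection via nearby_timeline() is not shown.

```python
# Illustrative sketch: cluster geotagged message coordinates with DBSCAN.
# eps is expressed in radians (metres / Earth radius); parameters are assumptions.
import numpy as np
from sklearn.cluster import DBSCAN

coords = np.array([[39.9042, 116.4074],   # lat, lon in degrees (hypothetical)
                   [39.9050, 116.4080],
                   [31.2304, 121.4737],
                   [31.2310, 121.4740]])

earth_radius_m = 6_371_000
eps_m = 500
labels = DBSCAN(eps=eps_m / earth_radius_m, min_samples=2,
                metric="haversine").fit_predict(np.radians(coords))
print(labels)   # messages in the same dense neighbourhood share a label
```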

19.
Data availability is a persistent constraint in social policy analysis. Web 2.0 technologies could provide valuable new data sources, but first, their potential and limitations need to be investigated. This paper reports on a method using Twitter data for deriving indications of active citizenship, taken as an example of a social indicator. Active citizenship is a dimension of social capital, empowering communities and reducing possibilities of social exclusion. However, classical measurements of active citizenship are generally costly and time-consuming. This paper looks at one such classic indicator, namely responses to the survey question 'contacts to politicians'. It compares official survey results in Spain with findings from an analysis of Twitter data. Each method presents its own strengths and weaknesses, so the best results may be achieved by combining the two. Official surveys have the clear advantage of being statistically robust and representative of the total population. In contrast, Twitter data offer more timely and less costly information, with higher spatial and temporal resolution. This paper presents our full methodological workflow for analysing and comparing these two data sources. The research results advance the debate on how social media data could be mined for policy analysis.

20.
Existing predictive mapping methods usually require a large number of field samples with good representativeness as input to build reliable predictive models. In mapping practice, however, we often face situations in which only small sample sets are available. In this article, we present a semi-supervised machine learning approach for predictive mapping in which the natural aggregation (clustering) patterns of environmental covariate data are used to supplement limited samples in prediction. This approach was applied to two soil mapping case studies. Compared with field-sample-only approaches (decision trees, logistic regression, and support vector machines), maps using the proposed approach can better capture the spatial variation of soil types and achieve higher accuracy with limited samples. A cross-validation further shows that the proposed approach is less sensitive to the specific field sample set used and is thus more robust when field sample data are small.
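A hypothetical cluster-then-label sketch in the spirit of the approach described (not the authors' algorithm): cluster the environmental covariates, then let the few labelled field samples vote within each cluster.

```python
# Illustrative sketch: use the natural clustering of covariates to extend a
# handful of labelled field samples to every map cell. Simplified stand-in.
import numpy as np
from sklearn.cluster import KMeans

def cluster_then_label(X_all, labeled, n_clusters=8):
    """X_all: covariate matrix for every map cell;
    labeled: dict {row index in X_all: observed soil class}."""
    assignments = KMeans(n_clusters=n_clusters, n_init=10,
                         random_state=0).fit_predict(X_all)
    predictions = np.full(len(X_all), -1, dtype=object)
    for c in range(n_clusters):
        members = np.where(assignments == c)[0]
        votes = [labeled[i] for i in members if i in labeled]
        if votes:  # majority vote of the field samples falling in this cluster
            predictions[members] = max(set(votes), key=votes.count)
    return predictions
```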
