Similar literature (20 results)
1.
Wildlife ecologists frequently combine limited information on the locations of a species of interest with readily available GIS data to build models that predict space use. In addition to the wide range of statistical data models that are more commonly used, machine learning approaches provide another means to develop predictive spatial models. However, output from these two families of models is not often compared for the same data set. It is important that wildlife managers understand the pitfalls and limitations when a single set of models is used with limited GIS data to try to predict and understand species distribution. To illustrate this, we fitted two sets of models (generalized linear mixed models (GLMMs) and boosted regression trees (BRTs)) to predict geographic occupancy of the eastern coyote (Canis latrans) on the island of Newfoundland, Canada. This exercise is illustrative of common spatial questions in wildlife research and management. Our results show that models vary depending on the approach (GLMM vs. BRT) and that, overall, BRT had higher predictive ability. Although machine learning has been criticized because it is not explicitly hypothesis-driven, it has been used in other areas of spatial modelling with success. Here, we demonstrate that it may be a useful approach for predicting wildlife space use and for generating hypotheses when data are limited. The results of this comparison can help to improve other models for species distributions and also guide future sampling and modelling initiatives.

2.
Regularized discriminant analysis has proven to be a most effective classifier for problems where traditional classifiers fail because of a lack of sufficient training samples, as is often the case in high-dimensional settings. However, it has been shown that the model selection procedure of regularized discriminant analysis, determining the degree of regularization, has some deficiencies associated with it. We propose a modified model selection procedure based on a new appreciation function. By means of an extensive simulation it was shown that the new model selection procedure performs better than the original one. We also propose that one of the control parameters of regularized discriminant analysis be allowed to take on negative values. This extension leads to an improved performance in certain situations. The results are confirmed using two chemical data sets.

3.
High performance computing is required for fast geoprocessing of geospatial big data. Using spatial domains to represent computational intensity (CIT) and domain decomposition for parallelism are prominent strategies when designing parallel geoprocessing applications. Traditional domain decomposition is limited in evaluating the computational intensity, which often results in load imbalance and poor parallel performance. From the data science perspective, machine learning from Artificial Intelligence (AI) shows promise for better CIT evaluation. This paper proposes a machine learning approach for predicting computational intensity, followed by an optimized domain decomposition, which divides the spatial domain into balanced subdivisions based on the predicted CIT to achieve better parallel performance. The approach provides a reference framework on how various machine learning methods, including feature selection and model training, can be used in predicting computational intensity and optimizing parallel geoprocessing against different cases. Comparative experiments between the approach and traditional methods were performed using two cases: DEM generation from point clouds and spatial intersection on vector data. The results not only demonstrate the advantage of the approach, but also provide hints on how traditional GIS computation can be improved by AI-based machine learning.

4.
One common problem with geographic data is that, for a specific geographic event, only occurrence information is available; information about the absence of the event is not available. We refer to these specific types of geospatial data as geographic one-class data (GOCD). Predicting from GOCD the potential spatial distribution over which a particular geographic event may occur is difficult because traditional binary classification methods, which require both positive and negative training samples, cannot be used. The objective of this research is to define GOCD and propose novel approaches for modelling potential spatial distributions of geographic events using GOCD. We investigate the effectiveness of one-class support vector machine (OCSVM), maximum entropy (MAXENT) and the newly proposed positive and unlabelled learning (PUL) algorithm for solving GOCD problems using a case study: species distribution modelling from synthetic data. Our experimental results indicate that generally OCSVM, MAXENT and PUL are effective in modelling the GOCD. Each method has advantages and disadvantages, but PUL seems to be the most promising method.

5.

With an increasing demand for raw materials, predictive models that support successful mineral exploration targeting are of great importance. We evaluated different machine learning techniques with an emphasis on boosting algorithms and implemented them in an ArcGIS toolbox. Performance was tested on an exploration dataset from the Iberian Pyrite Belt (IPB) with respect to accuracy, performance, stability, and robustness. Boosting algorithms are ensemble methods used in supervised learning for regression and classification. They combine weak classifiers, i.e., classifiers that perform slightly better than random guessing, to obtain robust classifiers. Each time a weak learner is added, the learning set is reweighted to give more importance to misclassified samples. Our test area, the IPB, is one of the oldest mining districts in the world and hosts giant volcanic-hosted massive sulfide (VMS) deposits. The spatial density of ore deposits, as well as the size and tonnage, makes the area unique, and due to the high data availability and number of known deposits, well-suited for testing machine learning algorithms. We combined several geophysical datasets, as well as layers derived from geological maps, as predictors of the presence or absence of VMS deposits. Boosting algorithms such as BrownBoost and AdaBoost were tested and compared to Logistic Regression (LR), Random Forests (RF) and Support Vector Machines (SVM) in several experiments. We found performance results relatively similar, especially to BrownBoost, which slightly outperformed LR and SVM with respective accuracies of 0.96 compared to 0.89 and 0.93. Data augmentation by perturbing deposit location led to a 7% improvement in results. Variations in the split ratio of training and test data led to a reduction in the accuracy of the prediction result with relative stability occurring at a critical point at around 26 training samples out of 130 total samples. When fewer training samples were used, accuracy dropped significantly. In comparison with other machine learning methods, AdaBoost is user-friendly due to relatively short training and prediction times, the low likelihood of overfitting and the reduced number of hyperparameters for optimization. Boosting algorithms gave high predictive accuracies, making them a potential data-driven alternative for regional scale and/or brownfields mineral exploration.
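
The reweighting step described in this abstract can be illustrated with the textbook discrete AdaBoost update. This is a minimal sketch and not the toolbox's implementation: the function name `adaboost_reweight` and the five-sample example are assumptions made for illustration only.

```python
import math


def adaboost_reweight(weights, misclassified):
    """One discrete AdaBoost round: return (alpha, new_weights).

    weights: current sample weights, assumed to sum to 1.
    misclassified: booleans, True where the weak learner was wrong.
    """
    # Weighted error of the weak learner on the current distribution.
    err = sum(w for w, m in zip(weights, misclassified) if m)
    if not 0.0 < err < 0.5:
        raise ValueError("weak learner needs a weighted error in (0, 0.5)")
    # Vote of this learner in the final ensemble.
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Up-weight mistakes, down-weight correct samples, then renormalize.
    raw = [w * math.exp(alpha if m else -alpha)
           for w, m in zip(weights, misclassified)]
    total = sum(raw)
    return alpha, [r / total for r in raw]


# Hypothetical example: five equally weighted samples, the last one misclassified.
weights = [0.2] * 5
misclassified = [False, False, False, False, True]
alpha, new_weights = adaboost_reweight(weights, misclassified)
```

After the update the misclassified sample carries half of the total weight, so the next weak learner is pushed to get it right.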


6.
In many higher education curricula, pre-structured step-by-step laboratory exercises in introductory courses in geographical information systems (GIS) are an important part of the training of future geographers. The reasons for this approach to teaching GIS are manifold, such as large numbers of students, off-the-shelf desktop software that is often complex, technical challenges, and scarce faculty resources. Often the reasons are well agreed upon by members of a university faculty and among the students. Research in other fields has shown that the use of a controlled manual for laboratory work often provides low learning potentials. However, not much empirical research has dealt with this issue within a GIS learning environment. Inspired by research on the value of student-generated questions within science education, the authors take a closer look at the type of student-generated questions and their relation to students' self-image of their learning approach in two pre-structured GIS laboratory settings at two Danish universities. They conclude that the vast majority of student-generated questions are of a basic information type and independent of the students' self-image of their learning approach. Further, it is found that wonderment questions, i.e. questions that are reflective in nature and show students the process towards acquiring extended geographical knowledge and software proficiency, are rarely asked.

7.
Seabed sediment textural parameters such as mud, sand and gravel content can be useful surrogates for predicting patterns of benthic biodiversity. Multibeam sonar mapping can provide near-complete spatial coverage of high-resolution bathymetry and backscatter data that are useful in predicting sediment parameters. Multibeam acoustic data collected across a ~1000 km² area of the Carnarvon Shelf, Western Australia, were used in a predictive modelling approach to map eight seabed sediment parameters. Four machine learning models were used for the predictive modelling: boosted decision tree, random forest decision tree, support vector machine and generalised regression neural network. The results indicate overall satisfactory statistical performance, especially for %Mud, %Sand, Sorting, Skewness and Mean Grain Size. The study also demonstrates that predictive modelling using the combination of machine learning models has provided the ability to generate prediction uncertainty maps. However, the single models were shown to have overall better prediction performance than the combined models. Another important finding was that choosing an appropriate set of explanatory variables, through a manual feature selection process, was a critical step for optimising model performance. In addition, machine learning models were able to identify important explanatory variables, which are useful in identifying underlying environmental processes and checking predictions against the existing knowledge of the study area. The sediment prediction maps obtained in this study provide reliable coverage of key physical variables that will be incorporated into the analysis of covariance of physical and biological data for this area.

8.
X. Yao, L.G. Tham, F.C. Dai. Geomorphology, 2008, 101(4): 572-582
The Support Vector Machine (SVM) is an increasingly popular learning procedure based on statistical learning theory, and involves a training phase in which the model is trained by a training dataset of associated input and target output values. The trained model is then used to evaluate a separate set of testing data. There are two main ideas underlying the SVM for discriminant-type problems. The first is an optimum linear separating hyperplane that separates the data patterns. The second is the use of kernel functions to convert the original non-linear data patterns into a format that is linearly separable in a high-dimensional feature space. In this paper, an overview of the SVM, both one-class and two-class SVM methods, is first presented, followed by its use in landslide susceptibility mapping. A study area was selected from the natural terrain of Hong Kong, and slope angle, slope aspect, elevation, profile curvature of slope, lithology, vegetation cover and topographic wetness index (TWI) were used as environmental parameters which influence the occurrence of landslides. One-class and two-class SVM models were trained and then each used to map landslide susceptibility. The resulting susceptibility maps were compared with the map obtained by the logistic regression (LR) method. It is concluded that two-class SVM possesses better prediction efficiency than logistic regression and one-class SVM. However, one-class SVM, which only requires failed cases, has an advantage over the other two methods as only “failed” case information is usually available in landslide susceptibility mapping.
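
The second idea in this abstract, that a kernel implicitly maps the data into a feature space where a linear separator exists, can be checked on a toy case. The sketch below is an illustration under assumptions and is not taken from the paper: it uses a degree-2 polynomial kernel and an XOR-like pattern, and the names `poly2_kernel` and `feature_map` are hypothetical.

```python
import math


def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel on 2-D inputs."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def feature_map(x):
    """Explicit feature space whose dot product equals poly2_kernel."""
    return (x[0] ** 2, math.sqrt(2.0) * x[0] * x[1], x[1] ** 2)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# XOR-like pattern: not linearly separable in the original 2-D space.
samples = {(1.0, 1.0): +1, (-1.0, -1.0): +1, (1.0, -1.0): -1, (-1.0, 1.0): -1}

# In feature space the second coordinate alone separates the two classes.
separable = all((feature_map(x)[1] > 0) == (label > 0)
                for x, label in samples.items())

# The kernel gives the feature-space dot product without building the map.
x, z = (1.0, 2.0), (3.0, -1.0)
kernel_value = poly2_kernel(x, z)
explicit_value = dot(feature_map(x), feature_map(z))
```

An SVM only ever needs the kernel values, which is why the feature space can be high-dimensional without the mapping being computed explicitly.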

9.
The statistical analysis of compositional data is of fundamental importance to practitioners in general and to chemists in particular. The existing methodology is principally due to Aitchison, who effectively uses two transformations, a ratio followed by the logarithmic, to create a useful, coherent theory that in principle allows the plethora of normal-based multivariate techniques to be used on the transformed data. This paper suggests that the well-known class of Box-Cox transformations can be employed in place of the logarithmic to significantly improve the existing methodology. This is supported in part by showing that one of the most basic problems that Aitchison managed to overcome, namely the specification of an interpretable covariance structure for compositional data, can be resolved, or nearly resolved, once the ratio transformation has been applied. Hence the resolution is not directly dependent on the logarithmic transformation. It is then verified that access to the general Box-Cox family will allow a more accurate use of the normal-based multivariate techniques, simply because better fits to normality can be achieved. Finally, maximum likelihood estimation and some associated asymptotics are employed to construct confidence intervals for ratios of the true, unknown compositional constituents. Heretofore this had not been done even in the context of the logarithmic transformation. Applications to real data are presented.
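
The two-step transformation mentioned in this abstract can be written out as follows. The notation is an assumption made for illustration (a composition $x = (x_1, \dots, x_D)$ with positive parts summing to one, the last part as the ratio denominator, and a Box-Cox parameter $\lambda$); the paper's own symbols may differ.

```latex
% Ratio step followed by the logarithm (additive log-ratio form):
y_i = \log\!\left(\frac{x_i}{x_D}\right), \qquad i = 1, \dots, D-1 .

% Same ratio step followed by a Box-Cox transformation instead:
y_i^{(\lambda)} =
\begin{cases}
  \dfrac{(x_i / x_D)^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[2ex]
  \log\!\left(\dfrac{x_i}{x_D}\right), & \lambda = 0 .
\end{cases}
```

The logarithmic case is recovered as $\lambda \to 0$, so the Box-Cox family contains the log-ratio approach as a special case.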

10.
SEXIA is an expert system that uses a new methodological approach to identify foods, particularly olive oils according to varieties, olive zones and denominations of origin. The methodological approach provides identification tools, associating a confidence degree or a belief interval to the final hypotheses. The certainty factor and the Dempster-Shafer theory, with some modifications, have been implemented in SEXIA. The computer can work with 50 chemical parameters whose data have previously been acquired by the food analyst via a dialogue in the Spanish language. The system has been verified with 144 olive oil samples. In this paper some results obtained for distinguishing the Arbequina variety from other varieties using SEXIA and the BMDP stepwise discriminant analysis program are presented. Finally, promising directions for future research are suggested.
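
A belief interval of the kind mentioned in this abstract can be obtained with the standard Dempster rule of combination. The sketch below shows the unmodified textbook rule, not SEXIA's modified version, and the mass values and the two-element frame are invented purely for illustration.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule for two mass functions keyed by frozenset hypotheses."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass falling on contradictory evidence
    if conflict >= 1.0:
        raise ValueError("the two sources are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {h: w / (1.0 - conflict) for h, w in combined.items()}


def belief(m, hypothesis):
    """Lower bound: mass of all focal sets contained in the hypothesis."""
    return sum(w for h, w in m.items() if h <= hypothesis)


def plausibility(m, hypothesis):
    """Upper bound: mass of all focal sets that intersect the hypothesis."""
    return sum(w for h, w in m.items() if h & hypothesis)


# Hypothetical evidence from two chemical parameters.
A, O = frozenset({"arbequina"}), frozenset({"other"})
frame = A | O
m1 = {A: 0.6, frame: 0.4}
m2 = {A: 0.5, O: 0.3, frame: 0.2}
m = combine(m1, m2)
interval = (belief(m, A), plausibility(m, A))
```

With these made-up masses the belief interval for the Arbequina hypothesis is roughly [0.76, 0.85]; the gap between the two bounds is the mass still assigned to the whole frame.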

11.
The geospatial sensor web is set to revolutionise real-time geospatial applications by making up-to-date spatially and temporally referenced data relating to real-world phenomena ubiquitously available. The uptake of sensor web technologies is largely being driven by the recent introduction of the OpenGIS Sensor Web Enablement framework, a standardisation initiative that defines a set of web service interfaces and encodings to task and query geospatial sensors in near real time. However, live geospatial sensors are capable of producing vast quantities of data over a short time period, which presents a large, fluctuating and ongoing processing requirement that is difficult to adequately provide with the necessary computational resources. Grid computing appears to offer a promising solution to this problem but its usage thus far has primarily been restricted to processing static as opposed to real-time data sets. A new approach is presented in this work whereby geospatial data streams are processed on grid computing resources. This is achieved by submitting ongoing processing jobs to the grid that continually poll sensor data repositories using relevant OpenGIS standards. To evaluate this approach a road-traffic monitoring application was developed to process streams of GPS observations from a fleet of vehicles. Specifically, a Bayesian map-matching algorithm is performed that matches each GPS observation to a link on the road network. The results show that over 90% of observations were matched correctly and that the adopted approach is capable of achieving timely results for a linear time geoprocessing operation performed every 60 seconds. However, testing in a production grid environment highlighted some scalability and efficiency problems. Open Geospatial Consortium (OGC) data services were found to present an IO bottleneck and the adopted job submission method was found to be inefficient. 
Consequently, a number of recommendations are made regarding the grid job-scheduling mechanism, shortcomings in the OGC Web Processing Service specification and IO bottlenecks in OGC data services.

12.
Resource estimation of a placer deposit is always a difficult and challenging job because of high variability in the deposit. The complexity of resource estimation increases when drill-hole data are sparse. Since sparsely sampled placer deposits produce high-nugget variograms, a traditional geostatistical technique like ordinary kriging sometimes fails to produce satisfactory results. In this article, a machine learning algorithm, the support vector machine (SVM), is applied to the estimation of a platinum placer deposit. A combination of different neighborhood samples is selected for the input space of the SVM model. The trade-off parameter of the SVM and the bandwidth of the kernel function are selected by genetic algorithm learning, and the algorithm is tested on a testing data set. Results show that if eight neighborhood samples and their distances and angles from the estimated point are considered as the input space for the SVM model, the developed model performs better than other configurations. The proposed input space-configured SVM model is compared with ordinary kriging and the traditional SVM model (location as input) for resource estimation. Comparative results reveal that the proposed input space-configured SVM model outperforms the other two models.

13.
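Research on sustainable agricultural development in the Pearl River Delta   Total citations: 3 (self-citations: 0, citations by others: 3)
Huang Xiaoli (黄小黎). Tropical Geography (热带地理), 2004, 24(1): 23-27
The Pearl River Delta is one of China's best-known agricultural regions. In recent years, amid accelerating urbanization, its agricultural development has been constrained by a sharp decline in cultivated land, environmental pollution, and the falling quality of the agricultural labor force. To meet the demands of urbanization and make use of its regional advantages, agriculture in the region must develop sustainably. To that end, the region should speed up agricultural industrialization, build bases for commercial agricultural production, and develop ecological agriculture and tourism agriculture. Corresponding measures should also be taken: protecting agricultural land and the ecological environment, establishing a support system for agricultural industrialization, providing technical training for the agricultural labor force, and strengthening research on and application of agricultural science and technology.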

14.
The methods PARAFAC and three-way PLS are compared with respect to their ability to predict reversed-phase retention values. Special attention is paid to simple validatory tools, the meaning and use of which are explained. The simple validatory tools consist of percentages of explained variation in the training set and those that can be calculated with the use of markers. These markers are special (reference) solutes, the retention values of which are used to gain information about a new object for which predictions are wanted. Different validatory tools can be calculated with the use of these marker retention values: percentages of used variation and mean sum of squared residuals after applying the model to these marker retention values. The validatory tools are evaluated on their power to estimate their test set counterparts: the percentages of explained variation in the test set and mean sum of squared prediction errors in the test set. Two different data sets from reversed-phase chromatography are used to evaluate the validatory tools. The first data set has a high signal-to-noise ratio and is measured under the same measurement conditions. The second data set has a low signal-to-noise ratio and is measured under different measurement conditions. Some of the simple validatory tools seem to have relevance to their test set counterparts, even in the case of the second data set.

15.
Research on the diffusion of innovation has been dominated for many years by an approach to scientific reasoning developed most thoroughly and influentially by Hägerstrand. Recently there has been a shift away from this approach. However, this shift has not been previously noted in the literature. After presenting the characteristics of the two perspectives in terms of the instrumentalist and realist conceptions of science, this article provides some evidence for the shift and identifies several key issues for future research.

16.
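Current status and trends of research on GIS-T linear data models   Total citations: 13 (self-citations: 0, citations by others: 13)
Several linear data models already exist for GIS-T, but most of them have not yet been applied in practice, and some problems remain to be solved. This paper analyzes and discusses GIS-T linear data models, introduces several representative models, and compares and evaluates each model's support for lanes and temporal information. It identifies problems common to current GIS-T linear data models and proposes that three-dimensional GIS-T spatio-temporal data models are the future trend.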

17.
Urban structure types (UST) are an initial interest and basic instrument for monitoring, controlling and modeling tasks of urban planners and decision makers during ongoing urbanization processes. This study focuses on a method to classify UST from land cover (LC) objects, which were derived from high resolution satellite images. The topology of urban LC objects is analyzed by implementing neighborhood LC-graphs. Various graph measures are examined for their potential to distinguish between different UST, using the machine learning classifier random forest. Additionally, the influence of different parameter settings of the random forest model, the reduction of training samples, and the graph measure importance is analyzed. An independent test set is classified and validated, achieving an overall accuracy of 87%. It was found that the height of the building with the highest node degree has a strong impact on the classification result.

18.
This paper presents a new method (moving-windows) that optimizes diatom-based paleolimnological reconstructions of past environmental conditions from supra-regional training sets. The moving-window method identifies the best number of nearest neighbours (window size) from a merged supra-regional EDDI and local (MV) training set (n = 429) for each fossil diatom assemblage and the best type of transfer function (ML, WA-PLS) based on the error statistic of each transfer function (highest cross-validated R², lowest cross-validated average bias, maximum bias and RMSEP). At first we evaluated the moving-window approach by comparing measured TP-values with inferred TP-values using both the moving-window approach and the WA-PLS method. The relative errors of the moving-window approach were not significantly different for 208 samples that had an error <15 μg/l TP using the WA-PLS method. However, for the remaining 221 samples with errors >15 μg/l TP using the WA-PLS method, the moving-window approach significantly reduced the relative error of the inferred TP levels. Secondly, the moving-window approach was used to reconstruct epilimnetic total phosphorus (TP) for Lake Dudinghausen, Lake Rugensee, Lake Tiefer See and Lake Drewitzer See (Northern Germany) using both the supra-regional EDDI training set and a local training set from Northern Germany (MV training set). The moving-window inferred TP-levels of the four study lakes were compared with published reconstructed TP-values and with inferred TP-values based on the local MV training set. Overall, the moving-window and the published TP-trends agree well with each other. However, the moving-window reconstructions generally indicated lower TP-levels throughout the past ~5,000 to 12,000 years, including past maxima. Thus, the moving-window method seems to generate more realistic absolute TP levels due to the optimized window size (highest number of modern analogues, best error statistics). The identification of more realistic absolute historic TP-values is important for the validation of reference conditions for inland waters. This study also demonstrates that a robust local training set may, similar to moving-window training sets, also lead to reliable reconstructions, if the geological settings of the local training set lakes and the study lakes are similar.

19.
Spatial optimization problems, such as route selection, usually involve multiple, conflicting objectives relevant to locations. An ideal approach to solving such multiobjective optimization problems (MOPs) is to find an evenly distributed set of Pareto-optimal alternatives, which is capable of representing the possible trade-off among different objectives. However, these MOPs are commonly solved by combining the multiple objectives into a parametric scalar objective, in the form of a weighted sum function. It has been found that this method fails to produce a set of well-spread solutions because it disregards the concave part of the Pareto front. In order to overcome this ill-behaved nature, a novel adaptive approach is proposed in this paper. This approach seeks to provide an unbiased approximation of the Pareto front by tuning the search direction in the objective space according to the largest unexplored region until a set of well-distributed solutions is reached. To validate the proposed methodology, a case study on multiobjective routing has been performed using the Singapore road network with the support of GIS. The experimental results confirm the effectiveness of the approach.
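
The weakness of the weighted sum method noted in this abstract can be reproduced on a tiny example. The sketch below is an illustration under assumptions, not the paper's adaptive method: the three routes and their (cost, risk) values are hypothetical, and both objectives are minimized.

```python
def weighted_sum_choice(points, w):
    """Return the point minimizing w*f1 + (1-w)*f2 for a weight w in [0, 1]."""
    return min(points, key=lambda p: w * p[0] + (1.0 - w) * p[1])


def dominates(p, q):
    """True if p is at least as good as q in both objectives and differs from q."""
    return p[0] <= q[0] and p[1] <= q[1] and p != q


# Three hypothetical routes as (cost, risk); the middle one is a balanced
# compromise lying in the concave part of the Pareto front.
routes = [(0.0, 1.0), (0.6, 0.6), (1.0, 0.0)]

# All three routes are Pareto-optimal: none is dominated by another.
pareto = [p for p in routes if not any(dominates(q, p) for q in routes)]

# Sweep the weight from 0 to 1 and collect every route the weighted sum picks.
chosen = {weighted_sum_choice(routes, k / 100.0) for k in range(101)}
```

The balanced route always scores 0.6, while one of the two extreme routes scores at most 0.5 for any weight, so the sweep returns only the extremes even though all three routes are Pareto-optimal.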

20.
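Cellular automata are widely used to simulate urban and other geographic phenomena. The biggest problem in such simulations is how to determine the structure and parameters of the model. This paper proposes an intelligently optimized cellular automaton based on analytical learning. Building on a logistic regression model, it uses an intelligent analytical-learning method to search for the best parameters of the cellular automaton model. The method lets users control the influence weights of the spatial variables and thereby simulate different urban development patterns, which can provide an important reference for urban planning.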
