spatstat is an R library for the statistical analysis of spatial data, mainly spatial point patterns.It is one of the largest contributed packages available for R, containing about 1000 user-level functions (about 500 documented functions) and a 700-page manual. It is the result of over 10 years' development by leading researchers in spatial statistics.
Ultimately, spatstat will handle all the major kinds of spatial data: point patterns, regional data, and geostatistical data. Currently, the vast majority of the functions deal with spatial point patterns. (This is unlikely to change in the near future).
- spatial point patterns in two-dimensional space.
This is the main historical focus of spatstat. Very complicated datasets can be handled. The point patterns may be `marked' by real numbers (e.g. trees annotated with their diameters), categorical values (e.g. ants labelled by species), logical values (e.g. on/off), etc. The spatial region where the points are observed can have a very complicated shape (an arbitrary polygon or a binary pixel image mask). Patterns of many thousands of points can be analysed.- spatial point patterns in 3-dimensional or higher-dimensional space.
The package supports point pattern datasets in any number of dimensions. At present, it has a modest capability to analyse 3-dimensional point pattern datasets. It supports multi-dimensional space-time point patterns in any number of dimensions. Further capabilities are under development.- spatial regions in two-dimensional space.
A region of space (such as the survey region for a geological or ecological survey) can be delimited by a polygon, or several polygons. Alternatively it can be specified by a binary pixel image.- pixel images in two-dimensional space.
- spatial patterns of line segments in two-dimensional space.
A list of line segments of finite length inside a region (for example, a map of geological faults in a survey region).- tessellations in two-dimensional space.
A tessellation is a division of space into non-overlapping regions, such as the states of the USA or the departements of France. Tessellations may be specified by polygonal boundaries or by pixel labels in a pixel image.
spatstat is designed to support a complete statistical analysis of a spatial point pattern dataset. It contains functions for
- data handling
- exploratory data analysis
- model-fitting
- simulation
- spatial sampling
- model diagnostics
- formal inference
Currently spatstat uses frequentist methods (maximum likelihood and approximations to maximum likelihood) to fit a wide variety of point process models. It can fit Poisson point process models, Gibbs point process models and random cluster process models to a point pattern dataset. The models can be spatially homogeneous, or inhomogeneous, with the spatial trend modelled as a function of the cartesian coordinates, and/or a function of spatial covariates. Gibbs models may include interpoint interaction (clustering or repulsion) and dependence on marks.Gibbs point process models are fitted by the method of maximum pseudolikelihood or by the Ogata-Huang approximation to maximum likelihood. The user interface is a function ppm similar to the R functions lm or glm, which uses a formula to describe the spatial inhomogeneity and the dependence on covariates or marks. Fitted Gibbs models can be simulated automatically.
Cluster process models are fitted by the method of minimum contrast. The implementation is experimental and will change in version 2 of the package.
Yes, and much more. spatstat provides facilities for formal inference (such as hypothesis tests) and informal inference (such as residual plots).If you want to formally test the hypothesis of Complete Spatial Randomness you can do this using the chi-squared test based on quadrat counts (quadrat.test), the Kolmogorov-Smirnov test based on values of a covariate (kstest), graphical Monte Carlo tests based on simulation envelopes of the K function (envelope), or the likelihood ratio test for parametric models (anova.ppm). You can also inspect the residuals from the uniform Poisson process model using diagnose.ppm.
spatstat provides similar facilities for checking many other point process models. The chi-squared test based on quadrat counts is available for any inhomogeneous Poisson process model fitted to data. Monte Carlo tests based on simulation envelopes are available for any fitted Gibbs model. Residuals and diagnostics are available for any fitted Gibbs model.
See the Spatial Task View on CRAN for an overview of packages for analysing spatial data. Some packages listed there are intended to provide basic data analysis capabilities, file conversion etc. spatstat is focused on higher-level statistical analysis of spatial data. It is based on statistical principles, and follows the standard practice and terminology of spatial statistics. It is intended to contain tools for performing all the standard, classical analyses of spatial point pattern data (such as the K-function) as well as a wide selection of modern methods (e.g. residual diagnostics). spatstat is not derived from, or dependent on, or superseded by, any other package in spatial statistics. Several packages depend or suggest using spatstat; for the current list, see the Reverse Depends: and Reverse Suggests: lists on the CRAN page for spatstat
This depends on the memory available to R. As a rough guide, if you are using a typical laptop with 2Gb of RAM, the following are feasible:
Plotting a point pattern ..... over 1 million points Exploratory analysis (K function, etc) ..... over 100,000 points Model-fitting ..... 20,000 points (unmarked) Complete analysis ..... about 10,000 points We are currently working on raising the practical limit on the number of points. Recent versions of spatstat (version 1.20-0 and later) can fit Gibbs point process models to patterns of 20,000 points. We are currently working on datasets with 300,000 to 1 million points.
This is possible in the latest version of spatstat (version 1.18-0 and later). However, at the moment, spatstat has limited capabilities to analyse such data. See the MarkedPointProcess package for additional capabilities.
The different curves are different estimates of the K-function (computed by different edge correction techniques) together with the theoretical K-function for a completely random pattern. For more detailed information, read this explanation
To control the range of r values, use the argument xlim as in plot(Kest(X), xlim=c(0, 7)). See help(plot.fv).The default range of r values that is plotted depends on the `default plotting range' of the object (of class 'fv') returned by Kest.
The default r values for Kest are computed as follows:
I should perhaps also point out that when you plot the K function, the range of r values that is plotted depends on the `default plotting range' of the object (of class 'fv') returned by Kest. To override this, add the argument `xlim' to the plot command.
When spatstat is first loaded, the default pixel dimensions are 100 x 100 for all of the above commands except predict.ppm, which has a default of 40 x 40. You can reset the default pixel dimensions by the command spatstat.options(npixel=c(nx, ny)) where nx, ny are the number of pixels in the x and y directions respectively. This does not apply to predict.ppm. Each of the commands (a)-(c) has an argument that controls the pixel dimensions in that particular case. (a) for density(X) where X is a point pattern density(X, dimyx=c(ny, nx)) (b) for as.im(f, W) where f is a number or function and W is a window M <- as.mask(W, dimyx=c(ny,nx)) as.im(X, M) (c) for predict(obj) where obj is a fitted model (class "ppm") predict(obj, ngrid=c(nx, ny)) The creation of new pixel grids is done by as.mask(). See help(as.mask) for explanation of the arguments dimyx = pixel dimensions = c(ny, nx) xy = pixel grid coordinates = list(x, y)
First, convert each of the regions into a separate owin object. Then apply union.owin to combine them. For more information, see the package vignette on shapefiles.