spatstat is an R package for the statistical analysis of spatial data, mainly spatial point patterns. It is one of the largest contributed packages available for R, with about 300 user-level functions and a 500-page manual.
Mainly, spatial point patterns in two-dimensional space.
Very complicated datasets can be handled. The point patterns may be "marked" by real numbers (e.g. trees annotated with their diameters), categorical values (e.g. ants labelled by species), logical values (e.g. on/off), etc. The spatial region where the points are observed can have a very complicated shape (an arbitrary polygon or a binary pixel image mask). The point pattern data can be accompanied by other kinds of covariate data, such as a line segment pattern (e.g. map of geological faults) or a pixel image (e.g. map of terrain elevation). Patterns of many thousands of points can be analysed.
Currently, spatstat deals only with two-dimensional space, and does not handle 3D or space-time data. (This will change in version 2).
Ultimately, spatstat will handle all the major kinds of spatial data: point patterns, regional data, and geostatistical data. Currently, the vast majority of the functions deal with spatial point patterns. (This is unlikely to change in the near future).
spatstat is designed to support a complete statistical analysis of a spatial point pattern dataset. It contains functions for
- data handling
- exploratory data analysis
- model-fitting
- simulation
- spatial sampling
- model diagnostics
- formal inference
spatstat can fit Poisson point process models, Gibbs point process models and random cluster process models to a point pattern dataset. The models can be spatially homogeneous or inhomogeneous, with the spatial trend modelled as a function of the Cartesian coordinates and/or a function of spatial covariates. Gibbs models may include interpoint interaction (clustering or repulsion) and dependence on marks.
Gibbs point process models are fitted by the method of maximum pseudolikelihood or by the Huang-Ogata approximation to maximum likelihood. The user interface is a function ppm, similar to the R functions lm or glm, which uses a formula to describe the spatial inhomogeneity and the dependence on covariates or marks. Fitted Gibbs models can be simulated automatically.
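As a rough illustration of the formula interface, the sketch below fits an inhomogeneous Poisson model and a Strauss (Gibbs) model with ppm. The choice of the example dataset cells and the interaction radius 0.1 are assumptions made only for this illustration; they are not recommendations.

```r
library(spatstat)

# 'cells' is an example point pattern supplied with spatstat (assumed choice)
X <- cells

# Inhomogeneous Poisson model: log intensity is linear in the x coordinate
fit_pois <- ppm(X, ~x)

# Gibbs model: stationary Strauss process; r = 0.1 is an illustrative radius
fit_strauss <- ppm(X, ~1, Strauss(r = 0.1))

# Fitted coefficients, extracted in the same way as for lm or glm
print(coef(fit_pois))
print(coef(fit_strauss))
```

The trend formula has no left-hand side here; the point pattern is passed as the first argument, in the style described above.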
Cluster process models are fitted by the method of minimum contrast. The implementation is experimental and will change in version 2 of the package.
Yes, and much more. spatstat provides facilities for formal inference (such as hypothesis tests) and informal inference (such as residual plots).
If you want to formally test the hypothesis of Complete Spatial Randomness, you can do this using the chi-squared test based on quadrat counts (quadrat.test), the Kolmogorov-Smirnov test based on values of a covariate (ks.test.ppm), graphical Monte Carlo tests based on simulation envelopes of the K function (envelope), or the likelihood ratio test for parametric models (anova.ppm). You can also inspect the residuals from the uniform Poisson process model using diagnose.ppm.
spatstat provides similar facilities for checking many other point process models. The chi-squared test based on quadrat counts is available for any inhomogeneous Poisson process model fitted to data. Monte Carlo tests based on simulation envelopes are available for any fitted Gibbs model. Residuals and diagnostics are available for any fitted Gibbs model.
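A minimal sketch of two of these tests of Complete Spatial Randomness follows. The dataset swedishpines, the 3 x 3 grid of quadrats and the 39 simulations are assumptions chosen for the example.

```r
library(spatstat)

# 'swedishpines' is an example point pattern supplied with spatstat (assumed choice)
X <- swedishpines

# Chi-squared test of CSR based on counts in a 3 x 3 grid of quadrats
qt <- quadrat.test(X, nx = 3, ny = 3)
print(qt$p.value)

# Simulation envelopes of the K function under CSR (39 simulated patterns)
env <- envelope(X, Kest, nsim = 39)
plot(env)
```

If the estimated K function wanders outside the envelope, that is informal graphical evidence against CSR at the corresponding distances.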
(To be completed!)
We have carried out a complete analysis on an astronomical dataset containing 4300 points.
- Plotting a point pattern: over 1 million points
- Exploratory analysis (K function, etc.): 100,000 points
- Model-fitting: 5,000 points
- Complete analysis: over 4,000 points
Currently you can only attach a single mark variable (e.g. diameter) to each point; spatstat does not support multiple marks. This capability will be added in version 2. Currently the only package which supports analysis of such data is the MarkedPointProcess package.
The different curves are different estimates of the K-function (computed by different edge correction techniques), together with the theoretical K-function for a completely random pattern. For more detailed information, read this explanation.
To control the range of r values, use the argument xlim as in plot(Kest(X), xlim=c(0, 7)). See help(plot.fv).
The default range of r values that is plotted depends on the "default plotting range" of the object (of class 'fv') returned by Kest.
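For example, the following sketch plots the K function twice, first over the default plotting range and then over a restricted range. The dataset cells and the limits c(0, 0.1) are assumptions chosen to suit that dataset, which lies in the unit square.

```r
library(spatstat)

# 'cells' is an example point pattern supplied with spatstat (assumed choice)
X <- cells

# Kest returns an object of class 'fv'
K <- Kest(X)

# Plot over the default plotting range of the 'fv' object
plot(K)

# Plot again, restricting the horizontal axis to 0 <= r <= 0.1
plot(K, xlim = c(0, 0.1))
```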
The default r values for Kest are computed as follows:
Note also that when you plot the K function, the range of r values that is plotted depends on the "default plotting range" of the object (of class 'fv') returned by Kest. To override this, add the argument xlim to the plot command.
When spatstat is first loaded, the default pixel dimensions are 100 x 100 for all of the above commands except predict.ppm, which has a default of 40 x 40. You can reset the default pixel dimensions by the command spatstat.options(npixel=c(nx, ny)) where nx, ny are the number of pixels in the x and y directions respectively. This does not apply to predict.ppm.
Each of the commands (a)-(c) has an argument that controls the pixel dimensions in that particular case.
(a) for density(X) where X is a point pattern:
    density(X, dimyx=c(ny, nx))
(b) for as.im(f, W) where f is a number or function and W is a window:
    M <- as.mask(W, dimyx=c(ny, nx))
    as.im(f, M)
(c) for predict(obj) where obj is a fitted model (class "ppm"):
    predict(obj, ngrid=c(nx, ny))
The creation of new pixel grids is done by as.mask(). See help(as.mask) for explanation of the arguments
    dimyx = pixel dimensions = c(ny, nx)
    xy = pixel grid coordinates = list(x, y)
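The sketch below works through cases (a) and (b) and checks the dimensions of the resulting pixel images. The dataset cells, the grid size of 50 rows by 80 columns, and the function x + y are assumptions made only for this illustration.

```r
library(spatstat)

# 'cells' is an example point pattern supplied with spatstat (assumed choice)
X <- cells
W <- X$window

# Illustrative grid size: nx columns (x direction), ny rows (y direction)
nx <- 80
ny <- 50

# (a) kernel density estimate on an ny x nx pixel grid
D <- density(X, dimyx = c(ny, nx))

# (b) convert the window to a pixel mask, then evaluate a function on that grid
M <- as.mask(W, dimyx = c(ny, nx))
Z <- as.im(function(x, y) { x + y }, M)

# Both images have ny rows and nx columns
print(dim(D))
print(dim(Z))
```

Note the ordering: dimyx gives the number of rows (y direction) first, then the number of columns (x direction).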
Currently this is not possible inside spatstat. You need to use another polygon-handling package to determine which edges are part of the exterior border.