Work my through the Spatial Simulation: Exploring Pattern and Process book by David O’Sullivan and George L.W. Perry. This is to formally develop my experience in spatial simulation, as well as record my learnings.
To aid this development, I will use R. Wherever possible, I will aim to use randomly generated data. This is to ensure that the fundamentals are learned, and can be generalised to other datasets.
Scientific models (e.g. spatial simulation models) are simplified representations of the world that are deliberately developed with the particular purpose of exploring aspects of the world around us (p.1).
Two different views for models:
Scientific models are a simplified representation of a system under study, which can be used to explore, to understand better or to predict the behaviour of the system it represents (p. 3).
The process of model development demands that we simplify from the often bewildering complexity of the real world by deciding what matters (and what does not) in the context of the current investigation (p. 4).
The first step in any modelling exercise is the development of a conceptual model.
In the systems analysis process, we typically break a real-world phenomenon or system down into general elements:
A conceptual model can be converted to a mathematical model (i.e. computer/simulation model) where component states are represented by variables and the relationships between them by mathematical equations.
In a simulation model, a computer is programmed to iteratively recalculate the modelled system state as it changes over time in accordance with the relationships represented by the mathematical and other relationships that describe the system. The most widely used simulation models are systems dynamics models.
Systems dynamics models are effectively computer simulations of mathematical models consisting of interrelated differential equations, although they often also include empirically derived relationships and semi-quantitative ‘rules’.
In Spatially explicit model we decide that representing distinct spatial elements and their relationships is important for a full understanding of the system. Thus, a spatially explicit model has elements representing individual regions of geographical space or individual enities located in space (such as trees, animals, humans, cities and so on). Each region and entity has its own set of state variables and interacts continuously through time with other regions and entities, typically interacting more strongly with those objects that are nearest to it. These interactions may be governed by mathematical equations, by logical rules or by other mechanisms, and they directly represent mechanisms that are thought to operate in the real-world system represented by the simulation.
Systems for which we have reliable, detailed data and whose underlying mechanisms we understand well are the ones most suitable for predictive modelling.
Using models for prediction
The lack of data or of good understanding of the field in question are reasons for being cautious about prediction. It is also important to be wary of the prospects for prediction in contexts where although individual elements are well understood, the aggregation of many interacting elements can lead to unexpected nonlinear effects.
Models as guides to data collection
Where the most pressing deficits in our knowledge of systems are empirical, so that we lack good observational data about actual behaviour, the most important role for models may be to help us decide which data are most critical to improving our understanding.
If the underlying processes can be modelled, then we can perform sensitivity analysis on the models to find out which parameters have the strongest effect on the likely outcomes. The data most essential to improving our ability to forecast outcomes more reliably can be colected.
Models as ‘tools to think with’
No shortage of empirical data, but we have limited understanding of how a system works. Simulation models can become ‘tools for thought’.
Simulation models that represent whatever working hypotheses we have about the target system and then experiment on the model to see what the implications of those theories and hypotheses might be in the real world.
However, it is beset by the equifinality problem, which refers to the fact that large numbers (in fact an infinite number) of possible models, including the same model with different input data, could produce behaviour consistent with observational data. Identifying the processes responsible for a given pattern, on the basis of that pattern alone, is problematic.
Some systems of interest to social and environmental scientists are not amenable to experiment for rather prosaic reasons. Archaeologists, anthropologists and palaeontologists are interested in systems that no longer exist.
An additive process: the normal (Gaussian) distribution
Approximately Gaussian distributions arise as the result of a series of independent additive processes, as explained by the (additive) central limit theorem.
Multiplicative processes: various skewed distributions
A different distribution will arise in systems dominated by multiplicative processes (e.g. lognormal distribution). Lognormal is not the only distribution generated by multiplicative processes. Zipf, Pareto and power-law distributions can also occur.
Many (perhaps most) processes in environmental and social systems are multiplicative rather than additive (e.g. population growth is multiplicative).
The key idea behind this book is that if we are alert to the similarities between patterns across diverse domains then perhaps the loss of generality as we move from abstract to more specific models can be mitigated to a degree.
e.g. visitors to an art gallery looking for a celebrated work of art and caribou searching for food in the tundra are clearly very different, but is there any commonality in these’foraging’ processes that means we can, at least in the first instance, think about their representation in the same way?
Many ostensibly different spatial patterns, arising in very different systems, can usefully be represented using similar frameworks.
Developing detailed, dynamic, spatial models comes at some cost in generality and interpretability, but buys us realism and the ability to represent specific processes in specific contexts.
The processes that are dominant in a system leave distinctive fingerprints in the form of dynamic spatial patterns. However, While the careful characterisation of pattern in a system can help us to identify candidate driving processes or, perhaps more realistically, rule out less important ones, it cannot unequivocally isolate the processes responsible for the observed patterns.
There are also difficult issues related to the scales at which we detect and perceive patterns, how patterns change with scale and the inferences we can make at scales different from those at which we have made observations.
What is a pattern?
patterns are the discernible outcomes or signatures of the processes operating in a given system - defining characteristics of a system and often, therefore, indicators of essential underlying processes and structures.
Spatial patterns can be represented using a range of data models:
What is a process?
The spatial and temporal patterns that we observe in the systems we are investigating are the outcomes of the processes operating in those systems - The process is any mechanism that causes a system to change its state, and so potentially to produce characteristic patterns.
Processes generate patterns, but patterns are the outcomes of processes and can suppress or amplify those same proceses.
Scale
Scale colours any consideration of pattern and process-it affects how we perceive pattern, how we represent process and how each affects the other - scale denotes the resolution within the range or extent of a measured quantity.
In the context of spatial analysis and modelling, scale can be decomposed into grain and extent:
scale-dependence: the patterns that we perceive in a given system are a function of the space- time scale at which we view it.
Some patterns (i.e. fractal) do not change with scale-that is, they are not scale-dependent -
Processes can also be scale-dependent - A particular process will likely reveal its effects most strongly across a limited range of scales, and so as we change the scale at which we look at a system the dominant suite of processes will shift.
Processes that occur rapidly but whose effects are felt much more slowly are particularly difficult to pin down.
Phenomenological models emphasise reproducing observed patterns and not the mechanisms responsible for generating them.
Mechanistic models are more concerned with the underlying processes, using functions and distributions based on theoretical expectation.
Characterising Point patterns: first- and second-order structure
First-order structure relates to general (global) trends, such as, for example, variation in the density (or intensity) of events across a plot as a function of some underlying covariate. In a pattern solely structured by first-order effects the location of any individual event is independent of the location of other events in the pattern (i.e. non-stationary or inhomogeneous patterns).
If the location of events is influenced by that of other events-that is, the events interact in some way-this is termed second-order structure.
Teasing the two effects apart, while difficult, is important because the types of mechanisms generating the two effects are different and so the inferences we might draw about generating process are also different.
If we can characterise a pattern by separating out first- and second-order components, then the direct link to the types of processes responsible might allow us to rule out some possibilities even if we remain unable to isolate the specific generating mechanisms responsible for the pattern.
Using null models to evaluate patterns
We often want to know how well our data match what we would expect if a given null model (often complete spatial randomness) described the data adequately.
Deciding on an appropriate null model is important: in spatial analysis a model of CSR is often used but this is a limited approach.
If we want to tease such effects apart we need to use a null model that includes first- but not second-order structure. Rejection of that null model would allow us to evaluate the extent to which first-order structure alone accounts for any observed aggregation.
A second key step in this procedure is deciding on a summary measure of the data to compare with the null model. Ideally we want a statistic that effectively summarises all the detail in the data-a so-called sufficient statistic.
library(spatstat)## Warning: package 'rpart' was built under R version 3.4.3plot(envelope(murchison$gold, Kest, nsim = 99, verbose = FALSE), main = "K-function - 99 simulations")CSR model does not adequately describe the data. K(r) is above the CSR envelope and suggests that the locations of Murchison gold are clustered.However, the use of CSR as the null hypothesis does not identify first-an second-order effects.
Density-based (first-order) null models
The basic null model for point patterns is CSR, known more formally as the homogeneous Poisson process.
CSR follows a Poisson distribution. We can estimate the intensity as the number of events in the pattern divided by the focal area (i.e. divided by the density).
Simulating a realisation of the homogeneous Poisson process is straightforward: simply draw an (x, y) coordinate pair for each event from a uniform distribution.
plot(runifpoint(n = murchison$gold$n, win = murchison$gold$window, nsim = 3), main = "Homogeneous\nPoisson process")However, we expect the underlying intensity (number of events in the pattern divided by the focal area) of the process to vary across space (e.g plants may respond to variability in soil nutrient and water availability so that their density varies in space).
The spatial structure of this inhomogeneous Poisson processes may take many forms from simple linear gradients to more complex structuring.
# intensity increases as x increases. 
plot(rpoispp(function(x,y) {1000*x}, nsim  =3), main = "Inhomogeneous Poisson processes")# Testing uniform Poisson process with intensity 5 in a 10 x 10 square (i.e. CSR) against poisson point pattern simulations.
# As expected, poisson point pattern models do not adequately describe the CSR data. 
plot(envelope(rpoispp(lambda = 10,  win=owin(c(0,10),c(0,10))), Kest, nsim = 99, simulate = expression(rpoispp(function(x,y) {1000*x})), win=owin(c(0,10),c(0,10)), verbose = FALSE), main = "K-function - 99 simulations\nInhomogeneous Poisson processes")Interaction-based (second-order) null models
While representing first-order structure, the inhomogeneous Poisson process does not consider the structure arising from the interactions between events. However, the location at which events occur is known and expected not to be independent of location or from other events (e.g. shrubs in arid environments tend to show regular patterns as they compete for scarce water resources and as a result ‘repel’ one another).
A phenomenological model (i.e. reproduces observed patterns and not the mechanisms responsible for generating them) incorporating interaction between events is simple sequential inhibition (SSI). In SSI the location of events is constrained so that events are sequentially placed at randomly generated locations, but new events cannot be placed within some inhibition distance d of any already placed event.
We may also want to simulate patterns where the events are clustered in space. Such clustering might arise in the distribution of conspecific trees in forests, crime incidents associated with gang activities or the incidence of cases of a contagious disease.
Neyman-Scott processes capture these kinds of second-order effects:
The default for the spatstat rNeymanScott function is the Matern process - points centred on the parent points are uniformally distributed (Thomas process are normally distributed).
nclust <- function(x0, y0, radius, n) {
              return(runifdisc(n, radius, centre=c(x0, y0)))
            }
plot(rNeymanScott(kappa = 5, expand = 0.1, nclust, radius=0.1, n=100, nsim = 3), main = "Neyman-Scott (Matern) Point Process")The basic Neyman-Scott process provides a point of departure for simulating more complicated spatial patterns. An obvious modification is to include first-order structure such that the density of the clusters across the plot varies.
Another possibility is to generate multi-generational processes. For example, a double Thomas process uses the first generation of offspring as cluster centres for a second generation, with the different generations possibly varying in the diffuseness of their clustering.
Inferring process from (spatio-temporal) pattern
Having collected snapshot data, such as the spatial distribution of an outbreak of disease or crime hotspots, we typically want to infer the processes or mechanisms that can explain those data (i.e. we want to relate any observed patterns to the processes that generated them).
The null model strategy introduced above allows us to take some tentative steps in this direction. However, we may have some understanding of how the components of a system work, and we want to know how they interact to produce broader patterns of outcomes.
How can we link patterns to processes? This question is particularly important for dynamic spatial models where pattern and process are invariably closely intertwined.
Top-down Approaches: process from pattern
Bottom-up approaches: pattern from process
Linking pattern and process: some inferential traps
The equifinality problem is closely related to and difficult to separate from the issue of under-determination or nonuniqueness:
Spatial heterogeneity - Conditions locally are more likely to be similar to one another than they are when we expand the scope of our study (e.g. geology, soils and climate all vary less in local areas than they do globally).
From a simulation modelling perspective the interesting challenge is to identify process models that generate the heterogeneous landscapes.
The key feature of all of these heterogeneous models is that they favour the local aggregation of similar things. This aggregation may express itself in a process such that model elements change to become more like neighbouring elements, but an important insight is realising that aggregation and segregation are two sides of the same coin, so that aggregation may also arise from a segregative mechanism favouring the separation of dissimilar entities.
Basics of (discrete spatial) model structure
A grid of cells is often referred to as a lattice of sites and ‘wraps’ around from left to right and from top to bottom, so that cells in the extreme left-hand column have as neighbours those in the extreme right-hand column as well as their immediate neighbours. This simulates an infinite space and avoids model artifacts due to edge effects, and is called toroidal wrapping.
The most common neighbourhood definitions are either the four orthogonal immediate neighbours (the von Neumann neighbourhood) or the eight adjacent neighbours (orthogonal and diagonal, the Moore neighbourhood).
At any given time each cell has some state, generally a number, often restricted to integer values when cell states are discrete. Change occurs iteratively, by updating cell states in accordance with transition rules governing how the current state and the state of neighbours in the grid lead to new cell states.
During each tick, one cell may change state. A generation is a number of ticks equal to the number of grid cells, so that during a generation every cell may have an opportunity to update its state, although due to the (usually) random selection of cells for update.
The most obvious way to make places similar to their neighbours is to repeatedly apply a local averaging procedure so that each location’s current value moves towards the mean value of its neighbours. Local values that are high relative to neighbouring values are dragged down by this averaging process, while values low relative to neighbouring values are pulled up.
An initially random surface moves rapidly towards homogeneity as local averaging is applied - The point to note is that the averaging process changes the pattern from completely random to one that is separated into distinct regions of high and low values.
library(raster)## Warning: package 'raster' was built under R version 3.4.4## Warning: package 'sp' was built under R version 3.4.4#empty raster with ncells = 50*50
loc_avg <- raster(ncol=50, nrow=50)
# assign random values between 0 and 1 to each raster cell
values(loc_avg) <- runif(n = ncell(loc_avg), min = 0, max = 1)
plot(loc_avg)loc_sample <- function(raster, prop) { 
  
  # calculates the proportion of cells in raster for below function
  prop_cells <- ncell(raster) * prop
  # samples raster to only keep random proportion of raster cells
  rast <- sampleRandom(x = raster, size = prop_cells, ext = raster, asRaster = TRUE)
  # calculates mean of cells in 3x3 window. NAs ignored and wraps around from left-to-right via pad = TRUE
  rast_mean <- focal(rast, w=matrix(1,nrow=3,ncol=3), fun=mean, na.rm=TRUE, pad=TRUE)
  # fills NA cells with cells from the original raster
  rast_mean <- cover(x = rast_mean, y = raster)
  
  return(rast_mean)
}nsims <- 100
rast_vals <- list()
rast_loc <- loc_avg
rast_stack <- stack()
for (i in 1:nsims) { 
  # applies loc_sample function to rast_loc
 rast_loc <- loc_sample(raster = rast_loc, prop = 0.5) 
 # adds data frame of raster values and iteration number to list element
 rast_vals[[i]] <- data.frame(val = as.vector(rast_loc), iteration = i)
  rast_stack <- stack(rast_stack, rast_loc)
}The convergence to the local average can be demonstrated by a boxplot of the values in each raster iteration. As the number of iterations increases, the range of values decrease and converge to the average.
library(dplyr)
library(ggplot2)## Warning: package 'ggplot2' was built under R version 3.4.4# Boxplot to visualise the convergence of the local average values of the raster per iteration
rast_vals %>%
  do.call(rbind, .) %>%
  ggplot(aes(x = factor(iteration), y = val)) +
  geom_boxplot(outlier.alpha = 0) + 
  labs(x = "Iteration", y = "Raster Cell Values") + 
  scale_x_discrete(breaks=seq(0,nsims,10)) + 
  theme_bw()Local averaging with noise
Addng a random element in the local averaging process ensures that the system remains in a constant state of flux.
It is apparent that these ‘random’ data feature considerable structural complexity. This complexity arises because extreme high (or low) values in the underlying surface must necessarily be near one another, since reaching extreme values is dependent on having other extreme values nearby.
The model embodies two essential features-aggregation and diffusion - Aggregation is represented by the summation of the averaging process, while diffusion results from the iterated summation across a neighbourhood, which over time spreads the total amount of ‘development’ around. The random component ‘seeds’ the system with latent spatial structure which averaging locks in over multiple iterations.
# #empty raster with ncells = 50*50
# loc_avg <- raster(ncol=100, nrow=100)
# empty_raster <- raster(ncol = 100, nrow = 100)
# # assign a value of 0 to each raster cell
# values(loc_avg) <- 0
# values(empty_raster) <- 0
#   
# breakpoints <- c(-1.0, -0.9, -0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0)
# colors <- heat.colors(21)
#   
# plot(loc_avg, main = paste0("Iteration: ", 1), breaks = breakpoints, col = colors)# loc_avg_noise <- function(raster, threshold) {
#
#   random_vals <- round(runif(n = ncell(raster), min = 0, max = 1))
#
#   random_vals[random_vals == 0] <- -1
#
#   raster_mean <- focal(raster, w= matrix(c(0,1,0,1,1,1,0,1,0), nrow = 3), fun=mean, na.rm=TRUE, pad=TRUE) + random_vals
#
#   empty_raster[raster_mean > threshold] <- 1
#
#   return(empty_raster)
# }# nsims <- 10
# rast_loc <- loc_avg
# 
# for (i in 1:nsims) { 
#   # applies loc_avg_noise function to rast_loc
#  rast_loc <- loc_avg_noise(raster = rast_loc, threshold = 2)
# 
# plot(rast_loc, main = paste0("Iteration: ", i))
# }Totalistic automata are cellular automata (CA) where the rules governing cell state changes are expressed solely in terms of the sum of the state values of neighbouring cells. Where the next cell state also depends on the current state of the cell itself and not solely on the local neighbourhood sum of state values) the automata is sometimes referred to as outer totalistic.
Majority rules
A majority rule specifies that where one state is in the majority in a neighbourhood, the next state of the central cell will be the local majority state.
library(raster)
#empty raster with ncells = 50*50
loc_avg <- raster(ncol=50, nrow=50)
values(loc_avg) <- 1
x <- sampleRandom(loc_avg, ncell(loc_avg)*.3, asRaster=TRUE)
x[is.na(x)] <- 0
plot(x)library(raster)
#empty raster with ncells = 50*50
loc_avg <- raster(ncol=50, nrow=50)
values(loc_avg) <- 1
x <- sampleRandom(loc_avg, ncell(loc_avg)*.3, asRaster=TRUE)
x[is.na(x)] <- 0
plot(x)