Spatio-Temporal Methods in Environmental Epidemiology

Spatio-Temporal Methods in Environmental Epidemiology is the first book of its kind to specifically address the interface between environmental epidemiology and spatio-temporal modeling. In response to the growing need for collaboration between statisticians and environmental epidemiologists, the book links recent developments in spatio-temporal methodology with epidemiological applications. Drawing on real-life problems, it provides the tools required to exploit advances in methodology when assessing the health risks associated with environmental hazards. Clear guidelines are given to enable the implementation of methodology and estimation of risks in practice.

Designed for graduate students in both epidemiology and statistics, the text covers a wide range of topics, from an introduction to epidemiological principles and the foundations of spatio-temporal modeling to new directions for research. It describes traditional and Bayesian approaches and presents the theory of spatial, temporal, and spatio-temporal modeling in the context of its application to environmental epidemiology. The text includes practical examples together with embedded R code and details of specific R packages and the use of other software including WinBUGS/OpenBUGS and INLA. Online resources associated with the book provide additional code, data, examples, exercises, lab projects, and more.

Representing a major new direction in environmental epidemiology, this book, in full color throughout, underscores the increasing need to consider dependencies in both space and time when modeling epidemiological data. Students will learn how to identify and model patterns in spatio-temporal data and to exploit dependencies over both space and time in order to reduce bias and inefficiency.

From this book the reader will have gained an understanding of the following topics:

The basic concepts of epidemiology and the estimation of risks associated with environmental hazards.

Hierarchical modelling within a Bayesian framework.

The theory of spatial, temporal and spatio-temporal processes needed for environmental health risk analysis.

Fundamental questions related to the nature and role of uncertainty in environmental epidemiology and methods which may help answer those questions.

Important areas of application within environmental epidemiology together with strategies for building the models that are needed and coping with challenges that arise.

Methods and software for the analysis and visualisation of environmental and health data. Examples of R and WinBUGS code are given throughout the book and, together with data for the examples, the code is included in the online resources.

A variety of exercises, both theoretical and practical, to assist in the development of the skills needed to perform spatio-temporal analyses.

New frontiers and areas of current and future research.

Chapter 1: Why spatio-temporal epidemiology?

This chapter provides an overview of methods for spatio-temporal modelling and their use in epidemiological studies.

Chapter 2: Modelling health risks

This chapter contains the basic principles of epidemiological analysis and how estimates of the risks associated with exposures can be obtained. From this chapter, the reader will have gained an understanding of the following topics:

Methods for expressing risk and their use with different types of epidemiological study.

Calculating risks based on the expected number of health counts in an area, allowing for the age-sex structure of the underlying population.

The use of generalised linear models (GLMs) to model counts of disease and case-control indicators.

Modelling the effect of exposures on health and allowing for the possible effects of covariates.

Cumulative exposures to environmental hazards.
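As a small illustration of the expected-count calculation listed for this chapter, the following Python sketch applies indirect standardisation: reference rates for each age-sex stratum are multiplied by the area's population in that stratum and summed, and the standardised mortality ratio (SMR) is the observed count divided by that expected count. The sketch is not taken from the book (which uses R); the strata, rates, populations and observed count are hypothetical values chosen only to make the arithmetic easy to follow.

```python
# Hypothetical reference rates (events per person-year) by age-sex stratum.
reference_rates = {
    ("0-44", "F"): 0.001,
    ("0-44", "M"): 0.002,
    ("45+", "F"): 0.010,
    ("45+", "M"): 0.020,
}

# Hypothetical person-years at risk in one area, by the same strata.
area_population = {
    ("0-44", "F"): 5000,
    ("0-44", "M"): 4000,
    ("45+", "F"): 1000,
    ("45+", "M"): 500,
}


def expected_count(rates, population):
    """Expected events if the area experienced the reference rates."""
    return sum(rates[stratum] * population[stratum] for stratum in population)


def standardised_ratio(observed, expected):
    """Standardised mortality (or morbidity) ratio: observed / expected."""
    return observed / expected


observed = 36  # hypothetical observed count in the area
expected = expected_count(reference_rates, area_population)
smr = standardised_ratio(observed, expected)

print(f"expected = {expected:.1f}, SMR = {smr:.3f}")
```

An SMR above 1 indicates more events than the age-sex structure alone would predict. With small expected counts the ratio becomes unstable, which is the motivation for the smoothing methods described under Chapter 8.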

Chapter 3: The importance of uncertainty

This chapter contains a discussion of uncertainty, both in terms of statistical modelling and quantification and in the wider setting of sources of uncertainty outside those normally encountered in statistics. From this chapter, the reader will have gained an understanding of the following topics:

Uncertainty can be dichotomised as either qualitative or quantitative, with the former allowing consideration of a wide variety of sources of uncertainty that would be difficult, if not impossible, to quantify mathematically.

Quantitative uncertainty can be thought of as comprising both aleatory and epistemic components, the former representing stochastic uncertainty and the latter subjective uncertainty.

Methods for assessing uncertainty including eliciting prior information from experts and sensitivity analysis.

Indexing quantitative uncertainty using the variance and entropy of the distribution of a random quantity.

Uncertainty in post-normal science derives from a wide variety of issues and can reach high levels, with serious consequences. Understanding uncertainty is therefore a vital feature of modern environmental epidemiology.
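To make the idea of indexing quantitative uncertainty by variance and entropy concrete, the Python sketch below computes both for two discrete distributions on the same support. It is an illustration only, not the book's code; the two distributions are hypothetical, and Shannon entropy with natural logarithms is assumed as the entropy measure.

```python
import math


def variance(values, probs):
    """Variance of a discrete random quantity with the given probabilities."""
    mean = sum(v * p for v, p in zip(values, probs))
    return sum(p * (v - mean) ** 2 for v, p in zip(values, probs))


def entropy(probs):
    """Shannon entropy in nats; zero-probability outcomes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


values = [0, 1, 2, 3]
uniform = [0.25, 0.25, 0.25, 0.25]  # hypothetical: maximal uncertainty
peaked = [0.05, 0.45, 0.45, 0.05]   # hypothetical: mass concentrated centrally

for name, probs in [("uniform", uniform), ("peaked", peaked)]:
    print(name, round(variance(values, probs), 3), round(entropy(probs), 3))
```

Both indices are larger for the uniform distribution. Variance depends on the numerical values the quantity can take, whereas entropy depends only on the probabilities.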

Chapter 4: Embracing uncertainty: the Bayesian approach

This chapter introduces the Bayesian approach which provides a natural framework for dealing with uncertainty and also for fitting the models that will be encountered later in the book. From this chapter, the reader will have gained an understanding of the following topics:

The use of prior distributions to capture beliefs before data are observed.

The combination of prior beliefs and information from data to obtain posterior beliefs.

The manipulation of prior distributions with likelihoods to formulate posterior distributions and why conjugate priors are useful in this regard.

The difference between informative and non-informative priors.

The use of the posterior distribution for inference and methods for calculating summary measures.
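One standard example of why conjugate priors are useful is the Poisson-gamma pair. Assuming an observed count Y ~ Poisson(E θ), where E is the expected count and θ the relative risk, and a Gamma(a, b) prior on θ (shape a, rate b), the posterior is Gamma(a + Y, b + E), so no numerical integration is needed. The Python sketch below shows the update; the prior parameters and the data are hypothetical, and the example is not taken from the book.

```python
def gamma_poisson_update(prior_shape, prior_rate, observed, expected):
    """Conjugate update for a relative risk theta.

    Model (assumed for illustration):
        observed ~ Poisson(expected * theta),  theta ~ Gamma(shape, rate).
    The posterior is Gamma(shape + observed, rate + expected).
    """
    return prior_shape + observed, prior_rate + expected


# Hypothetical prior with mean 1 (no excess risk) and hypothetical data.
prior_shape, prior_rate = 2.0, 2.0
observed, expected = 36, 33.0

post_shape, post_rate = gamma_poisson_update(
    prior_shape, prior_rate, observed, expected
)
posterior_mean = post_shape / post_rate
raw_ratio = observed / expected

print(f"raw ratio = {raw_ratio:.3f}, posterior mean = {posterior_mean:.3f}")
```

The posterior mean lies between the prior mean and the raw ratio of observed to expected counts, which is the combination of prior beliefs and data described in the list above.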

Chapter 5: The Bayesian approach in practice

This chapter describes methods for implementing Bayesian models when their complexity means that simple, analytic solutions may not be available. From this chapter, the reader will have gained an understanding of the following topics:

Analytical approximations to the posterior distribution.

Using samples from a posterior distribution for inference and Monte Carlo integration.

Methods for direct sampling such as importance and rejection sampling.

Markov Chain Monte Carlo (MCMC) and methods for obtaining samples from the required posterior distribution including Metropolis–Hastings and Gibbs algorithms.

Using WinBUGS to fit Bayesian models using Gibbs sampling.

Integrated Nested Laplace Approximations (INLA) as a method for performing efficient Bayesian inference including the use of R-INLA to implement a wide variety of latent process models.
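For readers who have not met MCMC before, the Python sketch below shows a random-walk Metropolis sampler, the simplest special case of the Metropolis–Hastings algorithm, in which the proposal is symmetric and the acceptance ratio reduces to a ratio of target densities. It is a minimal illustration and not the book's implementation: the target is a hypothetical Gamma(38, 35) density chosen because its mean (38/35) is known exactly, and the step size, chain length, burn-in and seed are arbitrary choices.

```python
import math
import random


def log_target(theta, shape=38.0, rate=35.0):
    """Unnormalised log density of a Gamma(shape, rate) distribution."""
    if theta <= 0:
        return -math.inf
    return (shape - 1.0) * math.log(theta) - rate * theta


def random_walk_metropolis(log_density, start, step_sd, n_samples, seed=0):
    """Random-walk Metropolis with a Gaussian (symmetric) proposal."""
    rng = random.Random(seed)
    current = start
    current_lp = log_density(current)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step_sd)
        proposal_lp = log_density(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


chain = random_walk_metropolis(log_target, start=1.0, step_sd=0.3, n_samples=20000)
kept = chain[1000:]  # discard an arbitrary burn-in period
posterior_mean_estimate = sum(kept) / len(kept)

print(f"estimated mean = {posterior_mean_estimate:.3f}, exact = {38 / 35:.3f}")
```

Averaging the retained samples is an instance of Monte Carlo integration. Proposals outside the support have log density minus infinity and are always rejected.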

Chapter 6: Strategies for modelling

This chapter considers both some of the wider issues related to modelling and the generalisability of results and more technical material on the effect of covariates and model selection. From this chapter, the reader will have gained an understanding of the following topics:

Why having contrasts in the variables of interest is important in assessing the effects they have on the response variable.

The biases that may arise in the presence of covariates and how covariates can affect variable selection and model choice.

Hierarchical models and how they can be used to acknowledge dependence between observations.

Issues with using p-values as measures of evidence against a null hypothesis, and how basing scientific conclusions on them can lead to non-reproducible results.

The use of predictions from exposure models including acknowledging the additional uncertainty involved when using predictions as inputs to a health model.

Methods for performing model selection, including the pros and cons of automatic selection procedures.

Model selection within the Bayesian setting and how the models themselves can be incorporated into the estimation process using Bayesian Model Averaging.

Chapter 7: Is ‘real’ data always quite so real?

This chapter considers some of the issues that will arise when dealing with ‘real data’. Data will commonly have missing values and may be measured with error. This error might be random or may be due to systematic patterns in how the data were collected. From this chapter, the reader will have gained an understanding of the following topics:

Classification of missing values into missing at random or not at random.

Methods for imputing missing values.

Various measurement models including classical and Berkson.

The attenuation of regression coefficients under measurement error.

Preferential sampling, where the process that determines the locations of monitoring sites and the process being modelled are in some ways dependent.

How preferential sampling can bias the measurements that arise from environmental monitoring networks.
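The attenuation of regression coefficients under classical measurement error can be seen in a short simulation. In the Python sketch below, the true exposure X has variance σ_x², the measured exposure is W = X + U with error variance σ_u², and a simple linear regression of the response on W recovers roughly β σ_x² / (σ_x² + σ_u²) rather than β. The linear model, the parameter values and the seed are assumptions made for illustration; the sketch is not taken from the book.

```python
import random


def ols_slope(x, y):
    """Least-squares slope from a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(1)
n = 20000
beta = 2.0                                  # hypothetical true effect
sd_true, sd_error, sd_noise = 1.0, 1.0, 0.5  # hypothetical standard deviations

true_exposure = [rng.gauss(0.0, sd_true) for _ in range(n)]
# Classical error: the measurement scatters around the true exposure.
measured = [x + rng.gauss(0.0, sd_error) for x in true_exposure]
response = [beta * x + rng.gauss(0.0, sd_noise) for x in true_exposure]

slope_true = ols_slope(true_exposure, response)
slope_naive = ols_slope(measured, response)
attenuation = sd_true**2 / (sd_true**2 + sd_error**2)

print(f"slope using true exposure:     {slope_true:.2f}")
print(f"slope using measured exposure: {slope_naive:.2f}")
print(f"beta * attenuation factor:     {beta * attenuation:.2f}")
```

With equal exposure and error variances the attenuation factor is 0.5, so the naive slope is about half the true effect.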

Chapter 8: Spatial patterns in disease

This chapter introduces disease mapping and contains the theory for spatial lattice processes and models for performing smoothing of risks over space. From this chapter, the reader will have gained an understanding of the following topics:

Disease mapping, and how estimates of risk can be improved by borrowing strength from adjacent regions, which can reduce the instability inherent in risk estimates (SMRs) based on small expected numbers.

How smoothing can be performed using either the empirical Bayes or fully Bayesian approaches.

Computational methods for handling areal data.

Besag’s seminal contributions to the field of spatial statistics, including the very important concept of a Markov random field.

Approaches to modelling areal data, including conditional autoregressive models.

How Bayesian spatial models for lattice data can be fitted using WinBUGS, R and R-INLA.

Chapter 9: From points to fields: Modelling environmental hazards over space

This chapter contains the basic theory for spatial processes and a number of approaches to modelling point-referenced spatial data. From this chapter, the reader will have gained an understanding of the following topics:

Visualisation techniques needed for both exploring and analysing spatial data and communicating its features through the use of maps.

Exploring the underlying structure of spatial data and methods for characterising dependence over space.

Second-order theory for spatial processes including the covariance. The variogram for measuring spatial associations.

Stationarity and isotropy.

Methods for spatial prediction, using both classical methods (kriging) as well as modern methods (Bayesian kriging).

Non-stationary fields.
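As an illustration of how the variogram measures spatial association, the Python sketch below computes the classical (method-of-moments) empirical semivariogram: for each distance bin, half the average squared difference between values at pairs of sites whose separation falls in that bin. The coordinates, values and bin width are hypothetical, and the sketch is not the book's code.

```python
import math
from collections import defaultdict


def empirical_semivariogram(coords, values, bin_width):
    """Classical semivariogram estimate, binned by inter-site distance.

    Bin b collects pairs with distance in [b * bin_width, (b + 1) * bin_width).
    Returns {bin index: semivariance}.
    """
    squared_diffs = defaultdict(float)
    pair_counts = defaultdict(int)
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            distance = math.dist(coords[i], coords[j])
            bin_index = int(distance // bin_width)
            squared_diffs[bin_index] += (values[i] - values[j]) ** 2
            pair_counts[bin_index] += 1
    return {b: squared_diffs[b] / (2 * pair_counts[b]) for b in sorted(pair_counts)}


# Hypothetical measurements at four sites along a transect.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]

gamma = empirical_semivariogram(coords, values, bin_width=1.0)
for bin_index, semivariance in gamma.items():
    print(bin_index, round(semivariance, 3))
```

A semivariogram that increases with distance, as here, indicates that nearby sites are more alike than distant ones. Under stationarity and isotropy it depends only on the separation distance.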

Chapter 10: Why time also matters

This chapter contains the theory required for handling time series data. From this chapter, the reader will have gained an understanding of the following topics:

That a temporal process consists of both low and high frequency components, the former playing a key role in determining long-term trends while the latter may be associated with shorter-term changes.

Techniques for the exploratory analysis of the data generated by the temporal process, including the autocorrelation function (ACF, plotted as a correlogram), the partial autocorrelation function (PACF) and the periodogram.

Models for irregular (high frequency) components after the regular components (trend) have been removed.

Methods for forecasting, including exponential smoothing and ARIMA modelling.

The state space modelling approach, which sits naturally within a Bayesian setting and which provides a general framework for most of the classical time series models and many more besides.

Implementing time series processes within a Bayesian hierarchical framework.
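The sample autocorrelation function used in exploratory time series analysis can be written in a few lines. The Python sketch below uses the usual estimator, in which the lag-k autocovariance is divided by the lag-0 autocovariance, both computed around the overall mean. The short series is hypothetical, and the sketch is not taken from the book.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of a time series."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    denominator = sum(d * d for d in deviations)
    acf = []
    for lag in range(max_lag + 1):
        numerator = sum(
            deviations[t] * deviations[t + lag] for t in range(n - lag)
        )
        acf.append(numerator / denominator)
    return acf


series = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical series with a linear trend
acf = sample_acf(series, max_lag=2)
print([round(r, 2) for r in acf])
```

Plotting these values against the lag gives the correlogram. A trend such as the one in this toy series dominates the ACF, which is why regular components are usually removed before the irregular component is modelled.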

Chapter 11: The interplay between space and time in exposure assessment

This chapter shows the many ways in which time can be added to space in order to characterise random exposure fields. From this chapter, the reader will have gained an understanding of the following topics:

Additional power that can be gained in an epidemiological study by combining the contrasts in the process over both time and space while characterising the stochastic dependencies across both space and time for inferential analysis.

Criteria that good approaches to spatio-temporal modelling should satisfy.

General strategies for developing such approaches.

Separability and non-separability in spatio-temporal models, and how these can be characterised using the Kronecker product of correlation matrices.

Examples of the use of spatio-temporal models in modelling environmental exposures.
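In a separable model, the correlation between the process at (site s, time t) and at (site s', time t') is the product of a purely spatial correlation and a purely temporal one, so the full correlation matrix is the Kronecker product of the two smaller matrices. The Python sketch below builds such a matrix for two sites and three time points. The spatial correlation of 0.5 and the AR(1)-type temporal correlation with parameter 0.5 are hypothetical choices, and the sketch is not the book's code.

```python
def kronecker(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(cols_b)]
        for i in range(len(a))
        for k in range(rows_b)
    ]


def ar1_correlation(n_times, rho):
    """Temporal correlation matrix with entries rho ** |t - t'|."""
    return [[rho ** abs(s - t) for t in range(n_times)] for s in range(n_times)]


spatial = [[1.0, 0.5], [0.5, 1.0]]   # hypothetical: two sites
temporal = ar1_correlation(3, 0.5)   # hypothetical: three time points

# Rows and columns are ordered site by site, with time running within each site.
joint = kronecker(spatial, temporal)

for row in joint:
    print([round(value, 3) for value in row])
```

The entry linking site 1 at time 1 with site 2 at time 3 is 0.5 × 0.25 = 0.125. A non-separable model is one whose correlation matrix cannot be factorised in this way.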

Chapter 12: Roadblocks on the way to causality: exposure pathways, aggregation and other sources of bias

This chapter contains a discussion of the differences between causality and association. It also covers specific issues that may be encountered in this area when investigating the effects of environmental hazards on health. From this chapter, the reader will have gained an understanding of the following topics:

Issues with causality in observational studies.

The Bradford Hill criteria, which are a group of minimal conditions necessary to provide adequate evidence of a causal relationship.

Ecological bias, which may occur when inferences about the nature of individuals are made using aggregated data.

The role of exposure variability in determining the extent of ecological bias.

Approaches to acknowledging ecological bias in ecological studies.

Concentration-response and exposure-response functions.

Models for estimating personal exposures including micro-environments.

Chapter 13: Better exposure measurements through better design

This chapter looks at the emergence of a central purpose: to explore or reduce uncertainty about aspects of the environmental processes of interest. One form of uncertainty, aleatory, cannot be reduced by definition, whereas the other, epistemic, can be reduced (see Chapter 3). However, that reduction does not stop the original network from becoming sub-optimal over time, pointing to the need to regularly reassess its performance. From that perspective we see that the design criteria must allow for the possibility of ‘gauging’ (adding monitors to) sites that:

Maximally reduce uncertainty at their space–time points (measuring their responses eliminates their uncertainty);

Best minimise uncertainty at other locations;

Best inform about process parameters;

Best detect non-compliers.

From this chapter, the reader will have gained an understanding of many of the challenges that the network designer may face. These involve the following topics:

A multiplicity of valid design objectives.

Unforeseen and changing objectives.

A multiplicity of responses at each site, and the question of which should be monitored.

A need to use prior knowledge and to characterise prior uncertainty.

A need to formulate realistic process models.

A requirement to take advantage of, and integrate with, existing networks.

The need to be realistic, meaning to contend with economic as well as administrative demands and constraints.

Chapter 14: New frontiers

In this chapter the reader will have encountered a selection of new frontiers in spatio-temporal epidemiology including the following:

A number of areas that are currently under active development.

Two modern approaches to addressing the problem of non-stationarity in random spatio-temporal fields: warping and dimension expansion.

How dimension expansion can be used to dramatically reduce non-stationarity and suggest its possible causes.

A powerful approach combining both physical and statistical modelling within a single framework.