The approach we suggest for describing, analyzing, and interpreting spatial pattern has many pathways that depend on results and evaluation at various stages (Fig. 1). Initially, "disease observations" should be combined with relevant "contextual data" to create a spatially explicit "disease/environment map" of where cases occurred. These first maps can take many different forms, including points that denote cases (and possibly non-cases), areas that indicate different prevalence or incidence values, and contextual data such as environmental, social, or economic variables that might be related to the disease pattern. Typically, contextual data are chosen based on some underlying theory of the possible causes of a disease, and/or additional knowledge of environmental associations that may be related to it.
In this first phase, a form of Exploratory Spatial Data Analysis (ESDA), patterns are sought that might suggest apparent clusters of cases, areas with elevated incidence, associations between the magnitude of disease and components of the environment, or map series that imply shifts in spatial pattern over time. GIS facilitates ESDA through operations that can easily create choropleth maps, animations of temporal change, spatial patterns scaled to a chosen variable, and smoothed surfaces interpolated over the entire area. This iterative process ("no pattern" then "different analysis") is designed to uncover spatial or spatial-temporal patterns that can then be used to develop and more rigorously test causal hypotheses. If this initial period of ESDA does not produce apparent associations, then further efforts may not be warranted ("stop"). This is an important phase in our approach, because further effort should not be devoted to extensive analysis if patterns are not apparent.
In other words, we discourage the indiscriminate use of spatial-statistical methods without an a priori hypothesis and a possible underlying mechanism, since spurious, statistically significant findings could result that are of little use in interpretation or intervention.
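As a rough illustration of the exploratory phase described above, the following sketch computes crude incidence per area and lists areas whose rate is well above the overall rate. It is a screening aid for deciding what to map and examine, not a statistical test. The area names, counts, populations, and the 1.5 ratio threshold are all hypothetical values chosen for the example.

```python
# Hypothetical area-level data: (area name, case count, population at risk).
areas = [
    ("A", 12, 4000),
    ("B", 3, 3500),
    ("C", 30, 5000),
    ("D", 5, 4500),
]


def incidence_per_1000(cases, population):
    """Crude incidence expressed as cases per 1,000 population."""
    return 1000.0 * cases / population


def flag_elevated(area_data, ratio_threshold=1.5):
    """Return the overall rate and the areas whose rate exceeds
    ratio_threshold times the overall rate.

    The threshold is an arbitrary exploratory cutoff (an assumption),
    not a significance level.
    """
    total_cases = sum(cases for _, cases, _ in area_data)
    total_pop = sum(pop for _, _, pop in area_data)
    overall = incidence_per_1000(total_cases, total_pop)
    flagged = []
    for name, cases, pop in area_data:
        rate = incidence_per_1000(cases, pop)
        if rate > ratio_threshold * overall:
            flagged.append((name, rate))
    return overall, flagged


overall_rate, elevated = flag_elevated(areas)
print("overall rate per 1,000:", round(overall_rate, 2))
print("areas with elevated incidence:", elevated)
```

Areas flagged in this way are only candidates for closer inspection. With small populations, crude rates are unstable, which is one reason smoothed surfaces are often preferred during ESDA.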
If apparent patterns or associations are detected, then a "scientific hypothesis" about why and how an underlying mechanism might explain the spatial pattern can be developed. From that hypothesis, one can "make predictions" about the disease pattern expected under specific conditions, which then lead logically to a "statistical hypothesis" involving clustering, environmental proximity, time-space diffusion, etc. Based on this statistical hypothesis, appropriate statistical tests can be applied to the data to evaluate whether to "reject" or "not reject" the hypothesis. If the hypothesis is rejected, then a "new prediction" may be made based on a modified scientific hypothesis. If the statistical hypothesis is not rejected, suggesting that the proposed underlying mechanism is consistent with the observed space-time pattern, then this insight could be used to inform a "public health intervention."
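One possible way to turn a statistical hypothesis about clustering into a test is a permutation test of Moran's I, a widely used index of spatial autocorrelation. The sketch below is only an illustration and is not a method prescribed by this approach. The eight areas arranged in a chain, their binary "elevated incidence" values, the binary adjacency weights, and the number of permutations are all assumptions made for the example. The test is framed against a null hypothesis of spatial randomness, so a small p-value is evidence consistent with clustering.

```python
import random


def chain_weights(n):
    """Binary adjacency weights for n areas arranged in a line
    (a hypothetical layout used only for this example)."""
    w = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = 1
        w[i + 1][i] = 1
    return w


def morans_i(values, weights):
    """Moran's I: (n / S0) * sum_ij(w_ij * d_i * d_j) / sum_i(d_i ** 2),
    where d_i is the deviation of area i from the mean."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


def permutation_p_value(values, weights, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation.

    Values are shuffled across areas to simulate spatial randomness;
    the p-value is the share of shuffles with I at least as large as
    the observed I (counting the observed arrangement itself).
    """
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)


# Hypothetical indicator of elevated incidence in eight adjacent areas.
rates = [0, 0, 0, 0, 1, 1, 1, 1]
weights = chain_weights(len(rates))
observed_i, p_value = permutation_p_value(rates, weights)
print("Moran's I:", round(observed_i, 3), "pseudo p-value:", p_value)
```

The choice of weights encodes the hypothesized mechanism: adjacency suits contagion between neighboring areas, whereas distance to an environmental source would call for a different weighting or a different test altogether.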
In some instances, particularly if a disease is severe and the cost of inaction seems to outweigh the risk of misinterpretation, it may be possible to use the scientific hypothesis generated from the ESDA to design an intervention directly. The dangers of jumping directly from an apparent pattern to intervention, without carefully exploring the data statistically, are obvious.
Particularly when intervention is not urgent, the iterative process of hypothesis testing can continue to explore and modify underlying theory. The ability to refine theory and modify hypotheses depends largely on the quality of the data and the complexity of the underlying mechanisms that have given rise to the observed pattern of disease. There are no general rules for how much effort should be put into modifying and retesting with different hypotheses, but the ultimate goal of improved understanding, and eventually intervention, should be the most important consideration.