
A science of epidemiology needs at least three things: 1) theories
regarding how patterns of disease are generated in populations,
2) observations relevant to those theories, and 3) methods that
link theory and observation. Epidemiologists who wish to advance
their science should know how to develop the needed theory. This
class deals with one important aspect of that task, namely how
to use dynamic models of population processes in formulating and
evaluating epidemiological theory. Theories expressed as dynamic
models are more complete and informative than the most common
type of theory in epidemiology, which states that an exposure promotes
the development of disease in exposed individuals.
Theory formulation in the biological sciences, including epidemiology,
involves abstracting the essence of a process of interest while
disregarding complexities that obscure the particular issue being
addressed. Well formulated theories in epidemiology provide a
framework for thinking about what determines the patterns of disease
in populations. In addition, theories provide explanations for
observed disease patterns. They also provide a means for predicting
patterns of disease under different conditions. A productive way
of formulating epidemiological theories, exploring their implications,
and determining where to seek data relevant to these theories
is through the use of computer models. Many different modeling
approaches can be used.
This class addresses a simple modeling technique using a computer
point and click approach. The models constructed are deterministic,
compartmental models. Deterministic models are ones where chance
does not play a role and the model defines an exact outcome. Compartmental
models are characterized by the fact that nothing is transformed
from one type of entity into an incommensurate entity. Physics
models that transform force and mass into acceleration are not
compartmental models. Neither are chemistry models that transform
two chemicals into two other chemicals. Our models will mostly
be of human populations. They will have humans flowing from one
compartment to another as their exposures, diseases, or other
conditions change. But the human units of our populations will
not change.
Compartmental models represent one of many different types of
models that epidemiologists should find useful for constructing
theory. We focus upon compartmental models for three reasons:
1) they provide useful structure on which to build theory about
epidemiological processes, 2) learning how to implement them on
small computers is easy, and 3) they create a basis for learning
and understanding other approaches to modeling dynamic processes.
The compartmental models taught in this course treat segments
of populations as infinitely divisible continuous entities. There
are no individuals in the populations modeled. There are only
segments of continuous population. The flows from one compartment
to another, however, correspond to the movement of individuals
from one category of exposure or disease to another. Compartmental
models can simplify things in ways that clarify epidemiological
concepts. They focus our thinking on population phenomena in
a way which standard epidemiological methods don't. They can thus
expand our search for ways to prevent disease from things that
affect individual risks to things that have to do with the ways
that populations behave.
Most teaching in epidemiology is about how to make observations
and how to describe and analyze relationships between exposures
and diseases in individuals. How to develop theory about the processes
which determine the patterns of disease in populations has not
been a major focus of epidemiologic teaching, even on the doctoral
level. The paths to developing biologically and sociologically
relevant theory are rarely discussed and there are few epidemiologists
who dedicate themselves to exploring and charting these paths.
This class charts some aspects of a particular path. This path
can make it possible for any epidemiologist to begin penetrating
the complexity of biological and sociological systems which generate
patterns of disease in populations. This path, namely the use
of compartmental models, has a long history of mathematical and
numerical developments. In the course of that history computer
programs have been developed which make the construction and analysis
of compartmental models accessible even to those who lack a mathematical
or computer programming background.
There are several other modeling methods that can be used by a
science of epidemiology. The first is the standard method taught
in epidemiology. This is to develop and test hypotheses about
exposures of individuals which act within or upon those individuals
to increase their risk of disease. The key difference here is
the focus on risks in individuals rather than dynamic processes
in population systems. While this is clearly a productive method,
we argue that it has a narrow scope and that it obscures some
of the most important ways that disease can be controlled. Standard
epidemiological methods are too focused on causes that act directly
in or upon individuals at risk. They misdirect the search for
prevention away from various important causes such as patterns
of relationship between individuals, the organizational structure
of systems, and dynamic processes with feedbacks. Moreover, even
in pursuing the goal of identifying risk factors, the traditional
approaches of epidemiology have serious limitations which models
of population systems do not. Because success of the standard
approach is often dependent upon untenable assumptions, it misses
some risk factors affecting individuals and distorts effect measures
for others. One of the reasons for pursuing the path of compartmental
models is that it points out the narrow scope and limitations
of the currently dominant path in epidemiology.
Another path is the use of models with discrete individuals that
interact with their environments and with other individuals.
There is a broad range of such models. Some include stochastic
(chance) events. Some are deterministic. Some are completely described
by sets of rules which individuals in the models use to determine
their next action. Some, in contrast, don't have the individual
make decisions but rather have the fate of individuals determined
by the central hand of an event scheduler. All of these valuable
modeling approaches are beyond the scope of this class. But let
us just say a few more words about them to provide a glimpse of
the modeling world beyond compartmental models.
Discrete individual models allow for the design of studies and
the collection of data in the same way that epidemiologists would
design studies and collect data in the real world. Thus they have
a special usefulness in epidemiology for evaluating effectiveness
and efficiency of different study designs and analytic methods.
Many of the questions epidemiologists face have to do with how
to organize a study and how to analyze the data collected from
a study. We are most often guided in these tasks more by tradition
than by firm understanding. We very often lack adequate theory to
assure us that the study designs and analytic procedures we employ
are going to lead us to the truth. That is to say that on the basis
of theory alone we cannot say that our methods are going to lead
to valid measures of effect. If we can see that in a model system
our procedures lead us into errors, that is to say if we can see
that in our model system our parameter estimates are biased or
invalid, then we can be quite confident that the real world, with
its far greater complexity, will lead us into even greater errors.
While the real world never provides us with a way to check out
the validity of our methods, individual based models at least
provide a limited means to do this.
Besides allowing us to check for validity, discrete individual
models also can be used as a tool in the design of studies with
greater precision. The precision of a parameter estimate is inversely
related to the width of its confidence interval. We usually choose some
study design and sample size to achieve some desired level of
precision. The methods we use to calculate sample sizes represent
gross simplifications with many assumptions which we know to be
wrong. By testing the power of statistical tests with discrete
individual models under various different study designs, we can
search for the most efficient way to design a study in a way that
is not dependent upon assumptions which we know to be wrong.
Perhaps the most compelling reason to use discrete event models
is that they provide a means of conceptualizing broader theory
than compartmental models can address. That is because they make
the interactions between individuals more explicit. With compartmental
models we can model the relationships of one class of individuals
to another class. But issues like the chance that two sex partners
of an individual both have a common sex partner cannot be well
handled by compartmental models. Likewise whether one has concurrent
partners and whether these in turn have other concurrent partners
is almost impossible to model in the compartmental modeling framework.
Such relationships, however, are easily handled with discrete
individual models. When modeling things like social support and
the risk of chronic disease, one may find that modeling individual
networks is more productive than modeling segments of population.
With all of these virtues, one might ask why this class focuses
upon compartmental models rather than models of discrete individuals.
The above listed advantages of individual models are outweighed
for purposes of this class by three things. First, compartmental
models are much easier to construct in a computer. Perhaps in
the future very user friendly programs will be available for discrete
individual models. Progress is being made in this regard. But
currently available programs for compartmental models allow the
student to get a firm footing in dynamic population models while
that is not the case for available discrete individual modeling
programs. Second, once constructed, it is easier to determine
that compartmental models are doing what one intended them to
do. It is quite easy to make mistakes that go undetected, so
that one's computer model has some unintended behavior. The programming
for compartmental models is less complex than the programming
for discrete individual models. In addition, the patterns generated
by unintended model characteristics can be seen in a single run of a
compartmental model, while for discrete individual models they may
need hundreds of model runs to become visible. Thus model
implementation errors are easier to perceive with compartmental
models. Third, compartmental models can in some instances serve
as a better basis for theoretical abstractions. The fact that
compartmental models set aside much of the details of reality
discussed in the previous paragraphs can be a virtue for the modeling
process. All models set aside parts of reality. That is what makes
them models. Disregarding details often brings things into sharper
focus. Constructing a simpler model first provides a basis for
building a more realistic and more complex model later. Before
building difficult and complex discrete individual models, it
will often pay to clarify key concepts in system behavior by working
with compartmental models.
But it is not necessary to go beyond compartmental models to get
great benefit from epidemiological modeling. Whether or not one
ever learns about discrete individual models, learning how to
construct computer implementations of compartmental models is
a stepping stone in developing one's ability to advance theory
and methods in epidemiology. By constructing computer systems
and then trying to understand their behavior, one's power to think
about systems is advanced. By predicting how a computer system
one has constructed is going to behave and then hitting the run
button and seeing how it actually behaves, one learns how often
simple conceptualizations of disease causation can be deceptive
and one gets reinforcement for valuing the development of theoretical
concepts.
The construction and analysis of compartmental models will be
taught in this class using a point and click approach at the computer
screen. A few intuitions about algebra and differential equations
will help in constructing compartmental models. But to build theory
about how epidemiological systems work, we will use mainly images
on a computer screen. The Stella program will be used. Stella
creates difference equations from the points and clicks made on
a screen. Once the initial conditions and parameter values for
such equations are set, Stella progressively solves these equations
across the dimension of time using one of three different processes
which are collectively called numerical integration. Other programs,
such as SAAM II, can also be used in the same way we will use
Stella. SAAM II has the advantage that it can find different
sets of model parameters that make the model output fit observed
patterns. Many other programs can create similar equations but
not using a point and click approach. We use Stella because it
is more intuitive and cheaper.
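To make the idea of numerical integration concrete, here is a
minimal sketch in Python rather than Stella. It steps a single
flow from a well compartment to a diseased compartment forward in
time with two common schemes, Euler's method and fourth-order
Runge-Kutta, and compares both with the exact solution. The
function names, the rate of 0.1 per unit time, and the step size
of 0.5 are all illustrative assumptions, and whether these two
schemes match the options in a particular Stella version is also
an assumption.

```python
import math


def flow_out(well, rate):
    """Flow (people per unit time) from the well to the diseased compartment."""
    return rate * well


def euler_step(well, rate, dt):
    # Difference equation: new stock = old stock - flow * dt
    return well - flow_out(well, rate) * dt


def rk4_step(well, rate, dt):
    # Classical fourth-order Runge-Kutta applied to dW/dt = -rate * W
    k1 = -flow_out(well, rate)
    k2 = -flow_out(well + 0.5 * dt * k1, rate)
    k3 = -flow_out(well + 0.5 * dt * k2, rate)
    k4 = -flow_out(well + dt * k3, rate)
    return well + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6


def integrate(step, well0, rate, dt, t_end):
    """Apply the chosen step function repeatedly from time 0 to t_end."""
    well = well0
    for _ in range(round(t_end / dt)):
        well = step(well, rate, dt)
    return well


# Hypothetical values chosen only for illustration.
WELL0, RATE, DT, T_END = 1000.0, 0.1, 0.5, 10.0

exact = WELL0 * math.exp(-RATE * T_END)
euler = integrate(euler_step, WELL0, RATE, DT, T_END)
rk4 = integrate(rk4_step, WELL0, RATE, DT, T_END)
print(f"exact {exact:.3f}  Euler {euler:.3f}  RK4 {rk4:.3f}")
```

With the same step size, the Runge-Kutta result lies much closer
to the exact value than the Euler result. Shrinking the step size
narrows the gap at the cost of more computation.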
We will use Stella to construct and numerically solve first order,
ordinary, differential equations describing compartmental models.
These are equations whose flows between compartments are modeled
only using first derivatives relevant to a single dimension. The
dimension we will always use is time. In such models the flows
into and out of a compartment may be dependent upon the state
of any number of other entities. For example the flow of susceptible
population into infected population may depend upon the size of
the infectious, susceptible, and immune population. The fact that
we are using first order, ordinary differential equations means
that the flows themselves are not dependent upon how fast the flows
are changing. They are only dependent upon fixed parameter values
or compartment sizes. By limiting ourselves to first order equations
examining only patterns as time changes, we simplify the task
for both the computer and the student. To numerically solve ordinary
differential equations, the computer has no need to keep track
of where it has been in the past. It needs only keep track of
the current state of the compartments whose flows are being modeled.
It need not use lots and lots of memory to keep track of past
states, as is the case when higher order differential equations are
used.
There is a class of partial differential equations which have
useful characteristics that the ordinary differential equations
we will learn lack. In the ordinary differential equations we
will learn to construct, the flow of population between compartments
is along only a single dimension. The dimension we will always
use is time. By using partial differential equations, the flows
can be in multiple dimensions. In compartmental models it is often
useful to consider flows as they occur simultaneously in the dimensions
of time and of age. Such models can be written as partial differential
equations. But the numerical solution of partial differential
equations where population flows in both the direction of age
and of time requires more computer power because every age must
be taken into account. We will learn to get around this problem
by keeping track of only a few age groups.
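The idea of replacing a continuous age dimension with a few age
groups can be sketched as follows. This is an illustration under
stated assumptions, not the course's own model: three hypothetical
groups, a constant number of births per year, aging out of each
group at a rate of one over the years spent in it, and exit from
the last group standing in for death. Leaving a group at a constant
rate makes the time spent in it exponentially distributed, which
is only a rough approximation to real aging.

```python
def simulate_age_groups(births_per_year, years_in_group, dt=0.1, t_end=1000.0):
    """Euler simulation of a chain of age-group compartments.

    People leave each group at rate 1 / years_in_group. Leaving the
    last group represents death. Returns the final size of each group.
    """
    groups = [0.0] * len(years_in_group)
    for _ in range(round(t_end / dt)):
        # Flows out of every group during this time step
        outflows = [size / width * dt
                    for size, width in zip(groups, years_in_group)]
        groups[0] += births_per_year * dt
        for k, outflow in enumerate(outflows):
            groups[k] -= outflow
            if k + 1 < len(groups):
                groups[k + 1] += outflow  # aging into the next group
    return groups


# Hypothetical values: children, adults, elders.
BIRTHS = 1000.0
YEARS_IN_GROUP = [15.0, 50.0, 15.0]

children, adults, elders = simulate_age_groups(BIRTHS, YEARS_IN_GROUP)
print(f"children {children:.0f}  adults {adults:.0f}  elders {elders:.0f}")
```

At equilibrium each group holds the births per year multiplied by
the mean years spent in the group, which gives a quick check that
the model is doing what was intended.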
The content of this course will be summarized here by discussing
the models which students will construct and/or analyze and what
they are expected to learn from these models.
We start with a single compartment of well individuals which flow
at a constant rate into a compartment of diseased individuals.
This allows us to explore the relationships between risks and
rates and to discuss how Stella models are related to differential
equation models. It also allows us to discuss how "closed
form" models are related to differential equation models.
It is hoped that this will enable the student to read and understand
a wide variety of literature on dynamic models of disease processes
that otherwise would be inaccessible to them.
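As a worked illustration of how a closed form solution relates
to the differential equation, consider the one-flow model under
the assumption of a constant per-person rate. The symbols W (well),
D (diseased), \lambda (rate), and R(t) (risk by time t) are labels
chosen here for illustration.

```latex
\begin{align*}
\frac{dW}{dt} &= -\lambda W, \qquad \frac{dD}{dt} = \lambda W,
  \qquad W(0) = W_0, \quad D(0) = 0 \\
W(t) &= W_0 \, e^{-\lambda t} \\
R(t) &= \frac{D(t)}{W_0} = 1 - e^{-\lambda t} \approx \lambda t
  \quad \text{when } \lambda t \ll 1
\end{align*}
```

The risk is close to the rate multiplied by time only while
\lambda t is small. Over longer periods the risk approaches one
while the rate stays fixed.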
We then construct some simple models of population dynamics. We
don't model sex or pregnancy as part of the population growth
process. We don't include a detailed and precisely realistic aging
process. We don't have different types of people or different
causes of death for our single type of people. Yet from these
simple models our understanding of what determines population
size is enhanced by better understanding how the logistic equations
behind a birth and death process generate "S" shaped
curves and how population processes come to equilibrium. It also
gives us a chance to discuss how equilibrium values can be determined
both analytically and through simulation. Additionally, it allows
us to distinguish stable equilibriums from unstable equilibriums.
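The equilibrium ideas above can be sketched with the standard
logistic equation dN/dt = rN(1 - N/K). Setting the net flow to
zero gives the equilibria analytically (N = 0 and N = K), and
simulation shows which one is stable. The growth rate r, the
carrying capacity K, and the starting sizes are hypothetical
values, and this single-equation form is an assumption about the
kind of model meant, not the course's own Stella diagram.

```python
def logistic_flow(n, r, k):
    """Net flow (births minus deaths) at population size n."""
    return r * n * (1.0 - n / k)


def simulate(n0, r, k, dt=0.01, t_end=100.0):
    """Euler simulation returning the whole trajectory of population size."""
    n = n0
    trajectory = [n]
    for _ in range(round(t_end / dt)):
        n += logistic_flow(n, r, k) * dt
        trajectory.append(n)
    return trajectory


# Hypothetical parameter values.
R, K = 0.2, 1000.0

from_below = simulate(10.0, R, K)    # S-shaped rise toward K
from_above = simulate(1500.0, R, K)  # decline toward K
near_zero = simulate(1e-3, R, K)     # tiny population still moves away from 0
at_zero = simulate(0.0, R, K)        # exactly 0 stays at 0

for label, traj in [("from below", from_below), ("from above", from_above),
                    ("near zero", near_zero), ("at zero", at_zero)]:
    print(f"{label:>10}: start {traj[0]:.3f}  end {traj[-1]:.3f}")
```

Trajectories started on either side of K return to K, so K is a
stable equilibrium. A population of exactly zero stays at zero,
but any small positive population grows away from it, so zero is
an unstable equilibrium.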
The two by two table is a cornerstone of epidemiological analysis.
But introductory texts treat it only from a static point of view.
Considerable insight into the relationships between different
epidemiological statistics based on the two by two table can be
gained from examining the simplest possible dynamic model relating
exposure and disease. For example, constructing dynamic models
can provide new insights into how odds ratios, risk ratios, and
rate ratios are related over time given either cohort or cross
sectional data. Many students find that they cannot predict these
relationships accurately on the basis of their current understanding.
Yet after this exercise they have a much deeper understanding
that allows them to make predictions about the relationships between
different epidemiological parameters much more readily. The attributable
risk measures commonly used in epidemiology can take on new relevance
when they are examined from a dynamic instead of a static point
of view.
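A small sketch shows how these measures drift apart over time in
the simplest cohort setting. It assumes a closed cohort, constant
disease rates in the exposed and unexposed, no recovery, and no
competing causes of death. The rates of 0.03 and 0.01 per year are
hypothetical, and the function name is chosen here for illustration.

```python
import math


def cohort_measures(rate_exposed, rate_unexposed, t):
    """Risk ratio, odds ratio, and rate ratio after follow-up time t."""
    risk1 = 1.0 - math.exp(-rate_exposed * t)    # cumulative risk, exposed
    risk0 = 1.0 - math.exp(-rate_unexposed * t)  # cumulative risk, unexposed
    risk_ratio = risk1 / risk0
    odds_ratio = (risk1 / (1.0 - risk1)) / (risk0 / (1.0 - risk0))
    rate_ratio = rate_exposed / rate_unexposed
    return risk_ratio, odds_ratio, rate_ratio


# Hypothetical rates per year.
RATE_EXPOSED, RATE_UNEXPOSED = 0.03, 0.01

results = {t: cohort_measures(RATE_EXPOSED, RATE_UNEXPOSED, t)
           for t in (1, 10, 50, 100)}
for t, (rr, odds, ir) in results.items():
    print(f"t={t:>3}  risk ratio {rr:.2f}  odds ratio {odds:.2f}  rate ratio {ir:.2f}")
```

Under these assumptions all three measures nearly agree early in
follow-up. As time passes the risk ratio falls toward one and the
odds ratio grows, while the rate ratio stays fixed between them.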
Another issue addressed in this exercise is the difference in
disease patterns generated by new ongoing exposures, old ongoing
exposures, and short term exposure. The dynamic relationships
in a two by two table of course differ considerably according
to whether an exposure has been ongoing for a long time in a population
or has newly arrived in the population. Modeling this dynamically
raises new issues which the student is unlikely to have considered
previously. Similarly, modeling an exposure that has a brief duration
vs. one that is ongoing provides new insight into how to use epidemiological
statistics in the process of generating hypotheses about causes
of disease.
The previous models were constituted such that what happened to
any compartment (exposure or disease group) of the population
would happen regardless of what was happening in other segments
of the population. That means they assumed that the population
behaved as a linear system. But populations experiencing infectious
diseases (and in reality most other diseases as well) do not behave
as linear population systems. In infectious diseases, the dependencies
that derive from contact between different compartments (susceptible
and infectious, for example), are particularly clear. Thus infectious
diseases provide a good framework in which one can learn about
the behavior of non-linear systems.
Three classes of models are developed. Examination of these models
provides new insights regarding thresholds of epidemicity and
endemicity, what parameters are scientifically generalizable and
what parameters are not (such as risk ratios or risk differences),
and what determines the extent of endemic or epidemic transmission.
Before the exercise the student might feel that they have some
understanding of why epidemics come to an end, but after the exercise
they will have a much clearer understanding that is likely to
be very different from their initial understanding.
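One way to explore why an epidemic ends is a minimal
susceptible-infectious-recovered model. The sketch below is an
assumption about the simplest model of this kind, not necessarily
one of the three classes taught in the course. It uses population
fractions, homogeneous mixing, a hypothetical transmission
parameter beta of 0.5 per day, and a recovery rate gamma of 0.25
per day.

```python
def simulate_sir(beta, gamma, s0, i0, dt=0.01, t_end=300.0):
    """Euler simulation of an SIR model in population fractions.

    Returns final s, i, r plus the peak infectious fraction and the
    susceptible fraction at the moment of that peak.
    """
    s, i, r = s0, i0, 0.0
    peak_i, s_at_peak = i, s
    for _ in range(round(t_end / dt)):
        # Nonlinear flow: it depends on two compartments at once
        new_infections = beta * s * i * dt
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
        r += recoveries
        if i > peak_i:
            peak_i, s_at_peak = i, s
    return s, i, r, peak_i, s_at_peak


# Hypothetical parameter values.
BETA, GAMMA = 0.5, 0.25

s_end, i_end, r_end, peak_i, s_at_peak = simulate_sir(BETA, GAMMA, s0=0.999, i0=0.001)
print(f"susceptible at peak {s_at_peak:.3f}  (gamma / beta = {GAMMA / BETA:.3f})")
print(f"final susceptible {s_end:.3f}  final infectious {i_end:.2e}")
```

In this sketch the infectious fraction starts to fall once the
susceptible fraction drops below gamma / beta, and the epidemic
dies out while a sizeable share of the population is still
susceptible. The epidemic does not end because susceptibles
run out.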
The concept of the basic reproduction number as a fundamental
parameter of transmission can be seen from an examination of dynamic
models of transmission. The dual individual and population interpretations
of this parameter provide insight into how individual and population
effects are different from each other.
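The two readings of the basic reproduction number can be written
out for the simplest transmission model. This is a sketch assuming
homogeneous mixing. The symbols S, I, N, \beta (effective contact
rate), and \gamma (recovery rate) are notation chosen here for
illustration.

```latex
\begin{align*}
R_0 &= \beta \cdot \frac{1}{\gamma}
  && \text{(individual: transmissions per unit time}
     \times \text{mean infectious period)} \\
\frac{dI}{dt} &= \beta \frac{S}{N} I - \gamma I
  = \gamma I \left( R_0 \frac{S}{N} - 1 \right)
  && \text{(population: infection grows only while } R_0 S / N > 1 \text{)}
\end{align*}
```

The first line counts the infections one infectious individual is
expected to cause in a fully susceptible population. The second
shows the same quantity acting as a threshold for the population
as a whole.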
Vaccine effects under the traditional risk approach to epidemiological
analysis focus upon the risks experienced by vaccinated and unvaccinated
individuals. The dynamic models underlying traditional measures
of vaccine effect were not made clear until relatively recently.
The inappropriateness of the assumptions in the traditional vaccine
effect measure is made clear by formulating the transmission
model which is consistent with the traditional measure. This process
of model formulation demonstrates that there are more appropriate
measures of vaccine effects upon the susceptibility of individuals.
It also makes it clear that susceptibility effects are not the
only effects which one might consider. For most vaccines, the
circulation of the infectious agent will be far less affected by vaccine
effects which prevent infection than it will be by vaccine effects
which decrease transmission from vaccinated individuals who become
infected. A completely different class of vaccine effect measures
is needed to capture these effects. A measure of vaccine effects
upon the basic reproduction number is discussed.
All diseases go through multiple stages which can be treated as
multiple compartments in our models. Increasing the compartments
to refine our treatment of these stages can often provide important
insights for disease prevention. Gunshot wounds may get to a serious
stage much faster after onset than other diseases and we may want
to use only two or three compartments in a gunshot model. These
might be: unshot (+ recovered from shot), shot, and recovered
from shot. But for diseases like cancer, heart disease, or osteoporosis,
it might be quite helpful to have multiple stages (compartments)
of disease rather than just one. The stages might correspond to
diagnosable conditions. But multiple stages might also be used
to generate population distributions of times from onset of a
process to onset of clinical symptoms which are more like observed
distributions.
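How the number of stages shapes the distribution of times to
clinical onset can be sketched numerically. The example assumes a
chain of identical stages, each left at the same constant rate,
with the total mean time held fixed at a hypothetical 10 years.
The function name and the choice of 1 versus 10 stages are
illustrative only.

```python
def onset_time_distribution(n_stages, mean_total_time, dt=0.01, t_end=200.0):
    """Mean and standard deviation of time from start to clinical onset.

    A unit of population starts in the first stage and flows through
    n_stages compartments. Flow out of the last stage is clinical onset.
    """
    rate = n_stages / mean_total_time  # each stage lasts mean_total_time / n_stages
    stages = [0.0] * n_stages
    stages[0] = 1.0
    total = first_moment = second_moment = 0.0
    t = 0.0
    for _ in range(round(t_end / dt)):
        outflows = [rate * size * dt for size in stages]
        for k in range(n_stages):
            stages[k] -= outflows[k]
            if k + 1 < n_stages:
                stages[k + 1] += outflows[k]
        t += dt
        onset = outflows[-1]  # amount reaching clinical onset in this step
        total += onset
        first_moment += onset * t
        second_moment += onset * t * t
    mean = first_moment / total
    variance = second_moment / total - mean ** 2
    return mean, variance ** 0.5


MEAN_TIME = 10.0  # hypothetical mean years from process onset to symptoms

mean_1, sd_1 = onset_time_distribution(1, MEAN_TIME)
mean_10, sd_10 = onset_time_distribution(10, MEAN_TIME)
print(f" 1 stage : mean {mean_1:.2f}  sd {sd_1:.2f}")
print(f"10 stages: mean {mean_10:.2f}  sd {sd_10:.2f}")
```

With one stage the onset times are spread as widely as their mean,
and onsets are most frequent immediately after the process starts.
With ten stages the mean is unchanged but the spread is much
narrower and onsets cluster around the mean.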
When a disease takes a long time to develop, an important issue
is the stage of the disease at which controllable risk factors
act. The temporal patterns of disease development after exposure
can provide insights into the stage at which exposures are acting.
But many students find that their insights into what patterns
would be expected given actions at different stages are quite
wrong. By exploring the behavior of different models and explaining
why different models have different behaviors, students develop insights
into how patterns of exposures in populations get translated into
patterns of disease which allow them both to make better hypotheses
about the stages at which risk factors act and to make better
predictions about the patterns of disease that will result when
preventive actions are taken.
Diseases that develop slowly can be controlled with the help of
screening programs. An understanding of disease dynamics should
play a crucial role in many decisions about screening program
implementation. Since most epidemiologists don't have the tools
to integrate disease dynamics into their decision processes, they
deal with screening in a wholly static fashion. A static measure
of prevalence is usually used in conjunction with estimates of
sensitivity and specificity of a screening test to evaluate things
like the predictive value of positive or negative tests. This
approach can be highly misleading in some situations. Not all
prevalence measurements can be treated the same. Whenever the
prevalence has been affected by the screening program, there is
considerable risk of misinterpreting the benefit that is deriving
from the screening program. This issue completely escapes most
epidemiologists but a dynamic model presentation usually makes
it quite clear. Moreover the dynamic approach can better optimize
the cost-efficiency of screening programs, the choice of screening
tools, the frequency of rescreening, and the choice of ages for
screening.
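The dependence of predictive values on prevalence can be made
concrete with the standard formulas. The sketch below compares a
first screening round with a later round in which prevalence is
lower because earlier rounds have already removed detectable
cases. The sensitivity, specificity, and both prevalence values
are hypothetical, and a full dynamic model would generate the
second prevalence rather than assume it.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Positive and negative predictive values of a screening test."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test characteristics.
SENS, SPEC = 0.90, 0.95

# Hypothetical prevalence before any screening and after earlier rounds.
ppv_first, npv_first = predictive_values(0.05, SENS, SPEC)
ppv_repeat, npv_repeat = predictive_values(0.01, SENS, SPEC)

print(f"first round : PPV {ppv_first:.3f}  NPV {npv_first:.3f}")
print(f"repeat round: PPV {ppv_repeat:.3f}  NPV {npv_repeat:.3f}")
```

The same test yields a much lower positive predictive value in the
repeat round, which is one reason a prevalence measured before
screening began cannot be reused once the program has changed it.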
Multivariate statistical modeling has become quite popular in
epidemiology. A variety of traditions have arisen for interpreting
and acting upon the results of such analyses. But the inference
issues involved are often misunderstood by epidemiologists. Biostatisticians
have dealt quite thoroughly and elegantly with inference issues
relevant to the statistical target population from which the data
were collected. But epidemiologists usually use the results of
multivariate analysis not for statistical inference, but for scientific
inference to different populations. Such inference requires that
the statistical estimation of joint effects be based upon a causally
valid model of joint effects. Few epidemiologists have thought
deeply about how different causal relationships between variables
will alter their joint effects. This exercise addresses some
of the simpler causal relationships between multiple causal
variables. While discrete individual models might be needed to
develop a more complete understanding of the issues involved in
multivariate analyses, this compartmental model approach should
provide an understandable introduction.
Epidemiology 802 Course Home Page.