Epidemiology 606

Transmission System Analysis

Department of Epidemiology

Professor James S. Koopman MD MPH

Infectious agent transmission systems

Infectious agent transmission systems are made up of populations of hosts that get infected, the infectious agents that infect them, everything involving the interactions between agents and hosts that affect the natural history of infection, and everything affecting the interactions of hosts that transmit infection. Transmission systems sustain the circulation of infectious agents and spread them to different parts of the host populations. One way to begin understanding the behavior of transmission systems is to model those systems.

The classic agent-host-environment triad of epidemiology is a simple model of a transmission system (1). Even without much mathematical underpinning, this simple model served to orient the thinking of epidemiologists toward control actions at the vertices in the triad and to their interactions. This model, like the more mathematically based ones we will learn in this course, helps identify issues that might be ignored without the model.

Transmission system models like the epidemiological triad are more useful when expressed mathematically or as computer models. In Epid 605 you learned the concepts of endemic and epidemic thresholds. These concepts were some of the first fruits of analysis of simple mathematical models of transmission systems. Simple mathematical transmission system models have been useful in solidifying other concepts that you learned in 605 as well. They have demonstrated that the effects of an exposure on infection risk in a population depend upon the state of the transmission system. They have also demonstrated that the basic reproduction number, R0, is useful for characterizing different transmission systems. Epidemiology 606 will strengthen your understanding of these and related concepts by giving you hands-on experience working with transmission system models.

One reason we need to understand transmission systems is that manipulating these systems is the basis of infection control. When one seeks to control the spread of infection, one is seeking to alter a transmission system. That reality cannot be escaped.

 

Controlling transmission systems without considering them

The fact that all infection control involves manipulating transmission systems does not mean that one must understand the functioning of transmission systems in order to control infection. Indeed, the dominant approach of infectious disease science in search of ways to control transmission has been to study details about some narrow aspect of the transmission system without seeking any theory or understanding of how the details fit into the transmission system. For example, the genomic composition of infectious agents and details of their biology are sought in the hope that a therapeutic agent can be identified which will bring the functioning of the transmission system to a halt. This search proceeds without any guidance from transmission system analysis as to what aspects of agent biology are most crucial for maintaining the circulation of an organism. Whether spread of infection would be minimized by interrupting uptake of organisms, maximal excretion levels of organisms, duration of excretion of organisms, or survival of excreted organisms are questions that can only be addressed through transmission system analyses. But the search for therapeutics can be productively pursued without considering the transmission system.

It is also possible to pursue other control modalities productively without analyzing the transmission system affected by those modalities. Usually we seek modes of transmission and risk factors for transmission without much guidance from transmission system analysis. The hope is that once ways to interrupt key modes of transmission have been identified, the transmission system can be controlled without having to understand how those transmission modes fit into the overall transmission system. Epidemiologists contribute greatly to human well-being through risk factor investigations that are in no way influenced by transmission system analysis. The question is, how much more productive could their investigations be if they were guided by transmission system analysis?

Just like risk factor identification and control, vaccine development is pursued without the guidance that a more formal understanding of the transmission system might provide. Experience shows this can lead to effective control of transmission. It did not, after all, take a detailed understanding of measles transmission dynamics to develop an effective measles vaccine. To eradicate measles, on the other hand, may require a better understanding of the measles transmission system.

 

Transmission system analysis can orient our search for infection control methods

Finding an Achilles heel for an infection without considering its transmission system is a matter of luck that cannot be counted upon. We never completely disregard transmission systems. Some understanding of them is required even to speculate that some treatment or vaccine or sanitation measure or education program could lead to effective control. But a more formal consideration of transmission systems may be essential to achieve some infection control objectives. Often when a tool is developed that effectively attacks a key aspect of the transmission system, such as the smallpox vaccine or the polio vaccine, an adequate understanding of the transmission system is crucial to devising an effective strategy for using that tool.

In the case of smallpox and polio, differences in smallpox and polio transmission systems dictate different strategies for using vaccines. Polio vaccines are administered to disseminate them as widely as possible. Smallpox vaccine was focused as narrowly and as intensely as possible. The strategies for using these vaccines were based on analyses of their transmission systems. These vaccine administration strategies contributed as much to eradication as the development of the vaccines themselves.

The need for an understanding of transmission systems to guide control efforts is particularly evident in the case of HIV. No magic bullet for HIV is on the horizon. The best hope for containment is to efficiently focus and combine treatment, vaccination, education, and behavioral skills training. The same could be said for HPV, CMV, HBV, rotaviruses, a wide variety of bacterial agents causing nosocomial infections, enterotoxigenic E. coli, gonorrhea, syphilis, chlamydia, Streptococcus pneumoniae, malaria, cryptosporidia, Candida, and many other viruses, bacteria, parasites and fungi. The search for simple Achilles heels should not be abandoned. But a more integrative understanding of where in the transmission system to search would be helpful. Likewise, a more integrative understanding of how to effectively organize control programs that efficiently combine various control activities is needed.

An integrative understanding is needed of how the different elements of the transmission system come together to determine the behavior of that transmission system. We should understand how the arrangement of transmission system elements affects the ability of the system to sustain circulation of the infectious agent, to generate epidemics, and to generate or sustain the diversity of the agent. Agent genes, host genes, transmission modes, or environmental contamination don't by themselves circulate infectious agents and generate diversity in them. The transmission system performs these functions by relating these elements to each other. To control the infections causing so much misery and death in the world and to prevent the emergence or re-emergence of new plagues, a key step is to understand how transmission systems work.

 

Transmission system analysis is just emerging as a science

The construction and analysis of mathematical models of the transmission of infectious agents in populations is a discipline with a century-old tradition that has generated an exponentially growing body of literature. But much of that literature is mathematicians talking to other mathematicians and addressing issues that are often not relevant to a real world science of transmission systems. It is only rarely in this literature that we encounter scientists struggling with the process of discovering the determinants of patterns of infection or explaining those patterns. The mathematicians seem to be preoccupied with proving that the theoretical systems they examine have certain properties like global stability. Global stability means that no matter what initial conditions a system starts at, it will eventually return to its stable point. Somehow, the mathematicians seem to feel that when they have performed a logical proof of something like global stability they have performed an elegant analysis. Such proofs are possible only on quite simple systems. A science of transmission systems needs to relate the complexities of real world contact patterns to the complexities of infection patterns. To pursue what the mathematicians see as elegant, they avoid these complexities.

Mathematics is powerful. It can and has provided great insights into transmission system behavior. Epidemiologists can benefit greatly from relating to mathematicians in seeking to understand transmission systems. But epidemiologists, not mathematicians, must lead a science of transmission systems. Modern computer tools allow epidemiologists to use the powerful insights of mathematicians more readily in a science of transmission systems. But only a handful of epidemiologists are pursuing a science that explores theories of transmission systems and relates them to real world observations.

The pursuit of models with enough conceptual simplicity to yield to mathematical analysis has inhibited the development of a science of transmission systems. The art of modeling begins with very gross abstractions yielding simple models and develops a plan for elaborating the models. Mathematicians pursue an elaboration plan that will yield to their type of analysis. What a science of transmission systems needs is for epidemiologists to pursue an elaboration plan that enhances the use of data to determine whether a theoretical model of a transmission system captures an essential aspect of the system.

Epidemiologists need to elaborate models in ways that mathematicians are unlikely to consider. They need to elaborate them to capture more of what is known about the immune response and its effects. We can now measure many different aspects of the immune response and thus such elaboration of models can increase the potential relationships of models to data. Epidemiologists need to elaborate models so that they capture more measurable aspects of contact patterns and risk factor exposures. Mathematicians can't do this because they are just unaware of the key issues. It is my hope that in this course you will see that you have the ability to advance a science of transmission systems.

Some model simplifications with clear mathematical virtue lead us away from a science of transmission systems. A case in point is parameter reduction. An objective of mathematicians is often to minimize the parameters in their models. If a mathematical relationship between parameters in a model can be found such that all relationships defined by a larger number of parameters can be captured by a smaller number of parameters, the virtuous model from the mathematical point of view is the model with fewer parameters. But such simplification can turn a model into a tautology (2). That is to say, it can eliminate the possibility of refuting the model with data.

There are two ways to estimate the parameters of transmission system models. The first way is to tie them down with real world data relating to the specific rates or phenomena that those parameters reflect. For example, we can observe the rates at which people expose themselves to risk factors; we can observe contact rates; or we can measure how long individuals excrete infectious organisms. The second way is to manipulate the value of the parameters in the model until the model produces an overall pattern of infection that we have observed. If parameters can't be tied down with data, their values may sometimes be manipulated so that they could fit any real world observations. This characteristic makes the models irrefutable and quite often unhelpful to science. Splitting parameters in models so that some parameters can be directly tied down is often needed in a science of transmission systems.

Another strategy of model elaboration can enhance the estimation of the remaining parameters by jiggling them in a model until they generate observed infection patterns. That strategy is to elaborate the models so that they generate patterns of infection with more detail. Splitting categories of individuals in the model can accomplish this objective. If the more detailed patterns are observable, then the models have been elaborated in ways that make them more scientifically useful. In this regard, elaboration of the social structure and observable human interactions in models may be required if a science of transmission system analysis is to advance (3).

We deal with issues related to efficient parameter choice right in the first class exercise. There we discuss two different ways to formulate contact and transmission parameters. You will see that I choose to use two parameters where I could have used one. That is because I see a path for the development of transmission system science using the two parameter formulation but not the one parameter formulation. Even though the product of the contact rate times the transmission probability can be combined into a single parameter reflecting "effective contact rate", I choose separate contact rate and transmission probability parameters because different types of people will have different contact rates which I might be able to measure. Also, transmission probabilities may differ for different classes of people. If I elaborate my model to include these different classes of people, I may make it more relevant to observable data.
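The point about one parameter versus two can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the class exercise itself: the function name simulate_sir, the recovery rate, and the two parameter sets are hypothetical, and the model is a bare susceptible-infectious-recovered system stepped forward with small Euler time steps. It shows that two quite different combinations of contact rate and transmission probability with the same product generate the same infection curve, so the overall pattern of infection alone cannot tie down both parameters; a direct measurement of the contact rate could.

```python
def simulate_sir(contact_rate, trans_prob, recovery_rate=0.2,
                 days=200, dt=0.1, i0=0.001):
    """Euler steps of a simple SIR model; returns the infectious fraction over time.

    All parameter values are illustrative assumptions, not course data.
    """
    s, i = 1.0 - i0, i0
    trajectory = []
    for _ in range(int(days / dt)):
        # Only the product contact_rate * trans_prob enters the dynamics.
        new_infections = contact_rate * trans_prob * s * i * dt
        recoveries = recovery_rate * i * dt
        s -= new_infections
        i += new_infections - recoveries
        trajectory.append(i)
    return trajectory


# Two hypothetical parameter sets with the same product (0.5 per day).
curve_a = simulate_sir(contact_rate=10.0, trans_prob=0.05)
curve_b = simulate_sir(contact_rate=2.0, trans_prob=0.25)

# Largest difference between the two epidemic curves at any time step.
max_gap = max(abs(a - b) for a, b in zip(curve_a, curve_b))
print("largest gap between curves:", max_gap)
```

Keeping the two parameters separate costs nothing in this homogeneous model, but it leaves room to give different classes of people different measured contact rates once the model is elaborated.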

The right balance between simplicity and complexity cannot be defined by the needs of a single model analysis. That is because no model represents the goal or the endpoint we are after. No model represents the true and fundamental theorem on which all subsequent science in the field must rest. We have to view each model as to how it contributes to a line of model development. Epidemiologists who can relate to mathematicians need to define that line of development. In this class I hope you develop your own philosophy as to how that line of development should be pursued.

 

Transmission system analysis needs to be integrated with other sciences

Studying the molecular and cellular biology of infectious agents and their interactions with hosts can be productive without any help from transmission system analysis. Transmission system analyses can be productive without any help from molecular and cellular biology. But the real potential of the coming decades is for molecular and cellular investigations at the agent and host level to proceed hand in hand with analysis of transmission systems. Molecular biology provides tools that transmission system analysis can use. Transmission system analysis, in return, provides an analytical framework for investigating how human infectious agents behave, which is needed because the investigation of that behavior must be through observational studies on individuals in transmission systems.

A major contribution of modern molecular biology to transmission system analyses is the identification of specific agent variants and specific immune responses to those variants. Infection data specified by molecular characteristics allows deductions as to who has transmitted to whom. But the potential of "molecular infectious disease epidemiology" goes far beyond just helping with such deductions. Many agents evolve changes rapidly enough so that phylogenetic distances determined from genome sequences might be useful for reflecting transmission system distances. If this proves to be the case, the database on which a science of transmission systems can be constructed will grow rapidly.

Transmission system analysis needs alliances with molecular biology for another reason. That is that infectious agents evolve to escape our control measures. Infectious agents have devised many ways to exchange genomic material in ways that help maintain their circulation. Consequently, they can often find ways to change that will keep them circulating when they are faced with simple attacks. We need to understand how we can manipulate the transmission system to minimize the chances that the agent can adapt to escape our control measures. Stopping a specific variant of an agent is a simple issue compared to stopping the evolution of that agent in ways that allow it to escape control measures. A more integrative sort of understanding is needed for this latter task.

A further reason for alliance between microbiological and transmission system science is to study the emergence of new infections. New agents emerge to utilize transmission system elements that we create by using technological advances that bring people together or by pursuing political paths that impose difficulties on people with the potential to form core transmission units. Transmission system science needs to define what conformations of transmission systems will promote emergence of new agents and what can be done to inhibit such emergence.

 

Learning to analyze transmission systems

The previous section sought to motivate a desire to learn how to analyze transmission systems. I will teach in 606 some of the rudiments of such analysis. This course introduces transmission system analysis in a relatively non-mathematical fashion. First, we discuss some of the issues in infectious disease control that can be better addressed using transmission system analysis.

Next, we learn to use a computer program that allows us to point at a screen and click to construct and analyze transmission systems. The simple approach covered in this text is called deterministic compartmental modeling. This style of transmission system modeling cannot address many realistic aspects of transmission systems. Other methods of transmission system analysis are often needed. But the simple approach we take here helps us appreciate the nature of transmission system phenomena and how these determine infection levels. It motivates the use of models to advance the science of infectious disease control. It helps us recognize the situations where standard epidemiological methods are inappropriate or deceptive. It provides a basis on which to begin understanding the mathematics of infectious disease spread through populations. And it provides a framework for thinking about administrative and policy issues affecting the control of infectious diseases.
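For readers who want to see what a deterministic compartmental model amounts to outside a point-and-click program, here is a minimal sketch in Python. It is an assumption of this illustration, not the software used in the course: the function name run_sir, the parameter names beta (effective contact rate) and gamma (recovery rate), and all numerical values are hypothetical. The population is divided into susceptible, infectious, and recovered fractions, and fixed rules move continuous amounts of population between compartments at each small time step.

```python
def run_sir(beta, gamma, s0=0.999, i0=0.001, days=300, dt=0.05):
    """Return a list of (S, I, R) fractions from Euler steps of a basic SIR model."""
    s, i, r = s0, i0, 1.0 - s0 - i0
    history = [(s, i, r)]
    for _ in range(int(days / dt)):
        infection_flow = beta * s * i * dt   # flow from S to I
        recovery_flow = gamma * i * dt       # flow from I to R
        s -= infection_flow
        i += infection_flow - recovery_flow
        r += recovery_flow
        history.append((s, i, r))
    return history


# Hypothetical settings on either side of the epidemic threshold.
above = run_sir(beta=0.4, gamma=0.2)   # beta / gamma = 2
below = run_sir(beta=0.1, gamma=0.2)   # beta / gamma = 0.5

peak_above = max(i for _, i, _ in above)
peak_below = max(i for _, i, _ in below)
print("peak infectious fraction, above threshold:", round(peak_above, 3))
print("peak infectious fraction, below threshold:", round(peak_below, 3))
```

Under these assumed values the first run produces an epidemic and the second never rises above its starting level of infection. The same starting conditions give deterministic models the same result every time, which is part of what makes them a simple place to begin.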

In getting you started on the process of learning transmission system analysis, I have opted for a path that may be too abstract for you to always see how the concepts and skills being learned might be applicable to practical infection control issues. I ask your tolerance and I ask you to keep faith that the goal is there to be found. Teaching transmission system analysis using compartmental system models in 10 easy lessons has never been attempted before anywhere that I am aware of. Please send me e-mail messages with teaching suggestions so that those who follow you may be better off.

Transmission systems compared to non-infectious disease causal systems

Transmission systems have important differences from the causal systems generating non-infectious diseases. The first and major difference is that each causal action in the system generates new infectious agent that increases the risks associated with other causes. This positive feedback loop at a population level makes risk factors behave very differently for infectious as compared to non-infectious disease. Second, infected individuals acquire immunity. This negative feedback loop occurs both at an individual and a population level. Just like transmission, it causes a particular population pattern of infectious disease risk factors to generate very different population patterns of infections than similar population patterns of non-infectious disease risk factors would generate. Controlling infectious disease transmission requires working within or upon these and related feedback loops.

The methods that epidemiologists use to discover non-infectious disease causes and measure their effects assume that no feedback loops like those related to transmission and immunity exist at the population level. The next section deals with these assumptions.

 

The Assumptions of Standard Epidemiological Methods

Common epidemiological methods, such as "effect estimation" or "attributable risk calculation" or "control for confounding procedures" make hidden assumptions about how causal actions are organized. These assumptions, and the robustness of epidemiological methods to inconsistencies with them, are rarely examined or questioned by epidemiologists. Until the most recent version of Modern Epidemiology by Rothman and Greenland (4), the standard epidemiology texts did not recognize the fact that several of these assumptions were inconsistent with the nature of infectious diseases.

Standard epidemiological methods, as currently taught in almost all master's level programs and most doctoral programs, make the following assumptions:

  1. Because the web of disease causation ultimately affects individuals, all causes ultimately act by affecting individuals.
  2. Risk that arises from the interaction of individuals can be treated the same way as risk arising from an individual's genes or their environment. That is to say that the arrangement of interactions between individuals can be ignored in epidemiological analysis. Observing the experiences of individuals is sufficient to assess causal effects mediated by interactions between individuals.

Assumption one is broader than assumption two. It takes a level of explanation to distinguish these two assumptions which I do not think is necessary for this course. For our purposes we can concentrate mainly on number 2. Our models will in fact be consistent with assumption 1. We will only examine causes acting upon individuals whose action is mediated or modified by patterns of interactions between individuals. Since our models will deal with causes whose action is mediated by patterns of interaction between individuals, they will be inconsistent with assumption 2. That means that our models will deal with phenomena that are inconsistent with the assumptions of analytic methods you have learned in biostat 553, 523, 560 and 510.

 

The Utility of The Assumptions of Standard Epidemiological Analysis

The models discussed in the first four chapters of the Epid 802 text are consistent with both these assumptions. Most situations where epidemiologists use their methods, however, are inconsistent with them. But so what? Reality must be distorted to make the simplifications that advance understanding and lead to progress. Simplification helps to integrate knowledge into our thinking and actions in ways that can lead to the prevention of infection. For example, the simplification of analytic models by making assumption 2 has served epidemiology well in its quest to find controllable causes of disease. This assumption enables the analysis of 2 by 2 tables, continuous variable relationships, multivariable data sets, and multilevel data sets. All of these analytic methods fall apart if we relax assumption 2. If epidemiologists had not been willing to view infinitely complex reality from the simplified point of view of a 2 by 2 table of exposure versus disease, most likely many important causes of disease that were discovered and controlled would have been left undiscovered.

The two assumptions listed above enable epidemiologists to avoid the complexities of systems analyses. This can certainly be a virtue in that simplicity leads to action. These assumptions make causal systems so simple that analysis of the causal systems based on them is trivial. They make causal systems linear. That means they make effects at the population level simple sums of effects at the individual level. This enables epidemiologists to avoid much of the complex systems analyses that characterize the practice of physics, chemistry, geology, ecology, neurophysiology, and many other sciences. But it also makes it harder for epidemiologists to appreciate the value of the systems analysis methods developed by these sciences and by operations engineers. Moreover, making assumptions that we clearly recognize as being wrong can limit our progress.
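A small calculation can illustrate what it means for a transmission system not to be linear in this sense. The sketch below is in Python and rests on assumptions that are not part of this text: it uses the standard final size relation for a simple homogeneous epidemic model, z = 1 - exp(-R0 z), where z is the fraction of the population ever infected, and the function name final_size and the R0 values are hypothetical choices. If population effects were simple sums of individual effects, cutting everyone's transmission by a given fraction would cut total infection by about that fraction.

```python
import math


def final_size(r0, tol=1e-12):
    """Fraction ever infected in a simple homogeneous epidemic model.

    Solves z = 1 - exp(-r0 * z) by bisection; returns 0 at or below threshold.
    """
    if r0 <= 1.0:
        return 0.0
    lo, hi = 1e-6, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        # f(z) = z - 1 + exp(-r0 * z) is negative below the root, positive above it.
        if mid - 1.0 + math.exp(-r0 * mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


baseline = final_size(2.0)
for reduction in (0.25, 0.50):
    reduced = final_size(2.0 * (1.0 - reduction))
    drop = 1.0 - reduced / baseline
    print(f"{reduction:.0%} less transmission -> {drop:.0%} less infection")
```

Under these assumptions, a 25 percent reduction in transmission lowers the final size from roughly 0.80 to roughly 0.58, while a 50 percent reduction eliminates the epidemic altogether. The population effect is not proportional to the individual effect.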

 

The Limitations of These Assumptions

While these assumptions have facilitated our quest to find and control some causes of disease, they put limitations on our capabilities to find and control other causes.

 

Obscuring truly population level causes

For example, these assumptions cause us to ignore causes intrinsic to social, economic, political, or administrative organization. They obscure effects of interactions between individuals. Patterns of interaction between individuals that are several generations of contact distant from subjects in an epidemiological study may have only a small effect on the individual in a study. But the product of all these small interactions can be the dominant influence on what level of infection a population may experience. Thus, even when they do not dominate individual risk assessment, contact patterns may offer the key to controlling the transmission system.

 

Distorting the measurement of causal effects

In some cases, patterns of interaction might be very directly important to individual risk assessment. It has been found, for example, that characteristics of the sexual partners of one's major sexual partner are stronger determinants of STD risk than what risk behaviors one practices with their major sexual partner. The characteristics of the partners of one's major sexual partner can be entered as risk factors in a standard epidemiological analysis. But the very nature of these risk factors violates assumptions of standard analytical methods.

Two articles worthy of your study (4,5) demonstrate that standard epidemiological methods making assumption 2 can lead to erroneous conclusions with serious consequences. In this course we will develop some of the models used in these articles. While we will study models that demonstrate the problems with the assumptions of standard methods, we will not present alternative data analysis methods with greater validity. In this course we will not discuss how to use transmission models for analyzing data to estimate risk factor effects. These methods go beyond the scope of Epidemiology 606 or even 802.

 

Obscuring individual effects counteracted by immunity

Immunity often causes standard epidemiological methods to fail in the detection of risk factors that increase exposure to infectious agents. Unless immunity can be measured, it creates a negative association at the individual level while the exposure in susceptible individuals creates a positive association. The same individuals who have high exposure also have high immunity. The negative association balances out the positive association and obscures causes that act very strongly at the individual level. These balancing forces affect both acute and chronic infectious diseases. They make it especially difficult to detect the role of exposures transmitting infections that contribute to the burden of chronic diseases like heart disease, arthritis, and cancer. This has undoubtedly led to an underestimation of the contributions that infections make to the processes generating chronic diseases.
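The balancing of these two associations can be made concrete with a little arithmetic, sketched here in Python under deliberately simple assumptions that are not taken from this text. Suppose each exposure group faces a fixed force of infection on its susceptible members, immunity is lifelong, and immune individuals are replaced by new susceptibles at a turnover rate mu. At equilibrium the susceptible fraction in a group is then mu / (mu + force of infection). Holding the force of infection fixed ignores the transmission feedback discussed elsewhere in this text; the names and numbers are hypothetical.

```python
def crude_incidence(force_of_infection, turnover_rate):
    """New infections per person per year in a group at equilibrium.

    Assumes lifelong immunity and replacement of individuals at turnover_rate.
    """
    susceptible_fraction = turnover_rate / (turnover_rate + force_of_infection)
    return force_of_infection * susceptible_fraction


mu = 1.0 / 50.0            # assumed turnover: one replacement per 50 person-years
low_exposure = 0.5         # assumed infections per susceptible per year
high_exposure = 2.0        # four times the risk among susceptibles

ratio_in_susceptibles = high_exposure / low_exposure
crude_ratio = (crude_incidence(high_exposure, mu)
               / crude_incidence(low_exposure, mu))

print("rate ratio among susceptibles:", ratio_in_susceptibles)
print("rate ratio with immunity unmeasured:", round(crude_ratio, 3))
```

With these assumed values, an exposure that quadruples risk among susceptible individuals yields a crude rate ratio of about 1.03, because the highly exposed group has very few susceptibles left to infect.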

The problem of unmeasured immunity wiping out associations with risk factors can be avoided by analyzing the temporal pattern of disease within a pattern of connection between individuals using a model that incorporates immunity. In other words, using a transmission system model provides a way around this problem. Measuring immunity precisely in a prospective manner provides another way around this problem. But that does not eliminate the need for transmission system analysis. Just to discover what role different immune responses play in the infection process, we need to analyze observational data using methods that do not make assumption 2 listed above. This assumption can be particularly distorting when it comes to measuring the contribution that different immune response mechanisms make to prevention and control of infection in the individual.

New Population Causes Emerge from Interactions Between Individuals

In models where individuals interact, it is possible for phenomena to emerge at the population level that cannot be observed by just recording individual level events. For example, the pattern in which individuals get connected to each other is a population phenomenon. Consider two different patterns where each individual has two contacts. These are illustrated in figure 1. In the first, the two contacts arrange individuals into triads so that each individual is connected to only two others. In the second, each individual is connected to two individuals who are connected to a different individual in a pattern that provides a path of connection between all pairs of individuals in the population. Just counting people's connections or exposures in this case is insufficient to describe the population pattern of connection. The individuals are part of a population system. The behavior of that system depends upon the pattern of connection between individuals.

Figure 1

Two populations of twelve people where each individual has two contacts
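The contrast between the two patterns can be checked with a short sketch in Python. It assumes that the second population is arranged as a single ring, which is one arrangement consistent with the description above; the function names triads, ring, and reachable are hypothetical. In both populations every person has exactly two contacts, yet the number of people an infection starting in one person could ever reach is very different.

```python
from collections import deque


def triads(n):
    """Closed groups of three: each person is linked to the other two members."""
    contacts = {}
    for start in range(0, n, 3):
        group = [start, start + 1, start + 2]
        for person in group:
            contacts[person] = [other for other in group if other != person]
    return contacts


def ring(n):
    """A single ring: each person is linked to the two neighbors on either side."""
    return {person: [(person - 1) % n, (person + 1) % n] for person in range(n)}


def reachable(contacts, source):
    """Everyone connected to source by some chain of contacts (breadth-first search)."""
    seen = {source}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for other in contacts[person]:
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen


for name, contacts in (("triads", triads(12)), ("ring", ring(12))):
    degrees = {len(partners) for partners in contacts.values()}
    print(name, "contacts per person:", degrees,
          "reachable from person 0:", len(reachable(contacts, 0)))
```

An individual level record of "number of contacts" is identical in the two populations. Only the arrangement of those contacts distinguishes a population where infection is confined to three people from one where it can reach all twelve.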

Using the continuous compartmental models of deterministic transmission systems that we study in this course, we are limited in the models of contact patterns that we can construct. We can construct models that will give us important insights. But we cannot construct models with distinct patterns of connection between individuals as seen in figure 1. That is because continuous models do not have any individuals in them. They only have continuous units of population. Discrete individual models can generate contact patterns like those seen in figure 1. The models we use provide a better focus on relaxing assumption 2 than assumption 1.

To address important causes of infectious diseases that are subject to our control we must not accept assumption 2. We must analyze the systems that affect the circulation of infectious agents in populations. And we must determine what actions will lead to the most effective control of these transmission systems. To those ends, we now elaborate on what we mean by a transmission system.

Systems in Infectious Disease Epidemiology

What distinguishes a system from a "heap" is that the arrangement of the elements makes a difference for the output. A heap of sand is not a system. That fact is confirmed by observing that switching the places of the grains of sand does not change anything. The parts of a radio, on the other hand, must be precisely connected to each other in a specific arrangement for the radio to convert radio waves into sound. A heap of radio parts just won't do the job.

In most epidemiological data analyses, we have a heap of data from cases. The analyst usually assumes that the individuals on whom data has been gathered are not connected together in a system. This assumption is made evident by observing that the results of analysis using standard epidemiological methods are the same no matter how the individuals on whom data is collected are arranged in the analysis file.

For data relevant to infectious diseases, the reality is that individuals are connected into a transmission system. Even if individuals are separated by several generations of contact, the relative position of individuals in the transmission system can create strong dependence between them in the outcomes of their exposures. A risk behavior will only transmit infection if the chain of transmission has brought forward an infectious agent that the risk behavior can transmit. When infection is transmitted in one part of a population, it often has consequences for transmission in other parts of the population that might be connected by rather diffuse chains. Subtle arrangements of contacts that amplify transmission can make the level of agent in one segment of a population quite dependent upon levels of agent in quite distant segments of the population.

The dependence between individuals generated by transmission systems influences the results of epidemiological data analysis. We have said, however, that this course will not address these data analytic issues. Instead of data analysis, we will learn systems analysis. Hopefully, some time in the future, you might learn how to link data analysis with systems analysis. In this course, however, we will focus on theoretical systems rather than observational data. We will focus on particular aspects of transmission system behavior. These include

  1. whether infection is sustained or dies out in different populations,
  2. whether epidemics occur,
  3. how fast epidemics rise,
  4. how long epidemics last,
  5. how infection is distributed in different risk groups,
  6. how infection levels respond to control measures like treatment, vaccination, and sanitation,
  7. how diversity of infectious agents is stimulated or maintained, and
  8. how control measures affect the diversity of the infectious agents and their ability to escape control measures by changing their susceptibility to them.

Our analyses will teach us what to expect from a system that has a specified conformation. They will show how system behavior changes as we change parameter values in the system. They will show us how system behavior changes as the conformation of the system changes. This type of systems analysis will help us define where we should look for controllable causes that spread infection through a population. It will help us define the conditions where different control actions will or will not lower the level of infection in a population. It will help focus us on the arrangement of contacts as a causal factor whose control deserves consideration. That is to say, it will provide a new focus on social factors as causes of infection. It will help us make better predictions of control program effects in conditions where standard epidemiological methods cause serious errors.

Conceptualizing Transmission Systems

Before analyzing transmission systems, it will help to specify what a transmission system is. We should specify the entities and events that compose the system and those that are outside influences affecting that system.

System Outputs

Transmission system analysis can be used to maximize or minimize the output of the transmission system, depending upon whether the objective is efficient germ warfare or efficient public health action. Accordingly, population patterns of infection are the outcomes of interest that the transmission system is generating. We might be interested in any aspect of infection patterns. We could analyze how the system affects the shape of epidemic curves, the variability of endemic infection levels, the age pattern of infection, or the average number of transmissions from each infected individual. We could examine more compound outputs like how the age pattern changes over time, or how the average number of transmissions from infected individuals relates to the endemic infection level. We could examine outputs relating to the agent population such as how variants of the agent are distributed among different parts of the population or how frequently variants of the agent with characteristics like antibiotic resistance arise. The system outputs we examine in a transmission system analysis will depend upon the specific issues we want our transmission system analysis to resolve.

One issue addressed in the first exercise is what determines the prevalence of infection in a population. Such a simple output can be examined with simple models. As we pursue an understanding of more detailed patterns of infection, we will need more complex models. To keep things simple, we will ignore many important determinants of infection. For example, in this chapter we will completely ignore temporal patterns of infection by age. We greatly simplify our models by disregarding age. But to pursue many questions about the impact of control programs, age must be an important part of the transmission system models that we analyze. Therefore, we provide some indication of how our approach can be elaborated to include age.

Because of the simplifications we make in our models, our analysis of transmission systems will always be tentative. It will always be possible that inclusion of some ignored complexity could change the conclusions we have drawn from our analysis. The possibility will always exist that further elaboration of some aspect of the system model which we have simplified could provide a new key to infection control. That is the way science works. Any understanding reached through scientific research can be deepened and elaborated by refining details. It is quite possible at any time that refinements will generate new understanding that can change prior conclusions.

An example of how model elaboration can change the conclusion of a systems analysis is found in the work of a group of us at Michigan on HIV transmission systems. Most analyses of HIV transmission systems conducted early in the epidemic did not differentiate the contagiousness of infection by stage of infection. The models used early in the epidemic, however, were inconsistent with the temporal pattern of infections that were being observed. Models without time dependent contagiousness of infection rose too slowly or they rose too far to be consistent with the patterns of infection that were being observed in the U.S. gay male community. Both the public health and gay male communities wanted to believe that the explanation for why epidemic rates stopped rising was that behavior change in the gay male community caused the falloff from a continued epidemic rise.

We found, however, that the observed level of behavior change could not affect transmission nearly as much in our models as was being observed in the real world. Therefore, we sought to see whether the dropoff could be explained by variation in the contagiousness of infection by stage of infection. A preliminary analysis of a system with different stages of infection having different transmission probabilities led us to the conclusion that there might be huge differences in transmission probabilities between early and middle stages of HIV infection (18). But our models did not have behavior change as one progressed through different life stages or more transient behavior change as one's social or psychological life changed. They only had behavior change associated with increasing awareness of the epidemic. When we added these very likely aspects of reality to our models along with more realistic contact patterns by age groups and risk groups, we found that these elements could very dramatically augment the effects of higher contagiousness during the early stages of infection. This meant that very much lower differences in transmission probabilities between early and middle stage infection could explain observed infection patterns (30). In this case, the simple model gave a distorted view of what the most likely natural history of contagiousness was.

Given that complicated models always have the potential to change the conclusions reached after analysis of simple models, let us justify our choice of simple outcomes like general population incidence and prevalence. We commented earlier on the virtues of simplicity. An understanding of natural system behavior and organization can only be gained by conceptualizing the system as something simpler than it really is. Natural biological and social systems are infinitely complex. The patterns of human interaction are so rich that their details are practically innumerable. Likewise, the way infectious agents interact with humans and their environment can lead to endless complexity. Simplification is essential if we are to find some leverage to generate new knowledge.

Occam's razor is the principle that the simplest model that can explain the data will be the most informative and useful. But simplification should not be taken as a goal in itself. Understanding that helps in the control of disease is a worthier goal of transmission system analysis. But to attain that goal, it may be necessary to understand complexity and not just simplifications. Recent theories of the emergence of life and biology provide strong arguments that such emergence is a natural characteristic of complexity (7). If we eliminate complexity from our models, we might eliminate the essence of biology and epidemiological causation from our models. A current saying is that in biology, Occam's razor provides a means to slit your throat.

Biological science has clearly recognized the value of working out the details of complex reality. Detailed molecular and structural analyses are the cornerstone of biology's search for controllable causes of disease. Magnificent molecular and physical tools have been developed for the task. Molecular biology has built a magnificent scaffolding that provides new ways to discover intricate details and then gives those details useful meaning.

Epidemiological science is behind this growth curve for biological science. Epidemiology is still elaborating simple relationships rather than detailing the complexity through which nature works. Epidemiology has only very recently begun to construct a framework that defines which details of population infection patterns reflect key aspects of the complex systems that transmit infections through populations. That framework consists of transmission system models that incorporate biological and sociological theories. It is my view that the essential scaffolding of our science must be constructed by considering the population and ecological systems in which disease causes act. We have been too ready to accept the use of statistical models that are inconsistent with what we know about those systems. We have been too willing to let quantitative theories about sampling effects guide our choice of models and too reticent to develop general causal theories.

But the construction of useful theories of transmission systems can begin with very simple models as long as we keep in mind the limitations of those models and use them only for the ends they are capable of serving. When using simple models of transmission systems, we should always view them as stepping-stones to models that are more useful. We should always hold our conclusions from their analysis as being conditioned by the simplifying assumptions of our models. But if we do not begin by constructing and analyzing the very simple models, we will not acquire the insights we need to proceed to analyses that are more powerful.

The first step in constructing a scaffold for transmission system analysis is to define the elements of the system models we will use for the task. Pursuing the simplest outcome, namely infection prevalence in the overall population, will help define a system for analysis that has the simplest and most understandable elements. We will now do that.

Transmission System Elements

Some basic elements of interest in a transmission system are:

  1. the human populations that get infected and that develop immunity to infection
  2. the processes which generate direct or indirect contacts between humans that can transmit infection
  3. the factors that affect transmission risks given a contact
  4. the growth dynamics of infectious agents in different environments
  5. the evolved diversity of agents spreading through the population and generating cross reactive immune responses to all related agents
  6. the natural history of infection and immune response in infected individuals
  7. the factors that affect that natural history
  8. the environmental landscape within which populations and agents grow and interact

This list could be made shorter or longer depending on our purposes. Only elements 1, 2, 6, and 8 are needed for the simple transmission systems we will examine in the first exercise. In that exercise we add a minimum of complexity to elements 1 and 6 and we make elements 2 and 8 so simple that they might be disregarded.

The only process for 2 that we deal with is a random process. We do that to simplify our first conceptualizations of transmission systems. We make this simplification recognizing that later we will need to elaborate a more realistic contact process. We recognize that simplifying contact to a random process can greatly diminish the utility of transmission system models for epidemiology. That conclusion derives from the observation that some of the most controllable causes of infection spread can be found by discovering the non-random determinants of contact patterns. The key to infection control might involve defining the processes that generate sustainable chains of transmission in small segments of larger populations or through connections between diverse population segments. Thus, we want to make contact processes central to our thinking despite the fact that we proceed by first developing models that trivialize this process.

Likewise, element 8 will be presented only in an oversimplified manner. All the models we present will have a homogeneous underlying structure. The continuous compartmental models of transmission systems that we will formulate make it seem that environmental structure is not even an element of our model systems. In the real world, however, the geographic and social space affecting the contacts through which infection spreads is never homogeneous. We do not deal with heterogeneous space because to do that, models of discrete individuals rather than compartmental models are most useful. Discrete individual models of transmission systems are also needed to help design epidemiological studies. Discrete individual models take longer to learn and are more complex to simulate and interpret than are compartmental models. Because our first task is to develop a framework on which general principles of transmission systems and not unique particularities can be understood, we stick with computer simulations of continuous compartmental models of populations in this book. That means that to be practical, we stick to homogeneous space.

We include elements 3 and 7 in our list to better relate to the traditional goal of epidemiology to discover controllable risk factors leading to disease. In the second exercise, we will add these elements to our models. We do not want to give the impression that risk factors are an ultimate focus for disease control. Indeed, we believe that an extremely important benefit of transmission system analysis can be to expand the epidemiologist's view of causal factors beyond those that can be represented by risk factors. But many presentations of transmission models have not included the traditional sort of risk factors that are the focus of so much epidemiological investigation. This might make it harder for some epidemiologists to see the relevance of transmission system analysis.

Element 5 is rarely included as an element of transmission system models. It is my hunch that agent diversity will wind up being a key element of many transmission systems. For infections like HIV, our molecular powers to describe agent diversity and analyze which isolates are more related to each other may give us a whole new basis to use molecular epidemiology information to describe transmission systems. For infections like malaria, gonorrhea, Pneumococcus, E. coli UTI, and a whole host of others, immune cross reactions between related but different organisms may define transmission systems which behave very differently than transmission systems with a single agent.

All of the listed elements could be elaborated to an infinite degree of complexity. Populations, for example, might be broken into enough categories so that each individual in any real population would fall into a different category. They might be specified by age, by geographic location, by social class, etc. The contact processes might include a great variety of conditions that affect whether two individuals will come into contact or not. They might include a chance element or they might not.

The choice of continuous populations rather than discrete individuals

We don't want our support of the choice of a continuous compartmental modeling framework for this course to indicate that we think compartmental models are always the first transmission models that epidemiologists should learn. In fact, we believe that discrete individual models are sometimes better at engaging epidemiologists in systems thinking. This is especially the case for epidemiologists whose primary focus is on data analysis rather than on understanding the general principles of causal theory. Discrete individual models can generate the kind of data which epidemiologists are used to working with. The compartmental models we present in this course cannot. But continuous compartmental models more readily demonstrate the system phenomena of interest.

It is often desirable to treat individuals rather than populations as the elements of transmission systems. We have already mentioned this with regard to designing epidemiological studies and exploring the significance of geographic and social space. The software used in this course does not make it easy to use individuals as model elements. But treating populations as having continuous values in which individuals cannot be treated separately is a reasonable approach when our purpose does not include designing an epidemiological study. We will divide populations into continuous compartments in the model systems we will analyze. Between any two fractional values of populations, there may always be an additional fractional value. This continuous nature of the populations we will model means that we will ignore events that can be unique to individuals. For example, we will ignore the role of chance in individual events. Thus, the types of models of transmission systems that we will analyze are deterministic (no role of chance), continuous (the population can always be further subdivided), compartmental models. Compartmental models are ones where the basic unit that flows in the model stays the same. We will be modeling the flows of units of population from one state to another. The fact that the same unit of measurement is used in any compartment in our models makes them compartmental models.

The Non-linear Nature of Transmission System Dynamics

A key to understanding how analysis of infectious disease causes must differ from the analysis of non-infectious disease causes lies in understanding the difference between linear and non-linear systems. We are not talking here about linear or non-linear relationships between exposures and outcomes. That issue is relevant to the analysis of a static set of data. Dynamically linear systems can generate either linear or non-linear relationships between variables. Non-linear dynamics, however, imply the potential for a whole set of phenomena that cannot arise in linear systems. These include chaos, the generation of fractal patterns, and extreme sensitivity of system behavior to initial conditions. We, however, will only examine non-linear transmission systems with model forms and parameter spaces that do not generate chaotic behavior.

To make a simplified definition of linear vs. non-linear for our purposes, we will use the Stella II™ modeling framework. For Epid 606 students, the boxes in this framework will represent segments of the population. The arrows with width between boxes will represent the flow of population from one segment to another. The valves on these arrows are called flow regulators. We connect the determinants of a flow to these valves.

In the simplest case of linear dynamics, we have only two boxes representing well and diseased segments of the population. There is a flow only from the well to the diseased. Once one is defined as being diseased, we assume there is no way they can get back into the well segment of the population. We make the rate of disease development constant. The Stella II™ figure for this simple linear disease development process is as follows:

This system is excessively simple. There are no feedbacks. No population enters or leaves the system. And there is only one flow. Note that the flow from the well to the diseased depends only upon the number of well that are available to develop disease and the rate at which they develop disease. The flow per unit time that we put in the flow regulator for the NewCaseFlow is just the quantity of Well times the DiseaseRate. The fact that all flows in our model system diagram depend directly upon the quantity of population in the compartment (Stella stock) out of which there is a flow means that our system is linear.

The differential equations for our system are

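$$\frac{d\,\mathrm{Well}}{dt} = -\,\mathrm{Well} \times \mathrm{DiseaseRate}, \qquad \frac{d\,\mathrm{Diseased}}{dt} = \mathrm{Well} \times \mathrm{DiseaseRate} \qquad [1]$$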

Another way to confirm dynamical linearity is to observe that, in every term of the differential equations, the exponents of the compartment variables (Stella II™ stocks representing population segments in our models) sum to 1.

Suppose we start out with 100,000 Well and that the disease rate is 0.01 per well person per day. Then over time we will observe the following numbers of individuals in the Well and Diseased compartments:
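The numbers for this example can be approximated outside Stella with a short script. The following is a minimal Python sketch, not the course's Stella model: the function name, the time step of 0.01 days, and the simple Euler update are assumptions made for illustration. It steps the NewCaseFlow (Well times DiseaseRate) forward in small time steps and prints the result next to the exact negative exponential solution Well(t) = Well(0) * exp(-DiseaseRate * t).

```python
import math


def linear_disease_model(well0=100_000.0, disease_rate=0.01, days=500, dt=0.01):
    """Euler integration of the two-compartment linear model.

    dWell/dt     = -Well * DiseaseRate
    dDiseased/dt = +Well * DiseaseRate

    Returns a list of (day, well, diseased) tuples, one per whole day.
    The step size dt is an illustrative assumption, not a course setting.
    """
    well, diseased = well0, 0.0
    steps_per_day = round(1 / dt)
    history = [(0, well, diseased)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_case_flow = well * disease_rate  # flow per day out of Well
            well -= new_case_flow * dt
            diseased += new_case_flow * dt
        history.append((day, well, diseased))
    return history


history = linear_disease_model()
print(f"{'day':>5} {'Well':>12} {'Diseased':>12} {'Well (exact)':>14}")
for day in (0, 50, 100, 200, 300, 500):
    _, well, diseased = history[day]
    exact_well = 100_000.0 * math.exp(-0.01 * day)
    print(f"{day:>5} {well:>12.1f} {diseased:>12.1f} {exact_well:>14.1f}")
```

Under these assumptions the Well compartment declines as a negative exponential: after 100 days about 36.8 percent of the original 100,000 remain well, because exp(-1) is approximately 0.368.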

Now consider a model of an infection transmission system. Here we will classify the segments of our population as infected and uninfected. The more infected individuals there are around, the greater the number of opportunities for the uninfected individuals to get infected. To make this model as simple as possible, we assume that each individual in the population contacts each other individual in the population at the same rate. That means that each infected individual has the same chance each day of infecting each of the uninfected individuals. This assumption is the same as the assumption of random mixing. The Stella diagram for this model is as follows:

The flow from the uninfected to the infected in this model proceeds at the rate of {Uninfected * EffectiveContactRate * Infected}. (I pointed out earlier that the single "EffectiveContactRate" parameter needs to be divided up to advance science. But for our purposes here, we stick to a single parameter.) Each infected individual contacts each uninfected individual at the effective contact rate. Since the contacts counted in the effective contact rate are defined as contacts that are sufficient to transmit infection, each infected individual infects, per unit time, the number of susceptible individuals times the effective contact rate. You may worry about two different infected individuals trying to infect the same individual. But remember that there are really no individuals in this model. We avoid the problem of double infection by taking such very short time steps that during any time step only very tiny segments of population are newly infected and the rate of duplicate infection is so low that it makes no difference to the calculations.

In our population of 100,000 where 99,999 of the population units are uninfected and one is infected, the following pattern of infections occurs given that each infected has a one in a million chance of contacting each uninfected individual each day.

At first there are very few infections. Since there are nearly 100,000 susceptible units and the infectious unit contacts these susceptible units at the rate of 1/1,000,000, one unit becomes infected every 10 days. But that unit generates new units and those units in turn generate new units, and keep on generating new units since there is neither death nor recovery from infection in this model. Finally, nearly every part of every unit has become infected. The shape of infection development is quite different from the shape of non-infectious disease development. It is sigmoid rather than negative exponential.
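The sigmoid pattern described here can be reproduced with a few lines of code. The following is a minimal Python sketch under stated assumptions, not the Stella model used in the course: the function names, the Euler update, and the time step of 0.01 days are illustrative choices. Because Uninfected plus Infected stays constant in this model, the simulation can be checked against the standard closed-form logistic curve, which is included as a second function.

```python
import math


def simulate_infection(n_total=100_000.0, infected0=1.0,
                       effective_contact_rate=1e-6, days=300, dt=0.01):
    """Euler integration of the flow Uninfected * EffectiveContactRate * Infected.

    Returns a list of (day, uninfected, infected) tuples, one per whole day.
    The short time step dt is an illustrative assumption.
    """
    uninfected = n_total - infected0
    infected = infected0
    steps_per_day = round(1 / dt)
    history = [(0, uninfected, infected)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            flow = uninfected * effective_contact_rate * infected  # per day
            uninfected -= flow * dt
            infected += flow * dt
        history.append((day, uninfected, infected))
    return history


def logistic_infected(t, n_total=100_000.0, infected0=1.0,
                      effective_contact_rate=1e-6):
    """Closed-form logistic solution for Infected at time t (in days)."""
    growth = effective_contact_rate * n_total
    ratio = (n_total - infected0) / infected0
    return n_total / (1.0 + ratio * math.exp(-growth * t))


# Initial flow: 99,999 * (1/1,000,000) * 1, i.e. about one unit every 10 days.
initial_flow = 99_999 * 1e-6 * 1
print(f"initial flow per day: {initial_flow:.6f}")

history = simulate_infection()
print(f"{'day':>5} {'Infected (Euler)':>18} {'Infected (logistic)':>20}")
for day in (0, 30, 60, 90, 115, 140, 170, 200, 300):
    print(f"{day:>5} {history[day][2]:>18.1f} {logistic_infected(day):>20.1f}")
```

With these parameter values, infection stays very low for many weeks, passes half of the population at roughly day 115 (the time at which 99,999 * exp(-0.1 * t) equals 1), and then saturates as the uninfected compartment is exhausted.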

The differential equations describing this system are as follows:

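$$\frac{d\,\mathrm{Uninfected}}{dt} = -\,\mathrm{Uninfected} \times \mathrm{EffectiveContactRate} \times \mathrm{Infected}, \qquad \frac{d\,\mathrm{Infected}}{dt} = \mathrm{Uninfected} \times \mathrm{EffectiveContactRate} \times \mathrm{Infected} \qquad [2]$$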

Note that in the terms of these equations, "Uninfected" is a compartment and "Infected" is a compartment. Each has an exponent of one and multiplied together, the sum of exponents is two. This meets the mathematical definition of nonlinear dynamics. When this criterion is met, there is no way that the system can be reformulated in an equivalent manner such that what happens to a compartment is dependent only upon the state of that compartment. Thus, non-linear by this criterion means that what happens to some population units is affected by what has happened to others. Note that being linear by this criterion does not mean that compartment values plotted over time generate straight lines.
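One way to see what dynamical linearity means in practice is a superposition check. The following Python sketch is an illustration built on assumptions that are not part of the course exercises: the function names, the Euler time step, and the device of splitting the population into two isolated halves are all hypothetical. In the linear disease model, running two sub-populations separately and adding the results gives the same answer as running them together. In the transmission model it does not, because the units in one half affect what happens to the units in the other half once they are mixed.

```python
def run_linear(well0, disease_rate=0.01, days=100, dt=0.01):
    """Diseased count after `days` in the linear model (Euler steps)."""
    well, diseased = well0, 0.0
    for _ in range(round(days / dt)):
        flow = well * disease_rate
        well -= flow * dt
        diseased += flow * dt
    return diseased


def run_transmission(uninfected0, infected0, effective_contact_rate=1e-6,
                     days=100, dt=0.01):
    """Infected count after `days` in the transmission model (Euler steps)."""
    uninfected, infected = uninfected0, infected0
    for _ in range(round(days / dt)):
        flow = uninfected * effective_contact_rate * infected
        uninfected -= flow * dt
        infected += flow * dt
    return infected


# Linear model: two isolated groups versus one combined group.
linear_separate = run_linear(60_000.0) + run_linear(40_000.0)
linear_together = run_linear(100_000.0)

# Transmission model: two isolated halves versus one randomly mixing whole.
transmission_separate = 2 * run_transmission(49_999.5, 0.5)
transmission_together = run_transmission(99_999.0, 1.0)

print(f"linear, separate sum: {linear_separate:.1f}")
print(f"linear, together:     {linear_together:.1f}")
print(f"transmission, separate sum: {transmission_separate:.1f}")
print(f"transmission, together:     {transmission_together:.1f}")
```

In the linear case the two totals agree apart from rounding. In the transmission case the mixed population has many times more infection at day 100 than the sum of the two isolated halves, even though the total numbers of uninfected and infected units at the start are identical.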

This difference has profound implications for the way that we must analyze infectious disease. Many statisticians have hoped (or assumed without realizing that they are doing so) that linear approximations will do when analyzing infectious diseases. That hope is supported by the fact that the use of linear statistics in many infectious disease investigations has led us to discover controllable causes of infection.

The fact that standard methods have led us to the control of many infectious diseases is indeed an indication that we should not abandon their use. But two things should be recognized about this limited success. First, standard methods have been successful mainly when the causal effects being discovered are very strong. When causes are strong, standard methods can still lead us to our target even when the linear approximation is way off. You can hit a huge target even if your thrust is very inaccurate. Difficulties in discovering new causes or making judgements about causal actions will arise most often when causal actions generate odds ratios smaller than the values of 10 to 100 that are often found for infection transmission risk factors.

Second, we need to recognize that success in finding risk factors doesn't mean we will have success in predicting the effects of control programs that eliminate those risk factors. Even for infection risk factors that are readily discovered using linear models, those same linear models will distort the predicted effect of control programs to eliminate those risk factors.

The infection model we just presented had no immunity. The existence of immunity is another reason that non-linear infection models may be needed to discover controllable risk factors for infection. The reason was presented earlier. When the risk factor status of individuals stays constant, both infection and immunity become associated with the risk factor and these associations can cancel each other out so that there is not much of a difference in infection rates between exposed and unexposed individuals. How transmission system models that incorporate immunity help us around this dilemma is an issue that must be left until later.

Output must be viewed at population level

An important consequence of non-linear dynamics is that an understanding of what determines infection levels cannot be gained by merely investigating the risk experience of individuals. When dynamics are linear, it is possible to understand the causes of disease in this way. When dynamics are linear, the determinants of disease levels in the population can be assessed as sums of determinant effects experienced by individuals. If 100 individuals reduce their risk of heart disease by changing their diet, the total amount of risk in the population is reduced by the amount that these individuals reduced their risks. The actions of these individuals have no implications for the risks of anyone but themselves.

Consider, on the other hand, the case of 100 individuals reducing their infection risks. If any of these 100 individuals played key roles in sustaining the circulation of infection in the population, then it is possible that reducing risk in these 100 might stop transmission in the population and reduce the risk of millions. The same might be true if no one individual plays a key role but the 100 as a unit play a key role. One cannot determine whether there will be any such dramatic effect just by studying the risk experience of these individuals. One must also know how those individuals are connected to each other and connected to the rest of the population. What happens to the population cannot be determined just by studying what happens to each individual in the population. It also depends upon the extent to which the experience of each individual affects the risk of each other individual.

The way to assess these population level effects is through analysis of the transmission systems. One need not know exactly how each individual is connected to every other individual. One can start with a few general principles that ignore much of the detail that might affect the result. We will learn a few simple methods of transmission system analysis in this course.

Potential for Non-intuitive Effects

One of the greatest values of learning transmission system analysis methods and conducting a few simple transmission analyses is to improve one's intuition as to how transmission systems will behave when some force is exerted upon them or when they change. One's intuition can be quite wrong about these things. Our genetically supplied thinking material is set up to handle problems that are quite different from the behavior of complex transmission systems. Transmission systems have a lot of positive feedback loops as current infections generate new infections. They also have negative feedback loops as infections generate immunity. Some segments of a population will have been affected by immunity and be immune to the positive feedback and others won't. Because of this, one cannot say that reducing a risk in a group will always have observable benefits for that group. Sometimes the positive feedback loops will wipe out any benefit of the risk reduction and sometimes the negative feedback loops could be lost in a manner that could be harmful. Likewise reducing a risk in one group is unlikely to always be good for all other groups.

Surprising effects occur as the result of non-linear dynamics even when infectious agents are not adapting to escape the measures that humans use to control them. Further surprises can arise from agent adaptations. Simple transmission system analyses treat the agents as being static and homogeneous. The reality is that almost all populations of agents are highly varied and are undergoing dynamic changes as the result of mutation, transfection, conjugation, and numerous other modes of genetic variation which have evolved over the 4 billion years since the first cellular life forms appeared.

The great diversity of infectious agents may generate phenomena that we cannot examine unless we include high levels of diversity in our models. But important effects can be observed even at low levels of diversity. Models with two agents may be sufficient to improve our insights in some cases. In such models one encounters several surprises that are likely not to have been considered as possibilities without a formal transmission system analysis. Later in the course, if we progress rapidly enough, we will examine a simple two-agent model where antibiotic usage affects the emergence of antibiotic resistance. Transmission system models will help us identify issues in deciding upon antibiotic usage that might not be immediately intuitive.

Linear systems rarely challenge our intuitions to the extent that non-linear systems do. In order to avoid bad policy decisions regarding infectious diseases, it is good to try to train one's intuitions to be as accurate as possible. This also helps in identifying productive research agendas. As one builds simple transmission system models, one should always test one's intuition as to how that transmission system will behave under certain circumstances. Using Stella II™, as we do in this course, one can train one's intuitions by drawing out by hand how one expects a simulation to behave before one hits the run button. When the system behaves differently than one expected, one can then strive to understand what caused the difference. The use of simulations as a training tool in this way may be their greatest value. There are many other benefits as well. We now consider some of them.

The Uses of Transmission System Analysis

Transmission system analysis is useful to both infection control practitioners and scientists who seek to understand how risk factors and contact behaviors generate patterns of infection in a population. It is characteristic of epidemiology to link science and public service. I divide the uses of transmission system analysis into four broad categories: 1) Public Health Policy Development, 2) Prediction of Control Program Effects, 3) Data collection, and 4) Study design. I further subdivide each of these categories.

Public Health Policy Development

Policy Decisions

Public health officials and others seeking to control infectious disease face many policy decisions that should be informed by an understanding of transmission system behavior. For example, they must often decide where to use limited resources to maximize infection control. Because maximal sanitation covering all the population can never be afforded, policies must be made as to what sanitation measures should be taken first. Likewise, optimal surveillance to find and treat all individuals who may be carriers of an infection can never be afforded. Policy decisions are required to decide where infection efforts should be concentrated and what infection detection methods should be used. Even with vaccination, decisions must be made as to what is to be done to reach those who are not reached by routine programs.

Another policy decision that requires a thorough understanding of transmission systems is when to call for drastic actions to prevent an epidemic before it reaches an uncontrollable state. Such decisions are not confined to rare events that are well publicized like the decision to vaccinate the U.S. against Swine Flu or the decision to close bathhouses that might be foci for HIV transmission. I suspect that every day there are many health officers and infection control officials in hospitals that make decisions about whether some extraordinary action should or should not be taken to control the threat of infectious agent transmission.

Public policy decisions on complex issues are more solidly based when they proceed from an understanding of the basic issues affecting a decision than when they proceed solely because of political pressures or solely on the basis of computer predictions. I am talking now about the type of public policy decisions in which you will be involved. I am not talking about decisions made by individuals who are primarily politicians. Infection control officers need to integrate politics, practical understanding, and quantitative methods into their policy decisions. It may be natural to think that the way models contribute to policy decisions is by defining quantitative effects. I argue, however, that the greatest potential contribution of models to policy decisions is to increase the practical understanding of infection control policy makers.

The reason that understanding is more important than quantitative predictions in influencing policy is that the policy maker has to bring along a lot of people to the point where they will understand the policy decision enough to support it. When the only tool the policy maker has to do that is a number pulled out of a black box simulation, getting other people to support the decision will be justifiably difficult. When the policy maker can use arguments that derive from insights gained from a transmission system analysis, success is more likely. I admit that I may be pursuing a pie in the sky here, but I hope to make a valuable point that will at least make the epidemiologist think about the use of models even if it doesn't

Black box decisions can cover up value issues involved in the decision. For example, decisions as to what age groups should be the target of influenza immunization must specify an "objective" function. The objective measured by the function may be days of illness averted, years of life saved, numbers of lives saved, or effects upon the overall production of goods and services by the population. Different objective functions will lead to different decisions (8). Models can be used to highlight value issues that were previously ignored. A good example is the case of the models used to address the issue of age targeting for influenza immunization just discussed. But once the value issues are raised, the models themselves cannot be the basis for the decisions made.

The potential of transmission system models to hide important issues involved in policy decisions does not mean that computer models should not affect those decisions. It means that the way they should affect those decisions is by increasing understanding of the issues involved, highlighting the implications for policy of different value decisions, and facilitating the communication about what is most affecting the policy decision. Transmission system models should not be used in a black box manner to "make policy decisions that are objective and scientific". They should be used to create understanding and facilitate communication about the issues involved in the decision.

Another reason to avoid black box decisions about the control of transmission systems is that they have a good chance of being wrong. Transmission systems are complex and models of transmission systems must ignore much of this complexity. Sometimes the simplifications used in transmission system models can lead to bad decisions. Focusing on the complexities that the transmission systems models ignore might be as important to the decision process as obtaining insight from transmission system model analyses.

The fact that decisions indicated by a model analysis have a good chance of being wrong does not, of course, mean that analyzing the transmission system will lead to worse decisions than not analyzing the transmission system. It just means that the model analysis must be a complete one with open and creative thinking. Transmission system analysis should not just consist of following rigid methods for the analysis of a transmission system model. Exploring complexities of the real world that the model does not consider should be a creative process utilizing the broad experience of diverse individuals.

Using models to facilitate communication and understanding so that decisions can be taken in an integrative manner is not part of traditional training in epidemiology. In fact, the analytical traditions of our profession support a formalistic process of decision making rather than an integrative one. For example, some of our traditions support the use of significance tests in decision making. Since we are advocating a change from these traditions, let us comment upon them to make the context of the change clearer.

Epidemiology, along with much of the rest of medicine, has adopted frequentist traditions in statistical analysis. Frequentist models for decisions are based on sampling theory. In the frequentist tradition of significance testing, one sample is compared to another and a decision is made as to whether the populations from which the samples are drawn differ in specified ways. Other than specifying the action of confounding variables, the causal structure generating the data is ignored in making the decision on how to respond to the data. This tradition has its utility. Significance tests are of great utility for technical decisions, like whether drug A or drug B is more effective in a defined set of circumstances. Policy and science decisions, on the other hand, require an inference process that is not well served by frequentist logic. The reasons are somewhat different for science and policy decisions. We will deal with science decisions in a later section.

Policy decisions don't fit well into frequentist decision processes because policy decisions require that many factors be balanced. The statistics designed for up or down decision criteria do not help assess that balance. Technical decisions, on the other hand, often only involve simple choices so that quantitative decision methods might be more appropriate.

Like technical decisions such as the use of drug A or drug B, policy decisions on complex issues require assessments of how much error there is in the available data. But policy decisions, in addition, must be capable of integrating many disparate types of information. They must assess the importance of errors in each of the elements contributing to the decision. They must use either some mental or some more formalistic model to integrate the disparate types of information. They must weigh both how each type of data should influence the decision and how competing models should influence the weighting.

The primacy of balancing diverse information and possibilities is especially the case for infection control decisions. This is true whether the decision involves national policy, allocation of resources in local health departments, or isolation and hygiene decisions in the hospital. For either mental models or more formalistic models used in making infection control decisions, transmission system theory must inevitably underlie the decision process. Bad decisions will be made when bad theory underlies the decisions. When practitioners do not think that transmission system analysis is relevant to decisions that will alter the transmission system, the likelihood is that bad theory will be used.

When it comes to infection control decisions, an inadequate understanding of the transmission systems handicaps many epidemiologists. They may lack the insights to see how one factor will have strong leverage because it either acts near a threshold or at a critical juncture while another factor will not much affect dynamics because it acts far from threshold or upon a redundant transmission path. Consequently, they may weigh the different types of evidence that should influence their decision in an inappropriate manner. We should discuss in this course a case where the EPA is weighing the costs and benefits of stopping Cryptosporidia transmission by either putting water filters in the homes of high-risk individuals or improving the water supply of entire communities. Hopefully the understanding you will get about transmission system analysis in this course would keep you from making the very serious mistake which the EPA seems about to make. They are opting for the home filters because they are not considering transmission system dynamics in assessing the benefits of improving community water treatment systems.

Not only is there no tradition of teaching transmission system analysis in epidemiology, but during their training epidemiologists are imbued with an approach that disregards transmission system behavior and acts as if what happens at a population level is just the sum of what happens to individuals. Consequently, cost-benefit considerations are often based on comparing the costs and benefits to the individuals directly affected by an intervention such as a treatment or a vaccination program. Even a major text on "Prevention Effectiveness" (9) advocates evaluating the costs vs. benefits of vaccination programs in a manner that disregards crucially important transmission system effects. They advocate calculating benefits as the vaccine effect on susceptibility times the number of individuals vaccinated times the attack rate before vaccination. The chance that vaccination will slow agent circulation and subsequently decrease exposure to the infectious agent is disregarded. Likewise, cost benefit evaluations of sanitation programs are prone to ignore transmission dynamics and lead to bad decisions, as in the EPA and Cryptosporidia case that we will discuss.
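
The gap between the two ways of counting benefits can be illustrated with a minimal sketch in Python. It is not the calculation from reference 9 or a model from this course. It assumes a closed, homogeneously mixing population, the standard SIR final size relation, and a vaccine that gives complete protection to a fraction of those vaccinated. The values of R0, coverage, efficacy, and population size are hypothetical.

```python
import math


def final_attack_rate(r0, susceptible_fraction=1.0, tol=1e-12):
    """Fraction of the whole population infected over a simple SIR epidemic.

    Solves z = s0 * (1 - exp(-r0 * z)) by fixed-point iteration, starting
    from the upper bound z = s0 so the iteration settles on the largest root.
    """
    z = susceptible_fraction
    for _ in range(100000):
        z_next = susceptible_fraction * (1.0 - math.exp(-r0 * z))
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z


# Hypothetical illustration values (assumptions, not from the text).
population = 100000
r0 = 2.0
coverage = 0.30   # fraction of the population vaccinated
efficacy = 0.90   # fraction of vaccinees fully protected (assumed vaccine action)

attack_before = final_attack_rate(r0)

# Individual-level calculation described in the text:
# vaccine effect x number vaccinated x attack rate before vaccination.
direct_averted = efficacy * coverage * population * attack_before

# Transmission system calculation: vaccination removes susceptibles,
# which also lowers the exposure of everyone who was not protected.
attack_after = final_attack_rate(r0, 1.0 - efficacy * coverage)
total_averted = (attack_before - attack_after) * population

print(f"averted, individual-level count: {direct_averted:.0f}")
print(f"averted, transmission model:     {total_averted:.0f}")
```

Under these assumptions the transmission model credits the program with more averted infections than the individual-level count, because unprotected people are also exposed less. How large the gap is depends on how close the system sits to its threshold.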

Despite the fact that epidemiological training is usually not designed to enhance the integration of multiple factors into a decision process consistent with the reality of transmission system dynamics, epidemiologists naturally employ many transmission system concepts when they make infection control decisions. They often consider indirect effects on persons who will not directly experience the intervention they are considering. They often consider interventions in some populations to be particularly important because stopping transmission in these populations will indirectly protect other populations. Sometimes they even consider how an intervention will eventually affect the age distribution of infection in the population. But all of these considerations can be tricky and fraught with hazard. A better understanding of the principles of transmission system dynamics can enhance the chance of a good outcome.

 

Predict Effects of Control Programs

Whenever choices are made between infection control programs or activities, some prediction of effects is likely to affect the decision. That prediction rarely includes the formalized use of models. It will be a better prediction, however, if the intuitions of the decision-maker have been informed by an understanding of the basic behaviors of transmission systems. Formal construction and analysis of a transmission system model can further enhance the quality of predictions used in decisions regarding control programs.

 

Prediction requires analysis of many models with many parameter values

The models used for prediction should be complicated by all details that have big effects. Details that do not have large effects should be left out. But how can we be sure the models have not thrown out the baby with bath water? Basically, more than one model should be constructed so that the influence of model form on the decision can be assessed. For each model, a broad range of possible parameter values should be specified. Predictions should then be made comparing transmission with and without the alternative interventions under consideration with different models and different parameter values. All of these various predictions should not be combined quantitatively to give one single prediction of effect. The results of the different model analyses should be used to generate insights about how various factors affect the transmission system. This is the best way to maximally inform the decision.
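
The procedure in this paragraph can be sketched as a small sweep over model forms and parameter values. The two model forms below are illustrative assumptions, not models from this course: both use the SIR final size relation, but one treats the vaccine as fully protecting a fraction of vaccinees and the other treats it as partially reducing the susceptibility of every vaccinee. The parameter grids and the 50% coverage are hypothetical. The results are listed side by side rather than averaged into one number.

```python
import math


def attack_rate_all_or_nothing(r0, coverage, efficacy):
    """SIR final size when a fraction `efficacy` of vaccinees is fully immune."""
    s0 = 1.0 - coverage * efficacy
    z = 1.0
    for _ in range(5000):  # fixed-point iteration from the upper bound
        z = s0 * (1.0 - math.exp(-r0 * z))
    return z


def attack_rate_leaky(r0, coverage, efficacy):
    """SIR final size when every vaccinee has susceptibility cut by `efficacy`."""
    z = 1.0
    for _ in range(5000):
        unvaccinated = (1.0 - coverage) * (1.0 - math.exp(-r0 * z))
        vaccinated = coverage * (1.0 - math.exp(-(1.0 - efficacy) * r0 * z))
        z = unvaccinated + vaccinated
    return z


MODELS = {"all-or-nothing": attack_rate_all_or_nothing, "leaky": attack_rate_leaky}
R0_VALUES = (1.5, 2.0, 3.0, 5.0)      # hypothetical range
EFFICACY_VALUES = (0.5, 0.7, 0.9)     # hypothetical range
COVERAGE = 0.5                        # hypothetical program coverage


def sweep():
    """Attack rate reduction for every model form and parameter combination."""
    rows = []
    for name, model in MODELS.items():
        for r0 in R0_VALUES:
            without_program = model(r0, 0.0, 0.0)
            for efficacy in EFFICACY_VALUES:
                with_program = model(r0, COVERAGE, efficacy)
                rows.append((name, r0, efficacy, without_program - with_program))
    return rows


for name, r0, efficacy, reduction in sweep():
    print(f"{name:15s} R0={r0:<4} efficacy={efficacy:<4} reduction={reduction:.3f}")
```

Reading down such a table shows where the two model forms agree and where the predicted benefit depends on which form one believes, which is the kind of insight the paragraph asks for.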

If multiple models need to be used to make all predictions and if multiple parameter sets in each model must be examined, you can see that models just can't be used to make every decision. But some policy decisions involve tens of millions of dollars of expenditure and slight differences can result in many deaths and disease. In these cases, models should be formally used.

Most often when a whole range of models is analyzed, some surprising behavior of the model is noted that makes one rethink the basis for the decision. Oftentimes when surprising behavior of the model is noted, one pursues an analysis of the transmission system model that eventually provides an explanation for this behavior. That explanation represents a new understanding of the transmission system. It might change the framework in which the original decision was cast.

It is best to avoid making decisions on the basis of predictions from the single best computer model that one can construct of a transmission system. It might seem that the best model is the best basis for prediction and therefore the best basis for decisions regarding infection control programs. But the best basis for decision comes from the most complete understanding. When one examines the behavior of a transmission system under a variety of conditions, one advances one's understanding. But one's understanding of very complex systems that are greatly simplified by our current level of models is rarely sufficient to justify placing great confidence in the prediction of any particular model.

 

Precise prediction is nearly impossible

Precise prediction will be problematic for transmission systems in the next few decades for three reasons. First, transmission system behavior always depends upon a number of things that are very poorly described or upon which very little data exists. This especially applies to contact patterns. Very often there are unmeasured differences in the way populations cluster and the extent of contact between clusters. These differences can sometimes make big differences in the behavior of transmission systems. Likewise, unknown differences in the pattern of dispersion of contacts in a population can make big differences in the ability of a population to sustain circulation of an infectious agent or to rapidly disseminate an epidemic. When system behavior is sensitive to unknown parameters, even the best model does not inspire a lot of confidence.

Second, the non-linear nature of transmission systems creates considerable sensitivity to initial conditions in the system. By initial conditions, we do not mean the model form or the parameters of a particular model form as discussed in the last paragraph. Initial conditions specify how individuals with various degrees of infectiousness, susceptibility, and immunity are distributed in the population at the beginning of a transmission system model analysis. The sensitivity to initial conditions relates to the concept of chaos. Models of measles with seasonal variation in contact rates or transmission probabilities can have long runs of biennial epidemics before any irregularity in this biennial pattern appears. If one changes the starting conditions by adding just 1% to the infectious population and subtracting just 1% from the susceptible population, one can get fairly long runs of relatively small annual epidemics before a biennial epidemic pattern appears. Such sensitivity is always a threat because we know it is explained by the fundamental nature of the system. Thus, one can only be sure that an analysis one has performed is not subject to such sensitivity by examining a variety of models under a variety of initial conditions.
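
One way to explore this kind of sensitivity is sketched below in Python. It is not the measles model referred to above. It assumes an SIR model with births and deaths and a cosine-shaped seasonal transmission rate, and every parameter value (R0 of 17, a 13-day infectious period, a 50-year lifespan, a forcing amplitude of 0.15, the starting fractions) is a hypothetical choice. The 1% changes are applied as relative changes to the starting fractions. Whether the two runs diverge, and how quickly, depends on those choices, so no particular outcome is claimed here.

```python
import math


def yearly_peaks(s0, i0, years=40, r0=17.0, infectious_days=13.0,
                 life_years=50.0, amplitude=0.15, steps_per_year=730):
    """Largest infectious fraction reached in each year of a forced SIR model."""
    mu = 1.0 / life_years             # birth and death rate per year
    gamma = 365.0 / infectious_days   # recovery rate per year
    beta0 = r0 * (gamma + mu)         # mean transmission rate per year
    dt = 1.0 / steps_per_year

    def rates(t, s, i):
        beta = beta0 * (1.0 + amplitude * math.cos(2.0 * math.pi * t))
        infection = beta * s * i
        return mu - infection - mu * s, infection - (gamma + mu) * i

    s, i, peaks = s0, i0, []
    for year in range(years):
        peak = i
        for step in range(steps_per_year):
            t = year + step * dt
            # One classical fourth-order Runge-Kutta step.
            a_s, a_i = rates(t, s, i)
            b_s, b_i = rates(t + dt / 2, s + dt / 2 * a_s, i + dt / 2 * a_i)
            c_s, c_i = rates(t + dt / 2, s + dt / 2 * b_s, i + dt / 2 * b_i)
            d_s, d_i = rates(t + dt, s + dt * c_s, i + dt * c_i)
            s += dt / 6 * (a_s + 2 * b_s + 2 * c_s + d_s)
            i += dt / 6 * (a_i + 2 * b_i + 2 * c_i + d_i)
            peak = max(peak, i)
        peaks.append(peak)
    return peaks


# Same model and parameters, two slightly different starting conditions.
baseline = yearly_peaks(s0=0.06, i0=0.001)
perturbed = yearly_peaks(s0=0.06 * 0.99, i0=0.001 * 1.01)

for year, (p, q) in enumerate(zip(baseline, perturbed)):
    print(f"year {year:2d}  baseline peak {p:.5f}  perturbed peak {q:.5f}")
```

Comparing the two columns year by year, and then repeating the comparison with other amplitudes and starting fractions, is a small version of examining a variety of models under a variety of initial conditions.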

Third, the performance of real transmission systems is always influenced by stochastic or chance determined events. In this text, we only examine deterministic models. Chance plays no role in these models. Every time these models are analyzed, the results are identical. Stochastic models must be used to assess the role of chance events that might change the course of an epidemic. Stochastic models by their very nature have to be analyzed by examining the full range of conditions that might come out by chance. Thus, they naturally enforce our admonition to examine system behavior under a variety of conditions. Prediction in stochastic models, moreover, is not of some precise outcome. It is of different probabilities of different outcomes. The whole topic of stochastic transmission models is beyond the scope of this course.
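
Although stochastic models are outside this course, a minimal sketch may make the contrast with deterministic models concrete. The code below uses a Reed-Frost chain binomial model, which is a choice made for this illustration and not a model from the text. The population of 100, the single initial case, and the per-contact transmission probability of 0.02 are hypothetical. Repeating the simulation gives a distribution of outbreak sizes rather than a single answer.

```python
import random


def reed_frost_final_size(population, initial_infected, p_transmit, rng):
    """Total number ever infected in one Reed-Frost chain binomial epidemic."""
    susceptible = population - initial_infected
    infected = initial_infected
    ever_infected = initial_infected
    while infected > 0 and susceptible > 0:
        # Chance that a susceptible is infected by at least one current case.
        p_infection = 1.0 - (1.0 - p_transmit) ** infected
        new_cases = sum(1 for _ in range(susceptible) if rng.random() < p_infection)
        susceptible -= new_cases
        ever_infected += new_cases
        infected = new_cases  # each generation of cases is infectious once
    return ever_infected


def simulate_many(runs=500, seed=1):
    """Repeat the epidemic with the same parameters; only chance differs."""
    rng = random.Random(seed)
    return [reed_frost_final_size(100, 1, 0.02, rng) for _ in range(runs)]


sizes = simulate_many()
small = sum(1 for size in sizes if size < 10)
print(f"runs ending with fewer than 10 cases: {small} of {len(sizes)}")
print(f"largest outbreak: {max(sizes)} of 100")
```

The output of such a model is a set of probabilities for different outcomes, for example how often the infection dies out early versus how often it spreads widely.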

After making all of these caveats about not putting too much weight on the predictions made by transmission system models, we should reiterate that using models is better than not using models. Predictions made after examining the behavior of transmission models at a variety of settings will be more precise than predictions made on the basis of intuitions that are not informed by model analysis.

 

Three Procedures for Predicting HIV infection

Transmission models are not the only way to make predictions regarding future frequency of infectious disease. Other prediction methods might, in fact, be more accurate than dynamic transmission models for short-term predictions. We consider here the case of HIV infection.

Three different methods have been used for making predictions of HIV infection rates. These are:

  1. projections from transmission models
  2. curve fitting methods
  3. back-calculation methods

The first is the method we have been discussing. It involves modeling the transmission system, setting parameter values for the transmission system, setting initial distributions of individuals by infection-immunity status, and then projecting the behavior of the model from those initial conditions. Various types of data may be used in setting parameter values. For example, surveys of new sexual partnership formation rates and the frequency of concurrent partnerships may be used to define model parameters. Another way that data might be used is to adjust parameter values until the behavior of the model reproduces the documented patterns of infection that occurred in the past.

Curve fitting models do not model dynamic system behavior. They are simply polynomial functions with enough parameter flexibility to fit a very broad range of curves. In the 80's, some very prominent epidemiologists used functions that had served to fit a variety of past epidemics in this way. When they fitted these functions to the reported patterns of AIDS cases, they concluded that new AIDS cases would soon head into a rapid decline. No transmission models predicted this. The superiority of the transmission models was clear. One reason the curve fitting models did not work as well for HIV as for some other epidemics is that the transmission dynamics of HIV are much more complex. The natural history of HIV infection is one where contagiousness varies markedly with time. It has both early and late contagiousness peaks. Moreover the very dispersed and heterogeneous contact patterns that spread infection give HIV a distinctive ability to continue moving from one group to another within any defined geographic region.

The contact patterns which cause HIV to move at different speeds through different segments of the population were as much a problem for the transmission system models used in the 80's and 90's as they were for the curve fitting projections. There was just very little known about the population patterns of contact that could be used to accurately predict future infection patterns. Different transmission models that included different contact patterns projected very different future courses for HIV infection. But the reason that none of them had the dramatic drop seen with the curve fitting method was that the transmission models incorporated a reasonable formulation of the natural history of infection while the curve fitting formulas did not.

The method that is most accurate for a short-term projection of new AIDS cases uses information on the natural history of infection in a more complete fashion than the early transmission dynamic models did. This method is called back calculation. It works to project future patterns of AIDS cases rather than future HIV infections. Essentially, it reconstructs the curve of past new HIV infections from the reported AIDS cases, using a curve that describes the distribution of times it takes HIV infections to turn into AIDS. Then it uses that curve of HIV infections together with the distribution of times to AIDS to project future onsets of AIDS. For projecting AIDS cases two years into the future, this procedure requires no projection of future HIV infection patterns. In fact, it only uses projections of HIV infections up to two years before the present. Thereafter it does require projections. By combining this method with the projections from transmission models, one gets the most accurate predictions. The reason for the improved accuracy of the combined methods is that such combined models maximally use all of the data available. The transmission models tend not to use information on the distribution of time to AIDS as fully as the back calculation models but they use information on basic parameters more fully.
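
The logic of back calculation can be sketched in a few lines of Python. Expected AIDS onsets in a year are the sum, over earlier years, of infections in that year times the probability that the time to AIDS equals the elapsed time. The infection counts and the distribution of times to AIDS below are hypothetical numbers invented for the illustration. Real applications estimate the infection curve statistically from noisy counts; the exact year-by-year inversion used here is a simplification chosen for brevity and would be unstable with real data.

```python
from fractions import Fraction


def expected_aids_cases(infections, incubation_pmf, year):
    """Expected AIDS onsets in `year`: sum over s of I(s) * f(year - s)."""
    total = 0
    for s, n_infected in enumerate(infections):
        delay = year - s
        if 0 <= delay < len(incubation_pmf):
            total += n_infected * incubation_pmf[delay]
    return total


def back_calculate(aids_cases, incubation_pmf):
    """Recover yearly infections from noise-free AIDS counts, one year at a time."""
    # Shortest time to AIDS that has non-zero probability.
    lag = next(d for d, p in enumerate(incubation_pmf) if p > 0)
    infections = []
    for year in range(lag, len(aids_cases)):
        # Cases already explained by infections reconstructed so far.
        explained = expected_aids_cases(infections, incubation_pmf, year)
        infections.append((aids_cases[year] - explained) / incubation_pmf[lag])
    return infections


# Hypothetical probability that AIDS begins d years after infection, d = 0..9.
# Exact fractions keep the round trip free of rounding error.
incubation_pmf = [Fraction(n, 100) for n in (0, 0, 5, 10, 15, 20, 20, 15, 10, 5)]

# Hypothetical new HIV infections in years 0..7.
true_infections = [100, 200, 400, 600, 700, 650, 500, 400]

# AIDS counts that would be observed in years 0..9 if there were no noise.
aids_cases = [expected_aids_cases(true_infections, incubation_pmf, t)
              for t in range(len(true_infections) + 2)]

recovered = back_calculate(aids_cases, incubation_pmf)
print("reconstructed infections:", [int(x) for x in recovered])

# Projection for the next year from reconstructed infections only. Infections
# in the most recent years leave no trace in the counts yet, so they are
# missing from this sum and would have to come from another source.
next_year = len(aids_cases)
projected = expected_aids_cases(recovered, incubation_pmf, next_year)
print("projected AIDS onsets from reconstructed infections:", float(projected))
```

The missing recent infections are the place where a projection from a transmission model could be combined with the back-calculated curve.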

Scientific Theory Development

Models Provide a Framework for Developing Scientific Theory

I am a modeler because I want to advance theory that underlies the practice of epidemiology. Models are theories and theories are models. The degree of mathematical explicitness and/or computer implementation of theories and models may vary. Not all theories are computer models or mathematical models. But they are models nonetheless. We advance scientific knowledge by exploring what is right and what is wrong about theories and about the models that make those theories more explicit.

Statements like exposure E causes disease D are theories. But theories like this are not elaborated enough to advance science in the ways that theories should. Simple theories of the "E causes D" sort currently dominate epidemiological methods. They do not serve to point us to where we should seek new knowledge as more complete theories do. They do not make explicit what we do and do not know about disease causation. They merely express a belief that some factor plays some role somewhere in the system. Both infectious and non-infectious diseases need theories about the causal systems generating disease. For infectious diseases, all causal models are by their very nature embedded in transmission system models.

We commented earlier that formalistic decision processes such as significance tests are designed to address narrow technical decisions that do not require a complex balancing process. We discussed why quantitative decision processes are not well suited to addressing policy decisions. Science decisions don't fit well into formalistic quantitative methods either. The reason is that science decisions deal largely with how to develop new theory or how to weigh data as support for one or the other of alternative theories. Decisions to pursue or accept one theory or another are different from decisions to go with one technical choice or another. Theory leads to action. But the choice of the best theory uses a different process than the choice of the best action. The best theory leads to the most understanding and the best definition of where to search for unknown truths. The best theory is not always the one whose mathematical formulation leads to the best fit to a particular set of observational data.

Choosing the theory one is going to work with to advance science has some similarities to making a policy decision about a complex system on which one has only rudimentary information. It requires one to take an integrative approach that balances different values, different objectives, and consistency with diverse sets of data and pre-existing theory. Likelihood inference processes can be used to determine the theory with the highest likelihood of generating observed data. But there is no way to determine which line of theory can be most productively pursued to advance science. A more integrative approach to theory choice uses one's understanding of causal systems to nuance one's understanding of how data should affect theories and decisions regarding how to treat hypotheses.

Analyzing transmission models to gain insights into what determines infection patterns by person, time, and place serves not only to improve infection control decisions. It also serves to suggest scientific theories and hypotheses to be explored with empirical data. When various alternative theories are suggested, transmission models then serve to help define the conditions where the transmission system models make different predictions and thereby where observations could help determine which theory best represents reality. In other words, transmission system models are the basis for a science of transmission system analysis.

Models Promote Understanding and Provide Explanations

In expounding upon the use of transmission models for policy decisions and for predicting infection patterns, we have already presented our view that the major use of transmission system models is to improve the understanding regarding the mechanics and dynamics of transmission systems. This will be the major way that you will learn to use transmission models in this course. You will have to predict system behavior for artificial transmission systems. Then when you find your predictions to be inaccurate or incomplete, you will have to struggle to understand where you went wrong and what aspects of transmission system behavior you failed to understand.

Starting simple and proceeding to the more complex only when the simple is well understood is the best way to advance understanding. The simple is always a bit unbelievable. In this course, we will stick with the extremely simple. Your job is to understand what is happening in the simple models we examine. Understanding what is happening in these simple models does not often give you an adequate understanding of what happens in the real world. But such understanding is a first step toward a more adequate understanding.

 

Models Help Us Identify Currently Ignored Causes of Infection

Transmission system models lead epidemiologists to look for the causes of infection patterns in the conformation of the system instead of just in things to which individuals are "exposed". Environmental contamination, exposure behaviors, and genetic factors are causes in both transmission system analysis and in standard epidemiological analyses. These are things on which individuals can be classified as exposed or unexposed. Transmission system analysis, however, helps us appreciate that sometimes the most important causes are not characteristics of the elements that make up the system, but how those elements are put together.

Thinking that all causes act directly on individuals is a strong tradition in epidemiology. Even in analytical models that include separate population levels and individual levels, the assumption is made that all causes act upon individuals (10). Working with transmission system models can help epidemiologists to understand causal effects that are manifest through the system rather than directly upon individuals. They can help epidemiologists see causes that arise from the arrangement of contacts between individuals that cannot possibly be ascertained by separate detailed histories of the contacts of each individual.

There are numerous causes of infection levels in populations intrinsic to the arrangement of the elements in a transmission system. Formulations like the differential equation set 2 above intrinsically make the pattern of who is in different stages of infection an important determinant of the population level of infection. The way that immunity can be added to such equations provides another important population level determinant of population infection levels that cannot be assessed merely by treating each individual as unconnected to other individuals. But until one gains experience with transmission models, the way that these formulations generate supra-individual causes of infection levels may not be too obvious. Therefore, we focus on one of the major population level determinants of infection that does not require as much sophistication to appreciate. This is the arrangement of contacts between individuals. Figure 1, for example, presents two alternative such arrangements.

Much of the work done on transmission models at Michigan has focused on demonstrating the great importance of contact patterns in determining levels of infection (5,6,11-32). The conclusions regarding this importance are further substantiated by the work of many different investigators (33-39). For HIV and STDs many different aspects of the pattern of contacts between individuals are important. One important aspect is how populations of individuals with low rates of new partnership formation get connected to populations of individuals with high rates of new partnership formation. Another is the concurrency of partnerships and how individuals with concurrent partnerships get connected to each other. But the easiest issue to understand is just that illustrated in figure 1. It is how individuals with the same level of partnership formation get connected to each other.

In models of transmission systems that, unlike those considered in this course, have discrete individuals, a random process can generate the connections between identical individuals. Chance can then lead to the patterns seen in figure 1 as two extreme examples. In these examples, the largest components are of size 3 and 12, and the numbers of separate components are 4 and 1. But most of the time chance will lead to some intermediate number and size of components.
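
The way chance shapes component structure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 12, the two fixed arrangements (four triangles and one ring, each giving every individual exactly two partners), and the helper names random_partnerships and component_sizes are hypothetical. They are not taken from figure 1 or from any model used in this course.

```python
import random
from collections import Counter

def random_partnerships(n_people, n_partnerships, rng):
    """Draw distinct unordered pairs uniformly at random (hypothetical helper)."""
    pairs = set()
    while len(pairs) < n_partnerships:
        a, b = rng.sample(range(n_people), 2)
        pairs.add((min(a, b), max(a, b)))
    return sorted(pairs)

def component_sizes(n_people, pairs):
    """Return the sizes of the connected components, largest first (union-find)."""
    parent = list(range(n_people))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
    counts = Counter(find(i) for i in range(n_people))
    return sorted(counts.values(), reverse=True)

# Two assumed extreme arrangements in which everyone has exactly two partners.
triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5),
             (6, 7), (7, 8), (6, 8), (9, 10), (10, 11), (9, 11)]
ring = [(i, (i + 1) % 12) for i in range(12)]
print("triangles:", component_sizes(12, triangles))
print("ring:     ", component_sizes(12, ring))

# Chance usually produces something between the two extremes.
rng = random.Random(606)  # arbitrary seed, for repeatability only
for trial in range(5):
    sizes = component_sizes(12, random_partnerships(12, 12, rng))
    print(f"trial {trial}: largest component {sizes[0]}, components {len(sizes)}")
```

Unlike the two fixed arrangements, the random draws do not hold each individual's number of partners constant, so they illustrate only the variation in component structure, not the equal-partnership condition.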

Ghani, Swinton, and Garnett (34) modeled simple gonorrhea transmission systems consisting of discrete individuals in which they allowed the sexual behaviors of individuals to vary by chance. They also let chance determine the person with whom those sexual partnerships were formed. They composed many populations using these chance processes and then ran simulations. Each population was then used as a case in a traditional regression analysis. The independent variables in their comparisons across populations were the randomly determined sums of partnerships and of different types of sex acts in the population, along with the size of the largest component. They tried two different ways of measuring the largest component. One method used all partnerships at a point in time. The other used the partnerships accumulated across 90 days. Neither of these variables would theoretically reflect the contact patterns most capable of spreading infection, but they do capture some aspect of contact patterns related to this potential. Dependent variables used in their analyses included individual infection outcome, whether populations sustained transmission, and the level of transmission sustained.

The major conclusion of their analysis is that the existence and level of gonorrhea infection in a population is determined by the pattern of connection between individuals. This pattern is a much stronger influence than is the average rate of partnership formation or the variance in the rate of partnership formation in the population. Different aspects of the pattern may be involved in determining whether transmission is sustained at all and the prevalence of infection if transmission was sustained. Network measures that could be approximately ascertained without extensive contact tracing were predictive of the prevalence of infection in populations simulated. Additional measures that would require contact tracing were even more predictive.

The model analyzed by Ghani et al. did not have parameters determining realistic contact patterns where heterogeneous individuals mixed according to specified preferences. It had only a simple random partnering process. Thus, they could only let outcomes of a homogeneous contact process vary by chance. We are extending their analysis using models with more realistic contact processes where we can let those processes vary by chance. We hope that our analysis will better measure the effect of contact patterns, specify what aspects of the contact patterns have the most influence, and explore what data that can be collected in the field can most feasibly reflect these effects.

The work of Morris et al. has already demonstrated that concurrency patterns are an important determinant of infection levels (35,36). The concurrency patterns they demonstrated to have importance are quite feasibly measured in the field. We hope our analyses will show other aspects of contact patterns that can be measured in the field and which are strong determinants of infection level. This then begins the process of focusing on causes that cannot be measured by classifying individuals as exposed or unexposed to a risk factor.

The work of George Kaplan and John Lynch on income disparity (40,41) provides another example of a population level factor that cannot be measured on individuals. These investigators examined mortality and income levels in 283 metropolitan areas in the United States. They found that income disparity was a strong predictor of mortality. The ratio of the income level at the 90th percentile to the income level at the 10th percentile was a strong predictor of population level mortality even when controlled for the number of poor people in the population or for average income levels. Such a finding fits in well with what is known about transmission dynamics of infectious diseases. Heterogeneity of transmission risk can contribute both to sustaining the circulation of an infectious agent in a population over long periods of time and to increasing the exposure to an infectious agent in individuals at low risk.
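
To make concrete why a 90th-to-10th percentile ratio is a property of a population rather than of any individual, here is a minimal Python sketch. The two income lists are invented for illustration and are not data from the cited studies. The function name ratio_90_10 is hypothetical, and the percentile interpolation rule is simply the default of Python's statistics.quantiles, which may differ from the rule the investigators used.

```python
import statistics

def ratio_90_10(incomes):
    """Ratio of the 90th percentile income to the 10th percentile income."""
    deciles = statistics.quantiles(incomes, n=10)  # nine cut points: 10th..90th
    return deciles[8] / deciles[0]

# Hypothetical incomes in thousands of dollars, invented for illustration only.
narrow = list(range(30, 71))  # 30..70, mean 50
wide = list(range(5, 96))     # 5..95, mean 50

for name, incomes in (("narrow", narrow), ("wide", wide)):
    print(name, "mean:", statistics.mean(incomes),
          "90/10 ratio:", round(ratio_90_10(incomes), 2))
```

Both hypothetical populations have the same average income, yet their ratios differ. No single person's income carries that information.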

Non-linear models of population processes, by their very nature, have causes of population outcomes that cannot be measured just by observing individuals and ignoring their linkages with other individuals. As we saw above, the nature of non-linear population models is that they link individuals. If we assume that individuals are linked randomly, then observations on individuals are sufficient to measure the population pattern of linkage. But no contacts involved in the transmission of any infection are made randomly. The non-random pattern of contacts makes a big difference for transmission (4,5,10-38).

Infection transmission is not the only event in human interaction that can generate non-linear population models. Other intrinsically non-linear population models would include those where one person generates stress in others, where one person threatens others with violence and stimulates them to buy guns, where one person teaches another or gives them information, and where one person offers support to another. Non-linear dynamic models encapsulating these processes are likely to give informative intuitions and provide surprising insights just like models of transmission have. In fact, models involving some of these processes have been developed by sociologists that could be but have not been applied in epidemiology (42). Models of transmission systems are the models involving social processes for which we currently have the most theoretical and data support. Thus, I encourage those who are interested in these other social questions to contribute to the science of transmission system analysis so that they can gain insights to address these other problems which might be of interest to them.

 

Models Help us Identify Inconsistencies in Our Knowledge or Beliefs

We often have observations on the pattern of exposure, the pattern of infection, the pattern of disease, and the extent of causal effect that we think exposures are having on infection and/or disease and sometimes rates of contact that spread infection. If this knowledge is not put into a transmission model, it is quite likely that investigators will accept estimates of exposure frequencies and exposure effects that are wholly inconsistent with available observations on infection and disease patterns. With non-linear systems, it just may not be very intuitive what different estimates are consistent or inconsistent with each other. Putting things together in a single model is often the only way to recognize certain inconsistencies.

Recent work on models of gonorrhea has pointed out that our estimates of transmission probabilities, rates of acquiring immunity, frequency of asymptomatic infection, and duration of infection that is treated or untreated are inconsistent with the pattern of infection observed in populations. To make standard gonorrhea models consistent with observed population prevalence of infection, the transmission probabilities and frequencies of asymptomatic infection must be much higher than our observations on these parameters. Even with transmission probabilities and asymptomatic infection rates set at unrealistic levels, transmission models cannot reproduce the observed frequency of infection as a function of partner change rates. The models project infection rates in the high contact groups that are higher than we observe and infection rates in the low contact groups that are lower than we observe.

Sometimes when such inconsistencies are noted, one may call into question the observations. This happened in the case of HPV in relationship to cervical cancer. The incidence of specific genotypes of HPV, their prevalence, and their duration of infection were measured in numerous studies, and the inconsistency of these measures went unnoticed until they were put together in a single dynamic model. The temporal pattern of changing force of infection that would be required to make the observations compatible with each other was quite improbable. In this case, inadequacies in detecting HPV at low levels were judged the best explanation for these inconsistencies.

In the case of gonorrhea, however, I suspect that the observations are not all that bad and that we need not discard our observations to explain the inconsistencies. Instead, I think that the standard gonorrhea model is wrong. The standard gonorrhea model has a single gonorrhea agent that stimulates no immunity at all. But recent observations using molecular measurements to distinguish different strains of gonorrhea indicate that each gonorrhea strain stimulates some immunity. There are so many different strains, however, that failing to distinguish strains makes it seem that no immunity is acquired at all. By changing the gonorrhea model to include multiple partially cross-reacting strains, it should be possible to make all current observations fit into the model.

 

Models Help Us Identify What Is Known and What Is Unknown About A Subject

The standard approach to epidemiological risk factors is to make a laundry list of exposures that can lead to disease, collect data on the association of those exposures with disease, and go through an inference process which selects those exposures one is willing to call causes. With this approach, there is no way to know when the list is complete. Even when one factor explains 100% of the variance in occurrence of a disease, other important risk factors might still be discovered. For example, early studies of monozygous and dizygous twins explained almost all of the variance in polio occurrence. Polio is very strongly determined by genetics. When everyone gets infected, only genetic variance explains population variance. But infection is still the most controllable cause of the disease. Thus, the standard approach to finding causes in epidemiology neither provides guideposts pointing out where causes should be pursued nor criteria to know when the goal has been reached.

Constructing a transmission system model is a better way to organize the search for causes. To construct a transmission model, one must put all the essential elements into the model. Transmission system models demand a contact process, a transmission process, a disease progression process, and an immunity process. They often demand that the population be classified into various states of susceptibility, contagiousness, and immunity. Birth, aging, and death processes often demand clear expression. Until one starts to construct contact processes between groups, one is unlikely to think at all about the nature or the determinants of such processes. But when one thinks about them, their key roles in transmission systems become evident.

The construction of such models makes the modeler quite aware of their ignorance. It drives one to the literature to glean what is known about the various parameters of the model. It often happens that the model demands parameters for areas where there has been no inquiry at all.

There is not a standard process of model construction to follow in order to define what is known or unknown about the transmission of an infection. If one accepts the simplest of models as the definition of what is needed, the gaps that the model can point out will be few. As the model is elaborated to better approximate reality and to more completely incorporate things that could be important, more gaps will be pointed out.

 

Data Collection

Models Help Us Specify What New Data Will Be Most Valuable to Gather

Everything that we know about transmission systems, we know with a degree of imprecision. The infection rates given exposure to different levels of agent via different routes are always known imprecisely. The durations of infection and how the level of excreted infectious agent varies over the course of infection are always known imprecisely. If agent survives in the environment, the determinants of its survival are known imprecisely. The contact patterns that expose individuals to infectious agent are known imprecisely. The effect of treatments on the duration of infection and the amount of agent excreted are known imprecisely. The effects of vaccines on transmission probabilities or on the course of infection once transmission occurs to a vaccinated individual are known imprecisely.

Where should we spend our money to obtain information that is more precise? Transmission system analysis has a valuable technique to answer this question. It is called sensitivity analysis. In such an analysis, model parameters reflecting these areas of uncertain knowledge are varied from the lower to the upper limits of what is consistent with available information. If the level of infection generated by a transmission system does not vary enough to affect our control decisions, then our decisions are insensitive to that aspect of the transmission system. In that case, there is no sense pursuing more precise knowledge of that aspect. On the other hand, if our decisions regarding what control measures will work require more precise parameter estimates than we have available, then we should pursue knowledge about those parameters. Because of the non-linearity issues discussed earlier, the parameters to which the system is most sensitive are often assessed inaccurately when intuition is the sole basis of the assessment. Formal sensitivity analysis is one of the most important tools of transmission system analysis. Even in your very first exercise, you will perform a very simple sensitivity analysis.
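
A one-at-a-time sensitivity analysis of this kind can be sketched as follows. The sketch rests on assumptions that are not part of these notes. It uses a textbook SIS model whose endemic prevalence is 1 - 1/R0 when R0 exceeds 1 and zero otherwise, with R0 taken as contact rate times transmission probability times duration of infection. The parameter ranges and the function names endemic_prevalence and one_at_a_time are hypothetical, chosen only for illustration. This is not the model or the exercise used in the course.

```python
def endemic_prevalence(contact_rate, transmission_prob, duration):
    """Endemic prevalence of an assumed simple SIS model: max(0, 1 - 1/R0)."""
    r0 = contact_rate * transmission_prob * duration
    if r0 <= 1.0:
        return 0.0  # below the endemic threshold, infection dies out
    return 1.0 - 1.0 / r0

def one_at_a_time(model, ranges):
    """Vary each parameter from its lower to its upper limit, holding the
    other parameters at the midpoints of their ranges."""
    baseline = {name: (lo + hi) / 2 for name, (lo, hi) in ranges.items()}
    results = {}
    for name, (lo, hi) in ranges.items():
        at_low = model(**{**baseline, name: lo})
        at_high = model(**{**baseline, name: hi})
        results[name] = (at_low, at_high)
    return results

# Hypothetical lower and upper limits, invented for illustration only.
ranges = {
    "contact_rate": (2.0, 6.0),       # new contacts per year
    "transmission_prob": (0.4, 0.6),  # per contact
    "duration": (0.9, 1.1),           # years infectious
}

for name, (at_low, at_high) in one_at_a_time(endemic_prevalence, ranges).items():
    print(f"{name:18s} prevalence {at_low:.3f} to {at_high:.3f} "
          f"(swing {at_high - at_low:.3f})")
```

With these invented ranges, the contact rate produces by far the largest swing in prevalence, so under these assumptions it is the parameter whose imprecision matters most. Varying one parameter at a time ignores interactions between parameters, which is a real limitation in non-linear systems.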

 

Models Can Help Specify The Data Needed To Choose The Better Theory

Science often advances through the pursuit of competing lines of model development that have different underlying principles and visions of where they are heading. Competition between lines of model development can occur at different levels. At the broadest level, it may involve competition between paradigms like risk factor vs. social process explanation of disease patterns. At high levels of paradigmatic difference, little can move the adherents of different lines to drop their approach and adopt a competing line.

More commonly, however, competition occurs within a research program where the competing models are designed using similar principles and goals. Adherents of the same program might postulate different mechanisms for the way something happens or the way a system functions. For example, there may be different theories regarding the nature of gonorrhea transmission, infection, and immunity. The idea of sexual transmission may be commonly accepted, and the idea that core groups are important for transmission may be commonly accepted. But there might be differences on the role of asymptomatic infections or of immunity in determining why the core group plays such an important role. Given similar principles and goals for model development, it is likely that both variants will explain commonly accepted observations. For example, competing epidemiological models for a particular infectious disease would generate similar patterns of disease in populations. That is because competing models would have to be consistent with some consensus within the research program regarding what has already been observed.

Consensus on what has been observed is never complete. Many times models might suggest that observations are wrong, as was discussed above. Thus it is not a hard and fast rule that there has to be complete overlap between the predictions made by competing models with regard to things that have already been observed. For example, not everyone may agree on the distribution of gonorrhea between populations with high and low rates of partner change because different individuals might suspect that different reporting biases could be affecting these observations. Thus, different models might generate different distributions of infection by new partnership formation rates. But if there is agreement on the goals and methods of the scientific pursuit, disagreement on evidence issues should be minimal.

To the participants in a dispute regarding competing theories, the distinction between high level paradigmatic differences and small differences within a paradigm might be lost. Passions can be intense in either case. But in the latter case, evidence can play more of a role in resolving differences and models can play a key role in determining where the evidence should be sought. Competing models can be analyzed to determine where they predict differences in observations that have not been made. Then research can be directed to precisely those conditions where the models differ in their predictions.

For example, a large difference in contagiousness between the early and middle stages of HIV infection has been postulated to explain the temporal pattern of the HIV epidemic. The epidemic rose quickly but then it slowed. It did not proceed to saturate all of the population practicing the behaviors that were found to put people at risk. A theory that was accepted too uncritically was that decreasing rates of high-risk behavior in the gay male population explained the change. But this explanation did not hold up to critical analysis. For any transmission system model to fit the observed pattern of the epidemic purely as a consequence of behavior change, the rate of behavior change would have to have been many times greater than was actually observed. A second theory that can explain this pattern is that HIV is much more contagious in the early stages of infection than in the later stages (28). Individuals in the stage of infection with low transmissibility then act like immune individuals in absorbing the transmissions from the highly infectious individuals in the early stage. Thus, each individual in the early stage of infection generates fewer new cases and the rate of epidemic rise falls. A third theory is that patterns of contact are such that the epidemic just never reached many groups with high-risk practices.

These different theories predict different rates of new infection in populations of young gay males who mix largely among themselves but have some infections in their group. The first theory predicts a rate that corresponds to the rate of transmission in older groups after their behavior changed. The second theory predicts a rate that is less than the original rate in the older group but higher than the rate after behavior change. The third theory predicts rates that correspond to the rate of transmission before behavior change. The summary just given considerably simplifies the differences for the sake of our discussion. Subsequent observations support the second theory. Discussion and controversy generated by competing theories has, however, led to considerable elaboration of the mechanisms postulated under the second theory. In the current state of that theory, both behavior change and contact patterns augment the effect of differences in contagiousness by stage of infection (30).

To summarize, the best way for models to compete is to make predictions about observations that have not yet been made. Analysis of transmission systems can specify the conditions under which competing models make different predictions about data that has yet to be collected. Data alone never resolves competitions between competing models. But new data that is consistent with one model and inconsistent with another certainly moves individuals with regard to their tendencies to continue pursuing one model or the other. As in the example of HIV transmission theories just discussed, the usual outcome is some blending of theories that have the capacity to make even more specific predictions than any of the originally competing theories.

As an addendum to the above discussion, we note that theories making more specific predictions, that is theories that open themselves more clearly to being disproved, are better and more powerful theories than those that make less detailed predictions. That is one reason why competition between theories is almost never resolved purely on the basis of data. It is rare that competing theories make equally precise predictions and therefore are equally subject to being disproved. For example, we will include few details in the simple transmission system models we initially study. We will postulate a mass action law for transmission. Competing laws for transmission might be that exposure doses are accumulated over time until a critical level is reached. Competing models with mass action or cumulative dose laws that do not specify patterns of infection by age or by variation in risk behavior will not make predictions that distinguish between these mechanisms. Only when the models get more serious and include more of reality will they subject themselves more seriously to disproof.

In the 70's, objections to the idea of generating a science of transmission system analysis were based upon the observation that the simple models in use at that time never subjected themselves to disproof by data (2). The potential to generate more detailed models that could subject themselves to disproof was pointed out at that time (3). But now at the end of the 90's we are just finally getting a tradition going where models that can be disproved are creating the basis for a science of transmission system analysis.

 

Study Design

Models Can Help Us Design More Efficient Studies

Transmission system models can do more than just point out situations where informative observations can be made. They can also help design studies that enhance the information obtained from observations in those situations. By building models of epidemiological studies on top of models of transmission systems, one creates the potential to analyze the joint models to determine how parameter estimates are affected by different sampling schemes. One also can use such joint models to determine how different degrees of data reliability or information bias affect parameter estimates and study conclusions. Likewise, one can determine how missing data from different types of individuals will affect parameter estimates or study conclusions. The ways that models can be used to improve study efficiency are quite varied.

Unfortunately, we will not learn about this in this course. That is because it is nearly impossible to build models of study designs on top of continuous deterministic compartmental models of transmission systems. These are the only types of models we study in this course. I believe that at some time in the future discrete individual models of stochastic processes in transmission systems will be used to introduce epidemiologists to transmission system analysis. But currently the software for this type of model does not make learning them as instructive as learning continuous models.

 

Models Can Help Us Assess The Validity of Epidemiological Methods

A major value of discrete individual models of stochastic transmission processes for epidemiological education is that such models allow us to see the consequences of the assumptions we make in our analyses. In such models, each event that causes a disease can be recorded and its contribution to the causal process can be specified for each individual where the cause acts. These exact causal actions can then be compared to the causal actions that we infer from the analysis of epidemiological data.

It is hard to learn about the performance of our epidemiological methods by seeing how they work in the real world. That is because we never really know how the real world works. But we can know exactly how the artificial world in a computer model works. We can build complex computer models whose function is not intuitive. Then we can gather and analyze data from the artificial world in the same way we do in the real world and see how the conclusions from our methods differ from the way the artificial world actually works. I believe that as we start doing this, we will see that we had better get to work on the task of devising study designs and analytic methods based on the reality of transmission systems rather than on the hopeful assumption that ignoring that reality does us little harm.

 

References

  1. Fox J, Elveback L, Hall C. Epidemiology
  2. Stille WT, Gersten JC. Tautology in epidemic models [editorial]. Journal of Infectious Diseases. 138(1):99-101, 1978
  3. Koopman JS. Models of transmission of infectious agents. J Infect Dis 1979; 139:616-7.
  4. Rothman K, Greenland S. Modern Epidemiology Lippincott-Raven Press 1998
  5. Koopman JS, Longini IM, Jacquez JA, Simon CP, Martin W, and Woodcock D. Assessing risk factors for transmission. Am J Epidemiol. 1991; 133(12).
  6. Koopman JS and Longini IM. Ecological effects of individual exposures and non-linear disease dynamics in populations. Amer J Pub Hlth 1994; 84(5):836-842.
  7. Holland JH. Hidden Order Helix Books (Addison Wesley) 1995
  8. Longini IM. Math Biosci 1972
  9. Haddix AC, Teutsch SM, Shaffer PA, Dunet DO. Prevention Effectiveness Oxford University Press, 1996
  10. Diez-Roux AV. Bringing Context Back into Epidemiology: Variables and Fallacies in Multilevel Analysis, Amer J Pub Health 1998;88:216-22
  11. Longini IM, Koopman JS. Household and community transmission parameters from final distributions of infections in households. Biometrics 1982; 38:1-12.
  12. Longini IM, Koopman JS, Monto AS, Fox JP. Estimating household and community transmission parameters for influenza. Am J Epidemiol 1982; 115:736-757.
  13. Longini IM, Koopman JS, Monto AS. Estimation procedure for transmission parameters from influenza epidemics. Use of serological data. Voprosy Virusologii. 1983; 2:176-181.
  14. Longini IM, Monto AS, Koopman JS. Statistical procedures for estimating the community probability of illness in family studies; Rhinovirus and influenza. Int J Epidemiol. 1983;148:284-91.
  15. Longini Jr. IM, Koopman JS, Haber M and Cotsonis GA. Statistical inference for infectious diseases: Risk-specific household and community transmission parameters. Am J Epidemiol. 1988;128(4):845-859.
  16. Jacquez JA, Koopman JS, Simon C, Sattenspiel L, Perry T. Modeling and the analysis of HIV transmission: The effect of contact patterns (1988). Math Biosci. 1988; 92:119-199.
  17. Koopman JS, Simon C, Jacquez JA, Joseph JG, and Sattenspiel L. Sexual partner selectiveness effects on homosexual HIV transmission dynamics. Journal of AIDS. 1988;1(5):486-504
  18. Jacquez JA, Simon CP, and Koopman JS. Structured mixing: Heterogeneous mixing by the definition of activity groups. In Mathematical and Statistical Approaches to AIDS Epidemiology. Castillo-Chavez C, ed. Springer-Verlag Lecture Notes in Biomathematics. 1989; 83:301-315.
  19. Koopman JS, Simon CP, Jacquez JA and Park TS. Selective contact within structured mixing application to HIV. In Mathematical and Statistical Approaches to AIDS Epidemiology. Castillo-Chavez C, ed. Springer-Verlag Lecture Notes in Biomathematics. 1989; 83:316-349.
  20. Longini IM, Haber M, and Koopman JS. Use of modeling in infectious disease epidemiology. Am J Epidemiol. 1989; 130:619-620.
  21. Sattenspiel L, Koopman JS, Simon C, and Jacquez JA. The Effects of population structure on the spread of the HIV infection. Am J Phys Anthropol, 1990; 82(4):421-430.
  22. Jacquez JA, Simon CP, and Koopman JS. The reproduction number in deterministic models of contagious diseases. Comments on Theoretical Biology, 1991; 2(3):159-209.
  23. Jacquez JA, Simon CP, and Koopman JS. Core groups and the Ro's for subgroups in heterogeneous SIS and SI models. In Epidemic Models: Their Structure and Relationship to Data. Mollison D. (ed.) Cambridge University Press, 1993. Presented at the NATO Workshop on Human Infections and Transmission Models. Cambridge, England, March 30, 1993.
  24. Koopman JS, Haber MJ, Longini IM, Simon CP, and Jacquez JA. Using transmission models to assess risk factors for transmission. In Epidemic Models: Their Structure and Relationship to Data. Mollison D. (ed.) Cambridge University Press, 1993. Presented at the NATO Workshop on Human Infections and Transmission Models. Cambridge, England, March 30, 1993.
  25. Simon CP, Jacquez JA, and Koopman JS. The Liapunov function approach to computing Ro. In Epidemic Models: Their Structure and Relationship to Data. Mollison D. (ed.) Cambridge University Press, 1993. Presented at the NATO Workshop on Human Infections and Transmission Models. Cambridge, England, April 1, 1993.
  26. Jacquez JA, Simon CP, and Koopman JS. Observations on CD4 count progressions in different stages of HIV infection. In Epidemic Models: Their