Models for Evaluating Network Statistics and Surveillance Procedures: Preliminary Results for Gonorrhea

Department of Epidemiology

By

Jim Koopman MD MPH, Dept. of Epidemiology

Steve Chick PhD, Dept. of Industrial and Operations Engineering

University of Michigan

Delivered in shortened form on April 28, 1999 in Santa Fe, New Mexico at the CDC conference on STD models

Summary

Discrete individual models of sexual pairing and infection transmission were constructed to help design surveillance systems for STDs and HIV that use information on partnerships of infected individuals. As part of this task, the models will be used to develop methods to analyze data on the geographic or social setting of partnership formation, courtship intervals, and other data that characterize the relationships between individuals. Such data is often used to assess individual risks but has little intrinsic value for that purpose. It is more useful for assessing population risks and for determining the roles that individuals play in amplifying or sustaining transmission. It is rarely put to these uses because methods like those we propose to develop are not available. Our development program pursues field methods and theoretical methods simultaneously. New field methods for collecting such data are being explored in Genesee County. The data will be used to detect otherwise hidden outbreaks and identify areas where surveillance is failing to detect infections. It will also be used to identify geographic areas and social settings where outreach efforts will be productive and to prioritize partners for contact tracing. When later expanded to the state level, it will serve to prioritize administrative districts for surveillance system and control resource allocation. Computer simulation will evaluate the utility of using different variables, analytic methods, surveillance system structures, and resource allocation strategies in response to surveillance system data.

Our model formulation allows for closed form solutions under various conditions. These solutions have provided insights into transmission systems where reinfection is frequent and have validated our simulation implementation. The model formulation also allows for detailed and realistic modeling of social and geographic patterns of contact and transmission. This allows for generation of data from systems with a realistic degree of complexity but where every aspect of that complexity is accessible to the investigator. By examining the behavior of statistics calculable from field data in such models, we will assess how best to use data to achieve our surveillance system objectives. We will also determine what conditions affect the utility of different statistics. The statistics to be evaluated include both simple summary statistics of surveillance data and network statistics. Because network statistics requiring contact tracing data are often impractical, we propose new network statistics that use the social setting of partnership formation and courtship intervals and that do not require contact tracing data.

Models generating data for a science of transmission system analysis

Transmission system analysis seeks to understand why infections spread in the geographic, social, and temporal patterns that they do and predict the effects of surveillance and control measures on that spread. Effective transmission system analysis requires an effective articulation between data and theory. Unfortunately, most infectious disease data gets analyzed using methods whose theoretical base excludes the possibility of infection transmission. In response to this inconsistency, we are developing models that generate the same sort of epidemiological data that is generated in the field. We will use these models to forge tighter links between epidemiological data and transmission system theory in the following ways:

It is this last purpose to which our current work is directed. CDC funds us to develop new ways to use data on interactions with partners in STD and HIV surveillance systems. Our development efforts proceed simultaneously in the field in Genesee County, Michigan, and in the computer. The surveillance systems we are developing will be more capable than existing systems of setting priorities for contact tracing and outreach activities to detect unreported infections. The data for these new surveillance systems include the geographic and social sites of partnership formation and the intervals between meeting and having sex with partners as well as a minimal set of characteristics of partners. Our joint field and computer work leads us to believe that we are getting close to developing surveillance systems that are considerably more effective in achieving infection control than current systems.

The models we are developing are formulated to meet our specific objectives while at the same time contributing theory and structure for other modeling objectives. Our models can assess the utility of using different sorts of epidemiological data in different ways because they have discrete individuals who keep their life histories with them. The major requirement for our model construction was that it generate epidemiological data of complexity comparable to that of the data we will collect in the field, through mechanisms that are causally plausible and that encompass the sources of diversity and interactions that could be affecting interpretation of the collected data.

 

Our approach to complexity

Tradeoffs between realistic complexity and interpretable simplicity are intrinsic to any modeling process. We have entertained various sorts of models that could generate the right balance between realism, complexity, and interpretability. We are attracted by agent-based models where the rules of interaction lie wholly within the agent. But this sort of model generates unpredictable contact patterns and is difficult to fit to observed contact patterns. We want realism in the mechanisms generating the data, but our investigative focus is not upon those mechanisms but upon the use and interpretation of data. To get the balance we need between realism, complexity, and tractability for fitting to observed data, we used parameters for the mixing process that do not correspond to conceptually distinct forces acting upon individuals in getting them to make advances to others or react to the advances of others. In other words, we did not parameterize seeking and rejection or acceptance processes as we have in some of our other models. We feel that independent measurement of such parameters is too problematic. We instead chose a parameter determining partnership rates that corresponds to actual observed contact rates in some situations and that may have a higher or lower value than the observed contact rate depending upon whether the potential partner pool has higher or lower partnership potential. We call this parameter "partnership potential".

Our models balance realistic complexity with mathematical analyzability by

Because the mixing parameters in our models define potential for partnerships within specific social and geographic contexts, their distribution between different settings can be specified by field observations. This contrasts with our prior development of preferred (1), structured (2), or selective (3) mixing formulations. Our older mixing formulations, like those of Blythe et al. (4), Busenberg et al. (5), Castillo-Chavez et al. (6,7), Garnett et al. (8), and Ghani et al. (9), have sought mathematical clarity and/or flexibility in the patterns generated. They have not, however, had parameters corresponding to observable phenomena.

We expect to have simulation implementation of our models within a year that will enable us to pursue our objective of defining effective new surveillance system methods. Implementation is far enough along at this point to illustrate how we can verify model behavior with theoretical analyses and to demonstrate how we can explore effects on endemic infection levels for gonorrhea. Before presenting the model and some analysis of it, however, we want to elaborate on why we focus our model development on the use of epidemiological data.

The nature of transmission system data and risk factor data

Standard epidemiological data analysis methods arrange individuals into rows, with both outcome and predictor variables for each individual arranged into columns. That standard analyses do not view individuals as part of a system is evident from the fact that the row an individual is in (the arrangement of individuals) makes no difference to the results. This two-dimensional data arrangement can be viewed as the face of a three-dimensional cube that represents the more complete data needed for the analysis of a transmission system. Such a cube is represented in Figure 1.

Figure 1. The three-dimensional shape of epidemiological data

Both social network analysis and phylogenetic analysis are performed in the plane perpendicular to the plane of standard epidemiological analyses. For these analytic methods, the data are arranged as a square matrix with individuals along both axes. The values in the matrix represent degrees of connection between individuals.

Standard epidemiological analyses assume that this dimension where social network analysis and phylogenetic analyses are found is irrelevant. They assume that the outcome in one individual is independent of the outcome in other individuals. This assumption is inconsistent with the nature of infection transmission. That is why analyses that just focus upon characteristics of the host, characteristics of the agent, and characteristics of the interaction between them are incapable of predicting the population pattern of infections. It is also a reason why standard epidemiological analyses of infectious disease risk factor effects can lead to erroneous conclusions. The validity of the analytic results often depends upon clearly wrong assumptions.

When data relate specifically to the plane relating individuals to each other, as is the case for settings where partnerships are formed and the courtship interval, standard data analysis in the context of assessing individual risks is likely to be particularly unproductive. Analyses may be performed using such data for individual risk assessments. But contact data has limited usefulness for this purpose. It is more useful for population level analyses of the networks through which infection flows.

We can think of infection as flowing in the plane where phylogenetic and network analyses are conducted. It is not practically feasible to trace out the networks through which infection flows using current methods that require gathering data from the contacts of interviewed individuals. Our research goals include the development of practical methods for tracing infection flow in this social network dimension. Data on site of sexual partnership formation and courtship intervals provide information about the patterning of links in contact networks. Our current project seeks better use of this data.

 

Contact patterns as determinants of population infection levels

Numerous compartmental model analyses of transmission systems over the past decade have emphasized the importance of contact patterns as determinants of population levels of infection (1-14). Consider how infection levels can change without changing the hosts, agents, and pathogenesis or immunity processes in the transmission system in any way. Consider two populations that have identical individuals. For each individual with a set of risk factors and a history of partnerships made at specific times in the first population, there is an individual in the second population with exactly the same risk factors and history of partnerships. Thus, any description that ignores the connections between individuals in the population will find these two populations to be identical. They can differ only in the way individuals are connected in partnerships. This type of change, however, can be very important. By changing contact patterns, we can change the level of infection from zero to complete. To see this, consider a population divided into individuals who do and do not have a risk factor that increases the ability of an infectious agent to proliferate in them. If the group with the risk factor makes all of their contacts with individuals without this risk factor, and if these latter individuals are incapable of sustaining chains of transmission on their own, then infection might disappear from the entire population. If individuals with the risk factor make all of their contacts with each other, infection might flow quickly between them but not at all to the group without the risk factor. If they make just enough contacts with each other to sustain circulation between them, they might make enough contacts with the group without the risk factor to still infect them at a high level. 
To predict the infection level in a population, therefore, it is not enough to know about the risk factors of every host, the molecular variations affecting every agent, and every aspect of pathogenesis and immunity. To predict the infection level in a population, the transmission system needs to be analyzed. To make accurate predictions, in fact, the analysis of contact pattern effects might be of considerably greater importance than the assessment of individual risk factor effects.
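
The thought experiment above can be made concrete with a small calculation. The sketch below is not the model described in this paper; it swaps in a standard two-group next-generation-matrix calculation and assumes two equally sized groups with equal contact rates. The function name `two_group_r0` and the illustrative transmission numbers are hypothetical. Individuals are identical across scenarios; only the share of contacts made within one's own group changes.

```python
import math


def two_group_r0(within_fraction, t_high, t_low):
    """Dominant eigenvalue of an illustrative 2x2 next-generation matrix.

    within_fraction: share of each group's contacts made inside its own group
                     (assumes equal group sizes and equal contact rates).
    t_high, t_low:   expected transmissions generated by one infected person
                     with / without the risk factor in a susceptible population.
    """
    m = within_fraction
    # Rows are recipient groups, columns are source groups.
    k_hh, k_hl = m * t_high, (1 - m) * t_low
    k_lh, k_ll = (1 - m) * t_high, m * t_low
    trace = k_hh + k_ll
    # Discriminant written as a sum of non-negative terms so sqrt is always safe.
    disc = (k_hh - k_ll) ** 2 + 4 * k_hl * k_lh
    return (trace + math.sqrt(disc)) / 2


if __name__ == "__main__":
    # Same individuals (t_high=3.0, t_low=0.2), three contact patterns.
    for m in (0.0, 0.5, 1.0):
        print(f"within-group fraction {m:.1f}: R0 = {two_group_r0(m, 3.0, 0.2):.2f}")
```

Under these assumed numbers, fully disassortative contact gives a reproduction number of about 0.77, so infection cannot sustain itself, while fully assortative contact gives 3.0 within the risk-factor group, even though no individual characteristic has changed.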

Discrete models have recently demonstrated the importance of contact patterns in determining the level of infection in a population in ways that are not possible for continuous compartmental models. For example, Ghani et al. (9) used network measures from discrete individual model simulations to demonstrate that these network measures are highly predictive of infection level independent of risk factor levels in the population. Kretzschmar and Morris (11-13) and Welch et al. (14) demonstrated that concurrency of partnerships can quite dramatically increase infection levels.

When we sought to translate our preferred (1), structured (2), and selective (3) mixing formulations into the discrete models we will present here, we discovered many different ways that these mixing patterns can be formulated. Furthermore, we observed that these different formulations alter population infection levels. For that reason, we sought a mixing process that on the margin can give patterns consistent with preferred, structured, or selective mixing but that uses more interpretable parameters and realistic processes.

Partner Based Gonorrhea and HIV Surveillance Systems

Objectives

The objectives of the surveillance system we are developing both in Genesee County and in the computer are to

The first objective is important because hidden epidemics still occur for both HIV and STDs. They are especially important in the case of HIV where early infection is often undetected and large chains of transmission can be built up before any individuals in the chain have symptoms. Using compartmental models, we have shown that outbreaks of transmission during early HIV infection can indirectly generate many more total infections in a population than can transmission late in infection (10). This may be the case even when the number of individuals to whom infection is directly transmitted is greater during the later stages of infection. We believe that outbreaks of transmission during early infection are common and that such outbreaks have varying duration and ever changing geographic and social settings. Our simulations are well designed to help define the conditions where this is or is not true.

The likelihood of such transient outbreaks makes the type of surveillance we are developing particularly needed. Currently outreach is targeted to the obvious places like gay bars. It is often ineffective in high school settings. We believe that our surveillance activities will reveal many unsuspected places to which outreach should be directed. We also believe that the documentation of transmission problems where partnerships are initiated within schools will enable the organization of more effective outreach to those settings.

Even when outbreaks are not detected, partner based surveillance can give an idea where undetected infections are most likely. This is an important way that partner based surveillance differs from standard surveillance. Standard surveillance only tells you where infections have been detected in the past. It cannot tell you where infection is going undetected. When partner data is available, analyses of expected numbers of infection in partners with specified characteristics could reveal when such expectations are out of line with the observed number of infected individuals. For example, if it is found that many men make partnerships with women in a certain locale but that no women with detected infections report partnerships in this locale, then it is likely that infections in such women are going undetected. That would be reason for organizing special outreach to such a locale. There is another way that partner data can help define where infection is going undetected. When identifying information on partners is available, analyses of the patterns of linkage between individuals could indicate where linkage frequencies are low due to many undetected infections. We have chosen to concentrate on the first approach in our fieldwork but our simulations will evaluate the relative effectiveness of both approaches.
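
The first approach can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout (sex of the reporting case, setting where the partnership was formed), the function name `flag_settings`, and both thresholds are hypothetical and are not part of the surveillance system described here. The sketch flags settings that infected men name often but that infected women rarely or never name.

```python
from collections import Counter


def flag_settings(reports, min_reports=5, max_ratio=0.2):
    """Return settings named by many infected men but few infected women.

    reports:     iterable of (sex_of_reporting_case, setting) pairs,
                 with sex coded "M" or "F" (assumed coding).
    min_reports: minimum number of reports by men before a setting is considered.
    max_ratio:   flag when reports by women <= max_ratio * reports by men.
    """
    reports = list(reports)  # allow generators; the data are scanned twice
    by_men = Counter(setting for sex, setting in reports if sex == "M")
    by_women = Counter(setting for sex, setting in reports if sex == "F")
    flagged = []
    for setting, n_men in by_men.items():
        n_women = by_women.get(setting, 0)
        if n_men >= min_reports and n_women <= max_ratio * n_men:
            flagged.append((setting, n_men, n_women))
    return sorted(flagged)


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    example = [("M", "bar A")] * 6 + [("F", "bar A")] * 5 + [("M", "park B")] * 7
    print(flag_settings(example))
```

A fuller analysis would replace the simple ratio with expected counts derived from partner characteristics, but the comparison of reports from the two sides of a partnership is the core of the idea.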

With regard to prioritizing contact tracing efforts, we expect our surveillance data to provide indications as to which partnerships are most likely to have been sources of infection or to initiate new chains of transmission. To provide these indications the contact tracing interviews must gather the same data that is used within the surveillance system. Rules for deciding what patterns of social settings and individual characteristics indicate that a particular partner deserves priority will be explored in the simulations that we are developing.

The final objective is one that will become more feasible as the system we are developing at the County level is expanded to the State level. In some areas, expanding the geographic and political level of surveillance system organization could be important simply because many partnerships are made in social settings outside of the jurisdiction in which care is sought. Expanding the jurisdiction to higher levels may also increase the value at the local level, however, by providing better comparative statistics for the decisions that the surveillance system seeks to direct.

 

Surveillance System Design

To achieve our surveillance system objectives, individuals with gonorrhea or HIV infection are interviewed about their partnerships and the collected data is then routinely analyzed. STD and HIV control programs almost everywhere in this country collect partner information for contact tracing. That information, however, is not routinely analyzed for surveillance purposes. Partner data for contact tracing involves sensitive information many clients are unwilling to reveal. Consequently, the collection of partner data is often incomplete. In Genesee County, we seek less threatening information that does not identify the partner. We have taken different tactics for gonorrhea and for HIV. These tactics are adapted to the administrative and political environments where control of these agents takes place.

The first difference in our gonorrhea and HIV systems involves the personnel collecting the data. For gonorrhea surveillance, the same personnel collect partner data for contact tracing and for surveillance purposes. For HIV, separate personnel are involved for gathering these two types of data. Although the same people are involved in collecting these two types of data for gonorrhea, different tactics are used in their collection. In the initial part of the epidemiological interview for gonorrhea, only the non-identifying partner information used in the surveillance system is gathered. Then the partnerships discussed in this general and non-threatening manner are revisited to address issues relevant to contact tracing. This collection of non-identifying and identifying information in the same interview requires a tolerant and flexible approach on the part of the interviewer to ensure that the threat of collecting identifying information does not inhibit the collection of non-identifying information. Such an approach seems nearly impossible for HIV. The interview is too charged with emotion and legal mandates get in the way of flexible approaches. For HIV, Michigan law mandates the collection of contact tracing information by health department personnel. Consequently, we have selected HIV case management teams to collect the surveillance data. They establish more trusting relationships with the client and can better assure those clients that the non-identifying data is truly non-threatening.

The questionnaire structure for collecting data is different for HIV and gonorrhea although the final data has similar content. For gonorrhea, the interview proceeds from partner to partner. For HIV, the interview proceeds from social setting to social setting where partners are met. Because HIV clients often have many partners that they meet in the same social setting, this makes it easier to determine number of partners with different characteristics or with whom different interactions occurred in each setting.

 

Using simulations to design and evaluate the surveillance systems

There are many questions to be resolved in designing partner based STD and HIV surveillance systems. We organize these into five sets of issues:

 

How much benefit is possible

The type of surveillance system we propose will be more useful where infection outbreaks are transient and pop up in a variety of places. When some transient outbreaks play key roles in the transmission system, then discovering and controlling those outbreaks will have indirect effects on individuals beyond the groups where transmission is occurring at the moment. On the other hand, in transmission systems where the sources of infection are well known and unchanging, the surveillance systems we propose will have less value. Comparing control efforts that are optimally targeted to control efforts using the same amount of resources that are not targeted provides a measure of the total potential benefit that targeting can have. We will examine different transmission systems to determine what characteristics lead to bigger differences between targeted and untargeted control efforts. That will help us determine both the locations and diseases that should be prioritized for the establishment of surveillance systems.

Prioritizing control efforts may have different degrees of benefit at different administrative levels. At the local level prioritization will occur mainly in terms of where outreach teams will be sent, what special efforts will be made to interview patients seen in different types of clinics, and what relative effort will be spent on locating and treating different types of named contacts. The surveillance system we propose, however, could have considerable value at the State level even if it does not have great utility at the local level. Analysis of surveillance data at the State level could reveal the administrative districts with the most chance of having many undetected infections. Such comparisons might also be useful at the national level to indicate the States where more resources for surveillance and control could be most beneficial.

 

What data is appropriate

In any surveillance system, the data to be gathered must be minimized so that it is administratively possible for individuals with many other routine activities to collect the data. The data collection must also be simplified so that it will be clear to a wide diversity of people collecting the data exactly what is needed. These needs must be traded off against the value of additional information. We have chosen to begin field operations with an absolute minimum of data. This includes setting of partnership formation in a relatively short recent period, age group, race, and gender. Besides believing that partner histories over longer intervals might provide additional value, we think that more information on the actual risk behaviors such as condom use, on courtship intervals, and on more characteristics of the partners could be useful. We plan to conduct simulations to evaluate this.

 

What data analysis methods are appropriate

We are planning two levels of data analysis methods. One level involves calculating expected numbers of infections in partners and comparing these expected levels to the frequency of infection detection in different groups. Standard methods are used for this objective. As more data becomes available across broader administrative districts, such data analysis should gain considerable value. We will conduct simulations to assess how much value is gained by using different methods that draw on broader sets of data.

A second level of analysis involves the use of surveillance data to provide insights into contact network structure and thus provide indications as to what partnerships might be most key in that structure. This involves a whole new approach to epidemiological data and analysis. We do not expect to have the methods for this approach fully developed for several years. We plan to build these new analytical methods around the estimation of ancestor counts. Later in this document we will discuss these counts.

 

How should control policies be linked to the analytic results

The issues here are whether certain activities should be triggered by certain calculations and if so, how should the criteria for the triggers be set. In general, we feel that the main use of analytical results should be to provide understanding that influences control decisions. But in some cases, it could be useful to establish specific criteria to trigger certain activities.

For example, a specified number of people infected with syphilis who report making a partnership at a setting might be used to trigger the establishment of an outreach screening effort among people who frequent that setting. Such a criterion might be a function of the infection level in a population. As the infection rate goes down, the yield from screening efforts goes down. Currently that causes some public health people not to undertake screening efforts when prevalence is low. They have a criterion for undertaking active screening activities that says if the yield is likely to be below a certain level, a screening program is not worth undertaking. Such criteria only consider the direct effects of infection detection. But if the rate of infection is very low, the chance that chains of transmission are cut off at the root by detecting infection increases. That means that although special screening programs will detect fewer new infections when infection levels are low, longer chains of infection can be cut off and action with small yields now might prevent significant numbers of infections in the future. Deciding on what yield from special screening programs constitutes a worthwhile endeavor is not a natural and intuitive process for many health officials. We will conduct computer experiments that will help lay out more clearly the criteria for optimal decisions regarding when special outreach screening programs are indicated.

 

How can the effectiveness of the surveillance system be evaluated

The most thorough analysis of surveillance system effectiveness in achieving infection control is the SENIC project. In this project, nosocomial surveillance strategies were related to levels of infection experienced in different hospitals. Usually, however, surveillance systems are only evaluated in terms of quite proximate process measures such as the fraction of infections that are detected. We believe that by using simulations for the design of surveillance system procedures, we will at the same time be establishing a basis for evaluating the surveillance system. Our model structure allows for detailed simulation construction that can correspond reasonably well to real world situations. That should facilitate construction and analysis of simulations to see how much disease control should be possible using the surveillance system procedures adopted and what sort of process measures should be consistent with the most effective levels of control. Then by comparing the observed process measures to the ideal ones, one can assess how far one has come along toward achieving surveillance system objectives. The process measures we could explore in this regard include the stages at which infection is detected, timing between detection of related infections, and the measures of imbalance in the settings of reported partnership formation.

GERMS MODEL FOR POPULATION AND INFECTION DYNAMICS

We named our modeling and simulation structure "GERMS"; the name is not an acronym. GERMS is a microsimulation of individuals with specific geographic and social locations that enter into population contact processes and maintain their history of simulated contacts and infections. This allows for the incorporation of surveillance systems and control programs such as contact tracing. In addition, the model is designed so that for special cases closed form solutions and/or continuous compartmental model calculations can determine a number of relevant quantities, such as the pseudo-equilibrium prevalence (15). This allows for verification of correct computer implementation of the model, for efficient exploration of parameter spaces, and for formulation of basic theory relevant to transmission system behavior.

GERMS explicitly models:

Formal Mathematical Specification

The model generates a network-valued stochastic process. Each node in the network represents an individual in the population. Individuals and the places they meet each other have geographic and social locations. But the arcs connecting individuals are not determined by geography. They arise through a mixing process that can be influenced by geography but that has a dynamic that can establish an arc between any two individuals in the population. Arcs representing sexual or other types of contact between individuals are added or removed through time as relationships between individuals are formed and dissolved.

Identity of Individuals

Heterogeneous populations are specified by assigning a unique set of parameter values to each individual. This can be done either by treating each individual separately or by defining social and "edit" groups with specified distributions of various individual characteristics and then randomly populating these groups with individuals drawn from those distributions. Some of the parameters assigned to individuals include:

Additional parameters include probability of experiencing symptoms if infected, the probability of seeking treatment if symptoms are experienced, the probability of reporting a given partner to medical authorities when reporting is requested; and others.

During a simulation run, information on individuals is maintained with them as variable values. These include

  1. the identity of each partner, if any,
  2. the identity and timing of the most recent partners, and
  3. the most recent infection and recovery times, if applicable.
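
As a minimal sketch of how such per-individual state might be held, assuming Python and hypothetical field names (none of these identifiers come from the GERMS implementation), each individual can carry its own partnership and infection history:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Individual:
    """State an individual carries through a run (hypothetical names)."""
    ident: int
    # Item 1: identities of current partners, if any.
    current_partners: Set[int] = field(default_factory=set)
    # Item 2: (partner id, start time) for the most recent partners.
    recent_partners: List[Tuple[int, float]] = field(default_factory=list)
    # Item 3: most recent infection and recovery times, if applicable.
    last_infection_time: Optional[float] = None
    last_recovery_time: Optional[float] = None
    # Assumed cap on how many recent partners are remembered.
    max_recent: int = 5

    def start_partnership(self, partner_id: int, time: float) -> None:
        self.current_partners.add(partner_id)
        self.recent_partners.append((partner_id, time))
        # Drop the oldest entries beyond the cap.
        while len(self.recent_partners) > self.max_recent:
            self.recent_partners.pop(0)

    def end_partnership(self, partner_id: int) -> None:
        self.current_partners.discard(partner_id)

    def is_infected(self) -> bool:
        if self.last_infection_time is None:
            return False
        return (self.last_recovery_time is None
                or self.last_recovery_time < self.last_infection_time)
```

Keeping the history on the individual, rather than in a separate event log, is what lets a simulated interview or contact-tracing step query a person for recent partners in the same way a field interview would.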

Contact Processes

Contact processes determine the manner in which partnerships are formed and dissolved through time. Contact patterns are determined in GERMS by distributing each individual's partnership potential in "social settings", known in GERMS as "bins". Each bin has both geographic (Cartesian coordinates) and social coordinates (e.g., only individuals from certain social groups can form a relationship in a given bin). The structured mixing models of Jacquez et al. (2) motivated the bin concept. Unlike structured mixing formulations that only allow for assortative mixing, however, the use of two-sided bins allows for a rich combination of assortative ("birds of a feather") and disassortative ("opposites attract") mixing patterns. The extent of assortative mixing can be increased by defining increased numbers of one-sided bins and narrowing the characteristics of the people that can go to those bins. The extent of disassortative mixing can be increased by defining increased numbers of two-sided bins where assignment to one side of the bin or the other is determined by the value of the disassortative variable.

Each individual splits his or her partnership potential amongst one or more bins, so that a fraction, fij, of individual j's partnership potential is available in bin i. The values of fij can be determined by the "edit" group assignment of individuals. For example, individuals destined to be part of a "core" group will have different fij than individuals who are not part of a "core" group. The fij will be positive for bins that allow an individual to enter, and may be greater for geographically "close" bins. Great flexibility for assigning the fij in socially realistic or meaningful patterns is attained by defining simultaneous social and edit group parameters. The details of doing this are beyond the scope of this presentation. The partnership potential for individual j depends on their base partnership potential when single, lj, their damping factor qj, and their current number of partners nj. The partnership potential of individual j in bin i, xij, is then defined as follows.

.

When qj = 0, an individual will be monogamous in their partnerships. Increasing values of qj up to one allow for increasing degrees of concurrent partnerships.
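The role of the damping factor can be illustrated with a minimal sketch. The multiplicative form used here, xij = fij * lj * qj^nj, is an assumption chosen only because it reproduces the behavior described above (full potential when single, no further partnerships when qj = 0, increasing concurrency as qj approaches one); the function and argument names are hypothetical.

```python
def partnership_potential(f_ij, base_potential, damping, n_partners):
    """Partnership potential of individual j in bin i.

    ASSUMED form (not confirmed by the text): x_ij = f_ij * l_j * q_j ** n_j.
    f_ij           fraction of j's potential allocated to bin i
    base_potential l_j, potential when single
    damping        q_j in [0, 1]
    n_partners     n_j, current number of partners
    """
    # Python evaluates 0.0 ** 0 as 1.0, so a single person with q_j = 0
    # keeps full potential, and loses all of it once partnered.
    return f_ij * base_potential * damping ** n_partners


# Usage: a monogamous individual (q = 0) versus a mildly concurrent one (q = 0.1)
single_monogamous = partnership_potential(0.5, 2.0, 0.0, 0)      # full potential in this bin
partnered_monogamous = partnership_potential(0.5, 2.0, 0.0, 1)   # no potential while partnered
partnered_concurrent = partnership_potential(0.5, 2.0, 0.1, 1)   # damped potential
```

Under this assumed form, each additional current partner multiplies the remaining potential by qj.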

The rate of partnership formation between two individuals is a direct function of their xij values. When either only one-sided bins (discussed below) are modeled or when populations in two-sided bins have the same sum of xij values on each side of the bin, then the xij correspond to the expected values of the realized partnership formation rates.

The rate, rijk, of partnership formation between individuals j and k in bin i is defined as follows.

,

assuming that relationship formation is permissible given monogamy constraints and rijk = 0 otherwise.

The decision to define rijk in terms of the arithmetic mean of individual partnership potentials was motivated by three considerations

In GERMS, partnerships form between explicitly modeled individuals and persist over a finite time span. This time span is divided into two phases, the courtship phase and the relationship phase. The relationship phase is defined by the onset of sexual activity and is the period during which infection transmission can occur between two partners. The length of each phase has a Gamma distribution.

Partnership duration and risk behavior

The duration of a partnership is chosen randomly from a Gamma distribution that is specific to each bin. For initial analytic purposes described below, we choose a shape parameter of 1 for the Gamma distribution so that duration corresponds to a negative exponential process with rate s. The rate of risk behavior (h) is also drawn from a distribution that is characteristic of each bin. The transmission probability per risk act is labeled f and does not vary by site of partnership formation. It can vary across both infected individuals and individuals in partnerships with infected individuals and may be a function of stage of infection or gender. Transmission can occur only within the context of a relationship between two individuals.
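As a small illustration of the duration model, the sketch below draws partnership durations from a Gamma distribution using Python's standard library. Parameterizing the Gamma so that its mean is 1/s for any shape is an assumption made here for convenience; with shape 1 it reduces to the negative exponential case with rate s described above. The function name is hypothetical.

```python
import random


def sample_partnership_duration(rate_s, shape=1.0, rng=random):
    """Draw one partnership duration.

    Assumption: the Gamma is parameterized to have mean 1 / rate_s,
    so scale = 1 / (shape * rate_s). shape = 1 gives an exponential
    distribution with rate rate_s.
    """
    scale = 1.0 / (shape * rate_s)
    return rng.gammavariate(shape, scale)


# Usage: durations with a 14-day mean, exponential and shape-2 Gamma
rng = random.Random(42)
exp_durations = [sample_partnership_duration(1 / 14, rng=rng) for _ in range(20000)]
gamma_durations = [sample_partnership_duration(1 / 14, shape=2.0, rng=rng) for _ in range(20000)]
```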

Discrete-Event Simulation Design and Implementation

GERMS models a population of individuals who mingle in various activity settings, form sexual partnerships that persist over time, and who may infect or become infected by a partner. These processes of partnership formation and infection, along with some of their algorithmic complexities, are described below.

Partnership Formation Time

Partnership formation events are sampled by first determining a time of next partnership formation, then determining the bin in which the partnership will be formed, and finally determining the specific individuals in that bin who will form the partnership. Since the rate at which partnerships are formed changes through time as other partnerships are formed or terminated in the population, partnership formation times are randomly sampled in accordance with a non-homogeneous Poisson process. The implementation essentially inverts the cumulative intensity function of a non-homogeneous Poisson process that has its parameters changed at each partnership formation and breakup event.

A potential partnership formation time is sampled at the start time, ta, of every partnership. At that time, each bin is queried for the rate at which partnerships are forming in it and these rates are summed to yield an overall partnership formation rate, R1. The time, Δt, until the next partnership begins is then sampled from an exponential distribution with mean 1/R1 and the next partnership is scheduled to begin at time ts = ta + Δt. A problem with ts arises if another partnership ends at a time te with ta < te < ts. When a partnership ends, the partnership formation rate increases in each bin in which the partners circulate, yielding a new overall rate, R2. Since Δt was generated for a random process having rate R1, it would be incorrect to schedule an event at ts when the process now has rate R2. Therefore, the time remaining until ts is re-scaled by the ratio of the old and new rates as follows.

$$t_s \leftarrow t_e + (t_s - t_e)\,\frac{R_1}{R_2}$$

A similar type of re-scaling occurs for each partnership that terminates prior to ts. While each individual re-scaling might be small, without this adjustment the cumulative effects can cause the xij to differ significantly from observed contact rates in situations where they should correspond to observed contact rates.
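The re-scaling step can be sketched as follows. This is an illustration under the assumption that only the time remaining after the rate change is scaled by the ratio of the old rate to the new rate, which is the standard adjustment for an exponential waiting time; the function and argument names are hypothetical and do not come from GERMS.

```python
def rescale_event_time(t_scheduled, t_change, old_rate, new_rate):
    """Re-scale a pending exponential event time after the overall rate changes.

    t_scheduled  time previously scheduled under old_rate
    t_change     time at which the rate changed (t_change < t_scheduled)
    Assumption: the remaining wait is multiplied by old_rate / new_rate.
    """
    if new_rate <= 0.0:
        return float("inf")  # no partnership can form while the rate is zero
    remaining = t_scheduled - t_change
    return t_change + remaining * old_rate / new_rate


# Usage: an event scheduled at t = 10 under rate 2; at t = 4 the rate doubles,
# so the remaining 6 time units shrink to 3.
new_time = rescale_event_time(10.0, 4.0, 2.0, 4.0)
```

Applying the function once per intervening breakup mirrors the repeated re-scaling described above.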

Randomly Sampling Partners

The probability of partners being sampled from a particular bin is proportional to that bin’s contribution to the overall partnership formation rate, R. Once the bin is determined, the partners must be sampled. In the discussion that follows, only a two-sided bin is considered. The two-sided bin allows for disassortative mixing of many types including heterosexual partnerships. It also allows for assortative mixing with regard to aspects of individuals that do not determine which side of a bin they enter. An analogous process can be constructed for one-sided bins for homosexual partnerships.

The probability of two people being sampled from a given bin is proportional to their xij. Partners are sampled by sampling the first person, p1, then sampling the second person, p2, conditional on the identity of p1. In principle, this process requires checking each pair of individuals in the bin who are not already in a partnership with each other to determine the rate at which they will form a partnership with each other. Once the set of potential partners is determined for p1, the probability distribution over the set must be calculated and the second partner sampled.

An exact implementation of the process described above has a computational time complexity of O(N^2) per partnership formed, where N denotes the population size. Because there are O(N) partnerships per unit simulated time, the overall time complexity of the simulation is O(N^3). A more computationally efficient approach is needed. The probability, pij, of selecting person j from bin i, for individuals j that can form partnerships in bin i, can be approximated as follows:

where wi1 is the number of people on side one of bin i, excluding partnered monogamous people. This approximation is exact for entirely monogamous populations. However, it becomes less accurate as the fraction of polygamous individuals increases because it counts as compatible those people on side two with whom person j is already involved. An overall simulation time complexity of O(N log N) is then achieved by combining this approximation with (1) a balanced binary tree containing a node for each person, and (2) a rejection sampling method to reject partnerships that are incompatible, but were inappropriately sampled because of the approximation. The number of rejections until a compatible partner is found is geometrically distributed and for large populations and/or low partnership formation rates, a partner is quickly determined.
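The combination of tree-based weighted sampling and rejection can be sketched as below. This is not the GERMS implementation: it substitutes an array-backed sum tree for the balanced binary tree, samples each side independently in proportion to stored weights (standing in for the xij), and rejects pairs that are already partnered. All class and function names are hypothetical.

```python
import random


class SumTree:
    """Complete binary tree: leaves hold weights, internal nodes hold subtree sums."""

    def __init__(self, weights):
        self.size = 1
        while self.size < len(weights):
            self.size *= 2
        self.tree = [0.0] * (2 * self.size)
        for i, w in enumerate(weights):
            self.tree[self.size + i] = w
        for pos in range(self.size - 1, 0, -1):
            self.tree[pos] = self.tree[2 * pos] + self.tree[2 * pos + 1]

    def update(self, index, weight):
        """Change one weight and repair the sums on the path to the root: O(log N)."""
        pos = self.size + index
        self.tree[pos] = weight
        pos //= 2
        while pos >= 1:
            self.tree[pos] = self.tree[2 * pos] + self.tree[2 * pos + 1]
            pos //= 2

    def sample(self, rng):
        """Draw an index with probability proportional to its weight: O(log N).

        Requires a positive total weight.
        """
        u = rng.random() * self.tree[1]
        pos = 1
        while pos < self.size:
            left = self.tree[2 * pos]
            # Never descend into a zero-weight right subtree.
            if u < left or self.tree[2 * pos + 1] == 0.0:
                pos = 2 * pos
            else:
                u -= left
                pos = 2 * pos + 1
        return pos - self.size


def sample_pair(side_one, side_two, existing_pairs, rng):
    """Rejection sampling: redraw until the pair is not already partnered."""
    while True:
        j = side_one.sample(rng)
        k = side_two.sample(rng)
        if (j, k) not in existing_pairs:
            return j, k


# Usage: persons 1 and 3 have potential; the others are unavailable (weight 0).
rng = random.Random(1)
tree = SumTree([0.0, 2.0, 0.0, 1.0, 0.0])
draws = [tree.sample(rng) for _ in range(2000)]
```

Setting a partnered monogamous person's weight to zero with `update` keeps that person out of later draws without rebuilding the tree.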

Note that our formulation adds computational complexity in order to generate a process for which expected values can be calculated given specified conditions. Our previously published simulation model (16) has greater computational efficiency but like most transmission simulation models we have seen, it does not have analytical tractability. For some purposes, such as exploring the conditions to achieve eradication, it may be desirable to simulate very large populations. In that case, analytical tractability may have to be sacrificed and our analytically tractable models might be used to provide some validation for models that can handle larger populations at faster speeds.

 

Infection Transmission and Natural History

GERMS currently implements what is termed a Susceptible-Infected-Susceptible (SIS) model of infection transmission. In such a model, individuals are susceptible to infection, become infected or colonized, and subsequently are rid of the organism, returning to the susceptible state. The modeled transmission mechanism is sexual contact. We believe that modeling gonorrhea well enough for transmission system analyses to support effective control of this organism will require incorporating acquired immunity and agent diversity into our model. For initial explorations of surveillance system performance, however, we adopt this more traditional view of the natural history of gonorrhea infection.

When a partnership enters its sexual activity phase, the possibility exists for infection transmission. If exactly one of the partners is infected, a potential time to transmission is sampled from an exponential distribution with mean depending on the rate of contacts and the probability of infection per contact. An infection event is scheduled if the resulting transmission time occurs during the partnership’s relationship phase and before the infected partner is scheduled to recover.

When a person’s infection begins, each of the person’s partners is examined. If a partner is not already infected, then a potential time to transmission is sampled and conditionally scheduled as described above. Finally, the potential exists for re-infection from an infected partner when an infection clears. At that time, each of the newly cured person’s partners is examined and if a partner is infected, then a potential time to transmission is sampled and conditionally scheduled.
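The conditional scheduling rule can be sketched as a single function. It assumes the transmission rate is the product of the contact rate and the per-contact transmission probability, as the description above suggests; the names are hypothetical and event bookkeeping is omitted.

```python
import random


def schedule_transmission(now, contact_rate, p_transmit,
                          relationship_end, recovery_time, rng=random):
    """Return a transmission time, or None if no infection event is scheduled.

    A potential time is drawn from an exponential distribution with
    rate contact_rate * p_transmit (an assumption), and kept only if it
    falls before both the end of the relationship phase and the infected
    partner's scheduled recovery.
    """
    rate = contact_rate * p_transmit
    if rate <= 0.0:
        return None
    candidate = now + rng.expovariate(rate)
    if candidate < relationship_end and candidate < recovery_time:
        return candidate
    return None


# Usage: 3 contacts per week, transmission probability 0.3 per contact
rng = random.Random(0)
t_infect = schedule_transmission(5.0, 3 / 7, 0.3,
                                 relationship_end=1e9, recovery_time=1e9, rng=rng)
```

The same call would be made when a partnership becomes sexually active, when a partner becomes infected, and when a partner of an infected person clears an infection.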

In the near future GERMS will be expanded to include additional notions of infection surveillance and intervention. If an infected individual is cured by some external intervention, such as the administration of antibiotics, then all scheduled transmissions for which that individual is responsible are cancelled.

Simulation Engine and Operational Issues

Previous experience (15,16) suggested that an object-oriented design is well suited to the type of infectious disease modeling being explored here. Other requirements for a simulation package included a complete programming language, the ability to link in C/C++ objects, a mature code base and the availability of technical support. After reviewing both commercial and freely available packages, MODSIM III from CACI Products Company running on Microsoft Windows NT 4.0 was selected.

The experience with MODSIM III has thus far been positive. Technical support has been adequate and no problems have been discovered in the MODSIM III code base. Another desirable feature of MODSIM III is that one specifies processes rather than individual events. Partnership and infection life cycles are naturally expressed with this abstraction. A minor deficit in using MODSIM III is that time and space optimization efforts have been hampered by the lack of suitable profiling tools.

GERMS is run from the command line, reads simulation parameters from several input files and writes several output files. The input files specify the characteristics of the infection, people and bins. Other input parameters allow GERMS to be placed in various special modes that facilitate testing and the construction of regression test suites. The output files optionally contain a record of all simulation events and a report containing measures of interest collected at regular intervals during the simulation run. In addition, the complete state of the simulation is saved at the end and can be read in for a later run. This allows one to run GERMS for a burn-in period, save the state, and start future simulation experiments from this state.

Population Editor Interface

Previous experience and early analysis requirements for the current work indicated that a graphical input file generator would be useful. Such a tool would allow for easier creation and visualization of a population (and the associated input files) on which to base a set of simulation runs. The Population Editor Interface was created for this purpose.

The Population Editor allows for the easy creation of an arbitrary number of bins and sub-groups of individuals. Using a point-and-click interface, bins can be precisely placed on a geographical map and their attributes (e.g., partnership potential for individuals circulating in the bin) modified. Similarly, groups of individuals can be created with similar attributes (e.g., gender and sexual preference) and randomly distributed over the map. Clicking on a specific individual displayed on the map pops up a dialog box containing the values of the individual’s attributes. This dialog box allows one to fine-tune attributes of individuals.

Several features make it easy to identify individuals or groups of individuals of interest. A "zoom" feature makes it easy to zero-in on individuals in a given geographic area. For identifying groups of individuals with similar attributes, the Population Editor allows the construction of simple queries against which the population is searched. Individuals matching the query can be color-coded for easy identification.

Figure 2:

A screen shot of the Population Editor Interface after 2000 individuals (small dots) and 10 bins (small circles) are defined.

In addition to allowing for the specification of bins and populations, the Population Editor also allows one to specify parameters associated with the infection process; i.e. infection duration, probability of transmission per sex act, and recovery rate. Finally, the Population Editor allows one to specify the number of reporting intervals and the duration of each interval.

The Population Editor Interface was developed using Microsoft Visual Basic 5.0 on the Microsoft Windows NT 4.0 platform. The ability to rapidly prototype graphical interfaces was a main consideration in the choice of Visual Basic. Elaboration of this interface is planned to facilitate the use of various design of experiment approaches that allow for optimal sensitivity analysis with the least number of simulations.

Analysis of Reinfection dynamics

The model we have just presented is significantly different from compartmental SIS models where all compartments are segments of populations whose basic units are individuals. One difference is that during a relationship it is possible for an infected partner to re-infect an individual after they have cleared an infection. Thus, it is possible for a "paddle-ball" infection pattern to occur where one clears an infection but, because a partner remains infected, one becomes re-infected. Infection can also be passed back and forth between partners in a "ping pong" fashion. We have mathematically analyzed the effects of re-infection in our model. This analysis is useful both for gaining insights into transmission dynamics and for validating our computer code. We now discuss our analysis of endemic states. Next, we discuss concepts of R0 related to our model. Then we present additional validation that is possible through the use of compartmental models that formulate compartments where the basic units are pairs of individuals rather than individuals.

 

Endemic levels given reinfection after recovery during partnerships

We now derive some infection prevalence measures given reinfection for the simplified SIS model. We are interested in pseudo-equilibrium levels of prevalence (15). We examine here the simplified case where all partnerships are monogamous, where there are equal numbers of males and females, and where Markov assumptions hold so that times to events like next partnership formation, end of infection, and end of partnership are memoryless and thus have a negative exponential distribution. Since the prevalence in the partnered and single populations is not the same, we introduce the following notation in Table 1:

Table 1

Notation for analyzing endemic states

| Notation | Meaning |
| --- | --- |
| pu | Fraction of single individuals that are infected the instant before starting a relationship. |
| pp | Fraction of partnered individuals that are infected the instant before a relationship terminates. |
| pf,j | Fraction of partnerships that terminate with exactly j infected individuals (j = 0, 1, 2) |
| gi,j | Fraction of partnerships that terminate with j infected individuals, assuming that the partnership started with exactly i infected individuals |

The transitions for monogamous partners in these terms are shown in Figure 3. Infection lasts d time units on average, with durations following a negative exponential distribution. Partnerships break up at the rate s. Potentially transmitting contacts are made at the rate f and the transmission probability per contact is h.

Figure 3

Transitions in the model analyzed for endemic states and reproduction numbers

We will derive formulations for the pp and pu in terms of basic model parameters. We do so by relating these two entities in two different ways so that we have two equations for these two unknown prevalence values that we can then solve.

First, someone infected at the end of a relationship is infected at the start of the next relationship with probability x/(x+1/d). This leads to the relation

$$p_u = p_p\,\frac{x}{x + 1/d}$$

We now establish a second relationship between pu and pp by analyzing the infection process during a relationship. To determine the pf,j, the values of gi,j are required, as well as the probability that a relationship starts with a given number of infected individuals. Assuming uniform mixing, the probability that a partnership starts with 0 (1 and 2, respectively) infected individuals is (1 - pu)^2 (2 pu (1 - pu) and pu^2, respectively). The gi,j are readily determined from conditioning on the next event (infection, recovery, or break-up), whose continuous rates are illustrated in Figure 3. The rates are readily translated into the following relationships:

These relationships imply that:

Similarly,

By noting that , it is easily verified that , as required. The fraction of infected individuals at the instant a partnership is broken up is then:

where

Combining terms:

Recalling that and collecting terms leads to

This equation has two roots. The first is the stationary prevalence, pu = 0 , and the other leads to the pseudo-equilibrium prevalence, assuming the value is non-negative:

Given the strong Markov character of this model, the pseudo-equilibrium prevalence can be found to be

Confirming that simulations reproduce theoretical relationships

A major advantage of GERMS is that closed-form solutions are available for simplified input parameter settings. We used the above theoretically derived relationships to conduct simulation experiments on test populations created with the Population Editor to test whether or not the simulation response indeed corresponded to theoretical values. We describe here some results that confirm correct implementation of the population dynamics and SIS infection process.

For the simplified test populations, we assumed a closed population (no recruitment or departures) of N=1000 males and N=1000 females that have potential only for monogamous relationships in a single bin. All individuals have the same partnership potential x=1/14 per day. Partnerships break up at rate s=1/14 per day, and the rate of contact during a partnership is f=3/7 per day (3 per week), with per-contact transmission probability h=0.3 when exactly one partner is infected. The infection duration is exponentially distributed with mean d=55 days, a value that is reasonable for untreated gonorrhea.

For all stochastic transmission models, the state of the model with no infected individuals represents a sink so that the stationary distribution has 0 prevalence. However, in our model the stationary zero prevalence level may take an extremely long time to reach. Infection levels hover around a pseudo-equilibrium endemic level of infection for extended periods of time (15). For these experiments, the pseudo-equilibrium level is the prevalence level such that a newly infected individual infects one additional person, in expectation. Building on the math from the previous section, we multiplied pp by the expected number of partnerships at equilibrium (N times the partnered fraction x/(x+s)), by 2 individuals per partnership, and by the rate s at which partnerships break up, to get a daily incidence. Since we were collecting data from the simulation at 30-day intervals, we multiplied this daily incidence by 30 to correspond to the data collected from the simulation. This has the value

or

people infected at the time of partnership breakup, per 30 day period, assuming the units of rates are per day. This then provides a theoretically derived value that we can compare to the value observed in the simulation implementation of our model.
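The calculation can be retraced numerically with a short sketch. It is a reconstruction under stated assumptions rather than the authors' own formulas: within a partnership, transmission occurs at rate hf when exactly one partner is infected, each infected partner recovers at rate 1/d, and break-up occurs at rate s; partners are assumed to mix uniformly at formation; and pu = pp x/(x+1/d). The function names and the intermediate quantities e1 and e2 are hypothetical.

```python
def expected_infected_at_breakup(h, f, d, s):
    """Expected number infected at break-up, starting with 1 (e1) or 2 (e2) infected.

    Conditioning on the next event (transmission, recovery, break-up):
      e1 = (s * 1 + h*f * e2 + (1/d) * 0) / (h*f + 1/d + s)
      e2 = (s * 2 + (2/d) * e1) / (2/d + s)
    """
    a, r = h * f, 1.0 / d
    t1 = a + r + s
    t2 = 2.0 * r + s
    e1 = (s / t1 + (a / t1) * (2.0 * s / t2)) / (1.0 - (a / t1) * (2.0 * r / t2))
    e2 = (2.0 * s + 2.0 * r * e1) / t2
    return e1, e2


def pseudo_equilibrium(x, s, f, h, d):
    """Return (pu, pp) at the non-zero root, or (0, 0) if infection cannot persist."""
    e1, e2 = expected_infected_at_breakup(h, f, d, s)
    c = x / (x + 1.0 / d)              # pu = c * pp
    if c * e1 <= 1.0:
        return 0.0, 0.0
    # From pp = pu (1 - pu) e1 + pu^2 e2 / 2 with pu = c * pp
    pp = (c * e1 - 1.0) / (c * c * (e1 - e2 / 2.0))
    return c * pp, pp


def infected_at_breakup_per_period(x, s, f, h, d, n_per_gender=1000, days=30):
    """Individuals infected at partnership break-up, per period of `days` days."""
    _, pp = pseudo_equilibrium(x, s, f, h, d)
    partnerships = n_per_gender * x / (x + s)
    return days * 2.0 * partnerships * s * pp


# Usage: base-case parameters (rates per day)
base = infected_at_breakup_per_period(x=1 / 14, s=1 / 14, f=3 / 7, h=0.3, d=55.0)
```

With the base-case parameters this sketch comes out close to the base-case theoretical value reported in Table 2, which gives some reassurance that the assumed transition structure matches the model described here.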

Table 2 summarizes the first set of initial experiments with the above parameters, as well as with some variations on the parameters. Each variation on the base case changes one parameter at a time. In each simulation the population mixed for 1 year before infection was introduced, and initial analysis indicated that the initial transient lasted approximately an additional 3-5 years. Simulation statistics are based on batch mean analysis with 16 batches of 1 year, and give a 95% confidence interval/credible set using a t-statistic approximation. The analysis was roughly the same for 32 batches of 6 months. Statistical tests for the independence of batches indicated that most often the correlation between batches is not significant, but that there may be a potential positive correlation between adjacent batches for the Base Case, so the CI may be slightly overconfident for that case.
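A batch means interval of the kind described above can be sketched as follows. This is a generic illustration, not the analysis code used for the experiments; the function name is hypothetical, and the critical value is hard-coded for 16 batches (15 degrees of freedom).

```python
import math
from statistics import mean, stdev

# Two-sided 95% critical value of Student's t with 15 degrees of freedom
T_975_DF15 = 2.131


def batch_means_ci(series, n_batches=16, t_crit=T_975_DF15):
    """95% interval for the long-run mean from equal-sized, consecutive batches.

    series  observations after the initial transient has been discarded
    t_crit  must match n_batches - 1 degrees of freedom if n_batches changes
    """
    size = len(series) // n_batches
    batch_means = [mean(series[b * size:(b + 1) * size]) for b in range(n_batches)]
    centre = mean(batch_means)
    half_width = t_crit * stdev(batch_means) / math.sqrt(n_batches)
    return centre - half_width, centre + half_width


# Usage with a toy series of 160 values (10 per batch)
low, high = batch_means_ci(list(range(160)))
```

The interval assumes the batch means are approximately independent, which is the condition the batch independence tests mentioned above are meant to check.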

Table 2:

Theoretical and Simulation Estimates for Number Infected at End of Partnership, Per 30 days when the system has come to pseudo-equilibrium

| Experiment | Theoretical value | Observed 95% interval |
| --- | --- | --- |
| Base Case | 710.6 | (686, 737) |
| Base, but x=1/7 | 1692.4 | (1664, 1705) |
| Base, but s=1/21 | 572.7 | (542, 587) |
| Base, but h=0.5 | 1228.0 | (1203, 1235) |
| Base, but d=50 | 457.3 | (454, 504) |

All the theoretical values fell within the interval estimates from simulation, supporting the assertion that the coding and the analysis are correct. In the last row (d=50), we note that the interval is not well centered on the predicted value. That could be due to chance but we are exploring whether it is due to the fact that the system had not yet come into equilibrium. As can be appreciated from Table 2, the system is markedly sensitive to the duration of infection. Kretzschmar and Dietz have commented upon similar sensitivities with regard to HIV transmission (17).

 

Confirmation by continuous compartmental model analogues

The analyzability of our stochastic model formulation was further demonstrated by showing that the theoretical counts predicted on the basis of the above mathematics correspond precisely to the equilibrium values of continuous deterministic models where populations of individuals not in partnerships and populations of partnerships are separately modeled. The continuous deterministic model corresponding to the above stochastic model is as follows:

Differential equations for individuals

where

Irs is the number of individuals of gender r (m or f for male or female) in infection state s (i or u for infected or uninfected)

Pmf is the number of couples where the males (indicated by the first subscript) and females (indicated by the second subscript) have infection status i or u for infected or uninfected

s is the rate of couple dissolution

x is the rate of partnership formation (partnership potential in non-homogeneous populations.)

d is the duration of infection

The four equations are for males and females in infected and uninfected states. In each equation the first term represents individuals coming from the break up of partnerships. The second term represents individuals forming partnerships. There are no new infections in the population of individuals because transmission only occurs during partnerships. Infected individuals, however, do revert to the uninfected state via the last term in the equations.

Differential equations for partnerships

where

h is the transmission probability per sexual contact in a partnership

f is the rate of sexual contact in a partnership.

The terms in the first parentheses in each of these equations could be simplified in a model like the one under consideration where there is exact equality between males and females. But we separate them to emphasize that the two terms in those parentheses must be equal. When the number of single males in infected and uninfected states does not equal the number of single females in those states, then the value of x must be calculated separately for males and females to balance the two equations. Our use of the additive balance in our simulation facilitates the calculation of a similar balance in differential equation models like this. The next term represents the break up of partnerships. Subsequent terms represent the loss of infection in partnerships that causes their status to shift between the four compartments. The final term in the two cases where only one partner is infected represents the only place in the model where infection is transmitted.
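A pair-formation model of this kind can be sketched numerically. The equations coded below are an assumed form written to follow the term-by-term description above (singles fed by break-ups, depleted by partnership formation, recovering at rate 1/d; partnerships formed by averaging the male-side and female-side formation terms, dissolved at rate s, and transmitting at rate hf when exactly one partner is infected). Variable names and the simple Euler integration are choices made for this illustration only.

```python
def pair_model_equilibrium(x=1 / 14, s=1 / 14, f=3 / 7, h=0.3, d=55.0,
                           n_per_gender=1000.0, seed=100.0,
                           t_end=10000.0, dt=0.1):
    """Integrate an ASSUMED pair-formation SIS model to (near) equilibrium."""
    r, a = 1.0 / d, h * f
    um = uf = n_per_gender - seed      # single uninfected males / females
    im = i_f = seed                    # single infected males / females
    puu = piu = pui = pii = 0.0        # couples: (male status, female status)
    for _ in range(int(t_end / dt)):
        singles_m, singles_f = um + im, uf + i_f
        mix = 0.0
        if singles_m > 0.0 and singles_f > 0.0:
            # Average of the male-side and female-side formation terms
            mix = 0.5 * x * (1.0 / singles_f + 1.0 / singles_m)
        fuu, fiu = mix * um * uf, mix * im * uf
        fui, fii = mix * um * i_f, mix * im * i_f
        d_um = s * (puu + pui) - x * um + r * im
        d_im = s * (piu + pii) - x * im - r * im
        d_uf = s * (puu + piu) - x * uf + r * i_f
        d_if = s * (pui + pii) - x * i_f - r * i_f
        d_puu = fuu - s * puu + r * (piu + pui)
        d_piu = fiu - (s + r + a) * piu + r * pii
        d_pui = fui - (s + r + a) * pui + r * pii
        d_pii = fii - (s + 2.0 * r) * pii + a * (piu + pui)
        um += dt * d_um
        im += dt * d_im
        uf += dt * d_uf
        i_f += dt * d_if
        puu += dt * d_puu
        piu += dt * d_piu
        pui += dt * d_pui
        pii += dt * d_pii
    infected_in_couples = piu + pui + 2.0 * pii
    return {
        "prevalence": im + i_f + infected_in_couples,
        "breakup_count_30d": 30.0 * s * infected_in_couples,
    }


# Usage: base-case parameters (rates per day, 1000 males and 1000 females)
result = pair_model_equilibrium()
```

Under these assumptions the sketch settles close to the base-case values reported for the continuous model, which suggests the assumed equations capture the structure described, though they may differ in presentation from the authors' own.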

The equilibrium values for the number of infected individuals at partnership break up over 30 days were found by numerical analysis of the above equations to be exactly the same as the values in Table 2 derived through analysis of the stochastic process. The same is true for the equilibrium prevalence. The confidence intervals for the prevalence and incidence of infection by gender in our simulations were found to contain the values predicted by this deterministic model, just as they contained the predicted number of infected individuals at partnership break up. This is shown in Table 3.

Table 3

Comparison of stochastic and deterministic model results for a heterosexual pairing model with constant, homogeneous populations and rates

| Experiment | Continuous model endemic prevalence | Observed mean prevalence | 95% confidence interval |
| --- | --- | --- | --- |
| Base Case | 596.0 | 596.37 | (576.47, 616.27) |
| Base, but x=1/7 | 1140.1 | 1139.32 | (1125.54, 1153.10) |
| Base, but s=1/21 | 613.9 | 605.98 | (582.83, 629.12) |
| Base, but h=0.5 | 1029.8 | 1025.58 | (1014.58, 1036.59) |
| Base, but d=50 | 380.2 | 398.52 | (378.28, 418.76) |

The patterns are very similar to the number of cases at break up of partnerships. Again, the sensitivity to duration of infection is notable. It is also worth noting that the prevalence goes up as the partnership duration increases although the number of cases at break up of partnerships goes down with this change. That is because the fraction infected of paired individuals must always be greater than the fraction infected of unpaired individuals. The fraction infected in both paired and unpaired groups goes down as partnership duration is increased, but the increase in the fraction that is paired is still enough to raise the overall prevalence.

So that the relationship between prevalence and number of infected individuals at partnership break up can be appreciated, and so that variance over time can be appreciated, we present both values in the following graph. The prevalence values are the average over the 30-day period and thus are somewhat less variable than the counts at ends of partnerships. The tracking of these two measures, however, is evident.

Figure 4

Simulation Output Aggregated at 30-day intervals for a model of monogamous partnerships, homogeneous populations with equal numbers of males and females having identical partnership potential, and exponential rate processes

 

Some behavior of this transmission system

The effects of additional factors on infection levels are examined in Table 4. Notable observations in this table are the sensitivity to the form of the distribution for mean duration of infection observed in comparing runs C and D and the marked sensitivity to concurrency of partnerships seen in E through G. With a concurrency damping factor value of 0.1, the total number of partnerships in the simulation increases by only 10%. This, however, doubles the frequency of infection at the end of partnerships. When only 10% of the population has polygamous relationships at a level that increases the total number of partnerships in the population by only 1%, this increases the frequency of infection at the end of partnerships by 24%. If this 1% is located such that polygamous individuals are more likely to encounter each other, as per the conditions in G, the increase is 29%. We have not conducted detailed analyses of the model to assess sensitivities more formally as we will later. These preliminary examinations were conducted mainly as tests to see how the model was functioning.

Table 4

Endemic Infection levels for various model modifications

| Description of run | 30 day infected count at breakup | 95% CI (t-stat) | Observed mean prevalence | 95% CI (t-stat) |
| --- | --- | --- | --- | --- |
| Base Case$ | 711.45 | (685.8, 737.2) | 596.4 | (576.5, 616.3) |
| A) Base case, except 900 males | 688.45 | (668.0, 708.9) | 578.9 | (561.9, 595.8) |
| B) Base case, except fmale to female = .7, ffemale to male = .3 | 1106.24 | (1095.4, 1117.1) | 927.5 | (918.9, 936.1) |
| C) Base case, except duration of infection. Male: exp(30), Female: exp(45) | 80.37 | (66.38, 94.36) | 64.3+ | (53.55, 75.12)+ |
| D) Base case, except distribution of duration of infection. Male: Gamma(2, 30), Female: Gamma(2, 45) | Goes to zero | N.A. | Goes to zero | N.A. |
| E) Base case, except everyone is polygamous (damping factor = 0.1) | 1423.52 | (1405.6, 1441.4) | 994.31 | (983.5, 1005.1) |
| F) Base case, except 100 males polygamous (damping factor = 0.1) and 100 females polygamous (damping factor = 0.1) | 879.59 | (859.1, 900.0) | 712.51 | (697.1, 727.9) |
| G) Same as F, except polygamous are equally in 2 bins and monogamous are in only 1 bin | 915.42 | (895.0, 935.9) | 726.48 | (712.5, 740.5) |
| H) Base case, except 2000 males, 2000 females, 2 bins: 1000 males, 1000 females in 1 bin; 1000 males, 1000 females in other bin | 1413.37 | (1383.9, 1442.8) | 1185.04 | (1163.1, 1206.9) |

 

Basic Reproduction Numbers

We derived the pseudo-equilibrium levels of infection above by analyzing stochastic processes involving individuals. Pseudo-equilibrium levels have also been estimated in other contexts using the formula 1-1/R0. We therefore carried on our analysis to determine R0 when reinfection is possible.

Note that basic reproduction numbers can be derived as characterizing individuals or populations. Only under homogeneous conditions do the two characterizations coincide (3). In the base case considered above, heterogeneity arises only because some infected individuals are single, some are partnered with uninfected individuals, and some are partnered with other infected individuals. With only this degree of heterogeneity, we can still calculate basic reproduction numbers but we must specify these three conditions. We therefore make the following definitions:

Table 5

Definitions of different individual based basic reproduction numbers

| Notation | Meaning |
| --- | --- |
| X or R0\|X | The expected number of new infections directly caused by an infected individual given that they have just ended a relationship. |
| Y or R0\|Y | The expected number of new infections directly caused by an infected individual given that they just started a relationship with an uninfected individual. |
| Z or R0\|Z | The expected number of new infections directly caused by an infected individual given that they just started a relationship with an infected individual. |

By conditioning on the next state transition associated with a given individual (partnership formation, recovery, infection of partner, etc.), and assuming that pu is a good representation of the fraction of infected individuals that are single, the following relationships can be derived:

Substituting Z into the equation for Y gives:

Therefore,

and algebra indicates that

We can substitute Z now into the equation for Y to obtain:

or:

which simplifies to

The above expressions for Y and Z can be substituted into the equation for X to get:

Collecting terms gives:

This simplifies to:

and further simplifies to

Note that as hf approaches infinity (implying certainty of transmission) with pu = 0 (essentially nobody else is infected), X becomes x(ds+2)/(x+s+1/d). This exceeds the analogous result, x(ds+1)/(x+s+1/d), obtained when reinfection is not allowed. Thus under a wide range of conditions it is quite important that stochastic models of transmission include the possibility of reinfection in the course of ongoing contact with an individual. We note that several stochastic models seem not to do this, but their documentation is often inadequate to establish how the model treats the potential for reinfection.
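The two limiting expressions can be checked numerically. The following is a minimal sketch and not the derivation used in this paper. It assumes exponential waiting times, partnership formation at rate x, separation at rate s, recovery of each partner at rate 1/d, instantaneous transmission (hf approaching infinity), and pu = 0. The state equations, the extra state W (infected individual whose partner has recovered and cannot be reinfected), the function names, and the parameter values are all illustrative assumptions.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A u = b exactly by Gauss-Jordan elimination over the rationals."""
    n = len(b)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot_row = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot_row] = M[pivot_row], M[col]
        pivot = M[col][col]
        M[col] = [v / pivot for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def x_with_reinfection(x, s, d):
    """R0|X when a recovered partner can be reinfected (assumed equations).

    X = a * Y                         with a = x / (x + 1/d)
    Y = 1 + Z                         (transmission is immediate)
    Z = (s*X + (1/d)*(1 + Z)) / (s + 2/d)
    """
    r = 1 / d
    a = x / (x + r)
    A = [[1, -a, 0],
         [0, 1, -1],
         [-s, 0, s + r]]
    return solve(A, [0, 1, r])[0]


def x_without_reinfection(x, s, d):
    """R0|X when a recovered partner cannot be reinfected (assumed equations).

    W is the value of being infected with a recovered, non-reinfectable partner:
    Z = (s*X + (1/d)*W) / (s + 2/d),  W = s*X / (s + 1/d)
    """
    r = 1 / d
    a = x / (x + r)
    A = [[1, -a, 0, 0],
         [0, 1, -1, 0],
         [-s, 0, s + 2 * r, -r],
         [-s, 0, 0, s + r]]
    return solve(A, [0, 1, 0, 0])[0]


# Hypothetical parameter values, chosen only for illustration.
x, s, d = Fraction(1, 14), Fraction(1, 7), Fraction(30)
print(x_with_reinfection(x, s, d), x * (d * s + 2) / (x + s + 1 / d))
print(x_without_reinfection(x, s, d), x * (d * s + 1) / (x + s + 1 / d))
```

Under these assumptions the two solved values agree exactly with x(ds+2)/(x+s+1/d) and x(ds+1)/(x+s+1/d), and the difference between them is the contribution of reinfection within an ongoing partnership.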

In our restricted model with exponential durations of infection and pure monogamy, these R0|X, R0|Y, R0|Z values correspond to someone coming into the population from the outside. Subsequent cases arising within the population would only start in partnerships where the partner is infected and was in fact the infecting source. The values of these R0 for the initial set of conditions that we used to demonstrate the validity of our simulation implementation are shown below.

Table 6

Pseudo-equilibrium prevalence and conditional basic reproduction numbers at various parameter settings in a model with homogeneous and equal populations of males and females and exponential rates of partnership breakup and infection elimination

| Experiment | Continuous Model Endemic Prevalence | R0\|X (Unpartnered) | R0\|Y (Uninfected partner) | R0\|Z (Infected partner) |
| --- | --- | --- | --- | --- |
| Base Case | 596.0 | 1.43 | 1.79 | 1.25 |
| Base, but x=1/7 | 1140.1 | 1.98 | 2.23 | 1.69 |
| Base, but s=1/21 | 613.9 | 1.45 | 1.82 | 1.22 |
| Base, but h=0.5 | 1029.8 | 1.75 | 2.19 | 1.53 |
| Base, but d=50 | 380.2 | 1.31 | 1.68 | 1.14 |

The differences among the three conditional basic reproduction numbers are marked. None of them alone determines the pseudo-equilibrium prevalence.
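One way to see this is to ask what population size each row of Table 6 would imply if prevalence were given by N(1-1/R0) with a single conditional R0. The following is a rough sketch, not part of our analysis. It assumes that the prevalence column is a count of infected individuals and that the total population N is the same in every experiment; the function names are illustrative.

```python
# Rows of Table 6: (experiment, endemic prevalence, R0|X, R0|Y, R0|Z)
TABLE_6 = [
    ("Base Case", 596.0, 1.43, 1.79, 1.25),
    ("Base, but x=1/7", 1140.1, 1.98, 2.23, 1.69),
    ("Base, but s=1/21", 613.9, 1.45, 1.82, 1.22),
    ("Base, but h=0.5", 1029.8, 1.75, 2.19, 1.53),
    ("Base, but d=50", 380.2, 1.31, 1.68, 1.14),
]


def implied_population(prevalence, r0):
    """Population size N for which prevalence = N * (1 - 1/R0)."""
    return prevalence / (1.0 - 1.0 / r0)


def spread(rows, column):
    """Ratio of largest to smallest implied N, using R0 column 0 (X), 1 (Y) or 2 (Z)."""
    sizes = [implied_population(row[1], row[2 + column]) for row in rows]
    return max(sizes) / min(sizes)


for column, label in enumerate(["R0|X", "R0|Y", "R0|Z"]):
    print(label, round(spread(TABLE_6, column), 2))
```

If any one of the three numbers determined prevalence through 1-1/R0, the implied N would be constant across rows and each ratio would be 1. Under the stated assumptions every ratio is well above 1.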

Developing network measures

We are developing new network measures from Time Directed Partner Graphs (TDPG). Unlike social network graphs, where the nodes are individuals, relationships between individuals constitute the nodes of a TDPG. We were stimulated to take this approach by the work of Morris and Kretzschmar (11-13). Our approach differs from theirs in that we use time relationships to establish directed connections. This allows us to better capture dynamics. A directed connection between two nodes is defined when one individual is common to both nodes and the timing of formation and dissolution of the two relationships meets specified criteria chosen to reflect the potential for infection transmission. Our approach reduces to the Morris and Kretzschmar approach under specified timing rules.

The data from the simulation include when a relationship began and when it ended as well as the members of the relationship. Such data is presented in the table below, where the timing data is presented graphically in the third column. We have devised several sets of rules, with increasing degrees of relevant detail, for establishing links from such data. Each set specifies the timing of events that establishes a directed link between relationships. In columns 4 and 5 of the following table we have constructed TDPG using "Pure subject" rules that are presented after the graph. Columns 4 and 5 use different values of the parameter Pd, which corresponds to the duration of contagiousness, in the graph construction.

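Table 7 / Figure 5

The data structure and construction of Time Directed Partner Graphs

| Node # | Members | Relationship timeline (time units 1 to 25) | TDPG, Pd=6 | TDPG, Pd=11 |
| --- | --- | --- | --- | --- |
| 1 | A, B | \|-------\| | | |
| 2 | A, C | \|-----\| | | |
| 3 | A, D | \|------ | | |
| 4 | B, E | \|-------\| | | |
| 5 | C, F | \|-\| | | |
| 6 | C, G | \|--\| | | |
| 7 | D, E | \|-\| | | |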

"Pure Partner" linking rules used for the graphs in table 1

Points in time are specified as "Tx" and periods of time as "Px". The definitions and rules for linking relationship j with a directed arrow to relationship i are:

  1. "Tbj" is the beginning date of a relationship j.
  2. "Tej" is the ending date of a relationship j.
  3. "Tbi" is the beginning date of a relationship i.
  4. "Tei" is the ending date of a relationship i.
  5. "Pd" is a fixed period with a length equal to the combined incubation period and duration of contagiousness.
  6. Relationship i has a directed link to relationship j if the two relationships share at least one member and the following two conditions are both met:
    1. Tbj < {Tei + Pd}
    2. Tbi < Tej
The adjacency matrix corresponding to the TDPG illustrated in Table 7-Figure 5 has entries of 0 and 1, with undefined entries along the diagonal (we have not yet established rules for reinfection). These matrices are the analytic tool we use.

The source count for a particular node ni is the number of other nodes from which that node can be reached in the directed graph. We have developed simple matrix multiplication methods to calculate this. The number of common ancestors within a specified number of generations can also be calculated using simple matrix multiplication routines.
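As a rough illustration of how such counts can be obtained from an adjacency matrix, the sketch below uses repeated boolean matrix multiplication. It is not the routine we use; the small matrix is hypothetical, diagonal entries are treated as 0, and the function names are invented for this example.

```python
# Hypothetical 5-node adjacency matrix; ADJ[i][j] == 1 means a link from i to j.
ADJ = [
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
]


def bool_matmul(a, b):
    """Boolean matrix product: entry (i, j) is 1 if some k has a[i][k] and b[k][j]."""
    n = len(a)
    return [[int(any(a[i][k] and b[k][j] for k in range(n))) for j in range(n)]
            for i in range(n)]


def reach_within(adj, generations):
    """reach[i][j] == 1 if j can be reached from i in 1..generations steps."""
    n = len(adj)
    reach = [row[:] for row in adj]
    power = [row[:] for row in adj]
    for _ in range(generations - 1):
        power = bool_matmul(power, adj)
        reach = [[reach[i][j] | power[i][j] for j in range(n)] for i in range(n)]
    return reach


def source_counts(adj):
    """Number of other nodes from which each node can be reached."""
    n = len(adj)
    reach = reach_within(adj, n - 1)  # paths longer than n - 1 steps add nothing new
    return [sum(reach[i][k] for i in range(n) if i != k) for k in range(n)]


def common_ancestors(adj, generations):
    """Entry (i, j) counts nodes that reach both i and j within the given generations."""
    n = len(adj)
    reach = reach_within(adj, generations)
    return [[sum(reach[k][i] * reach[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


print(source_counts(ADJ))              # [0, 2, 3, 1, 0]
print(common_ancestors(ADJ, 1)[1][2])  # 1
print(common_ancestors(ADJ, 2)[1][2])  # 2
```

The common ancestor matrix is the product of the transposed reachability matrix with itself, so both statistics reduce to matrix products of the kind described above.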

In developing methods to use site of partnership formation and duration of courtship to provide information about network characteristics, we intend to first relate infection prevalence and the importance of particular individuals to network statistics based on source or ancestor counts derived from the TDPG. We then intend to explore statistics based on assumptions about the distribution of partners encountered at different sites and involving different courtship intervals to explore how those statistics will relate to ancestor or source counts. Finally, we will see how effective the best statistics can be in helping to reach the stated surveillance system goals.

Future Plans

We are planning various improvements to the speed of the simulations. By adopting tactics that make closed form solutions or correspondence with deterministic models difficult or impossible, we can gain considerable speed. The analyzability, however, is a characteristic we will work hard to preserve. Infection detection mechanisms and surveillance systems are now being added. We expect to begin evaluating gonorrhea surveillance systems in about 6 months. Tsung-Wen Kuo will write his thesis relating these simulation evaluations to the fieldwork that he has done in Genesee County. Chris Riolo will pursue a thesis on how the time directed partner graphs can be used both to help assess the effects of contact patterns on transmission levels and to define field-practicable network measures using social setting of partnership formation and courtship periods. Lynnette Rodriguez will use the simulation to write a thesis on how transmission system conformation affects the robustness of epidemiological analyses to the violation of independence assumptions. We suspect that the robustness will very much depend upon transmission system conformation.

We very much want to make our simulations user-friendly tools for all epidemiological theoreticians. We encourage collaborative projects using the model. We hope that experience with such projects will enable us to provide a version of the software to colleagues. A MODSIM III license will, however, be required to make any modifications to the code.

References

  1. Jacquez JA, Koopman JS, Simon C, Sattenspiel L, Perry T. Modeling and the analysis of HIV transmission: The effect of contact patterns. Math Biosci. 1988; 92:119-199
  2. Jacquez JA, Simon CP, and Koopman JS. Structured mixing: Heterogeneous mixing by the definition of activity groups. In Mathematical and Statistical Approaches to AIDS Epidemiology. Castillo-Chavez C, ed. Springer-Verlag Lecture Notes in Biomathematics. 1989; 83:301-315.
  3. Koopman JS, Longini IM, Jacquez JA, Simon CP, Martin W, and Woodcock D. Assessing risk factors for transmission. Am J Epidemiol. 1991; 133(12).
  4. Blythe SP. Castillo-Chavez C. Palmer JS. Cheng M. Toward a unified theory of sexual mixing and pair formation. Mathematical Biosciences. 107(2):379-405, 1991
  5. Busenberg S. Castillo-Chavez C. A general solution of the problem of mixing of subpopulations and its application to risk- and age-structured epidemic models for the spread of AIDS. IMA Journal of Mathematics Applied in Medicine & Biology. 8(1):1-29, 1991
  6. Castillo-Chavez, C. (ed.). 1989. Mathematical and statistical approaches to AIDS epidemiology. Lecture Notes in Biomathematics, 83, Springer-Verlag, Berlin, Heidelberg, New York, London, Paris, Tokyo, Hong Kong
  7. Castillo-Chavez, C. Busenberg, S., Gerow, K. (1991). Pair formation in structured populations. In Differential equations with applications in biology, physics, and engineering, Goldstein, J.A., Kappel, F. Schappacher, W. (eds.), Lecture Notes In Pure And Applied Mathematics 133. New York, Basel, Hong Kong, Marcel Dekker, Inc.
  8. Garnett GP. Anderson RM. Balancing sexual partnerships in an age and activity stratified model of HIV transmission in heterosexual populations. IMA Journal of Mathematics Applied in Medicine & Biology. 11(3):161-92, 1994
  9. Ghani AC. Swinton J. Garnett GP. The role of sexual partnership networks in the epidemiology of gonorrhea. Sexually Transmitted Diseases. 24(1):45-56, 1997
  10. Koopman JS, Jacquez JA, Simon CP, Foxman B, Pollock S, Barth-Jones D, Adams A, Welch G, Lange K. The role of primary HIV infection in the spread of HIV through populations. JAIDS and HR 1997; 14:249-258.
  11. Morris M, Kretzschmar M. Concurrent partnerships and the spread of HIV. AIDS 1997;11:641-648
  12. Morris M, and Kretzschmar M. Concurrent partnerships and transmission dynamics in networks. Social Networks; 1995;17:299-318
  13. Kretzschmar M. Morris M. Measures of concurrency in networks and the spread of infectious disease. Mathematical Biosciences. 133(2):165-95, 1996 Apr 15
  14. Welch G, Chick SE, Koopman JS. Effect of Concurrent Partnerships and Sex-act Rate on Gonorrhea Prevalence. Simulation. 1998;71(4):242-9
  15. Jacquez JA, Simon CP. The Stochastic SI model with Recruitment and Death: 1 Comparison with the closed SIS model. Mathematical Biosciences 1993;117:77-125
  16. Adams A, Barth-Jones D, Chick SE, Koopman JS. Simulations to evaluate HIV vaccine trial designs. Simulation. 1998;71(4):228-41
  17. Kretzschmar M. Dietz K. The effect of pair formation and variable infectivity on the spread of an infection without recovery. Mathematical Biosciences. 148(1):83-113, 1998