Department of Biostatistics
Data on national health and epidemiology are often collected using complex probability sample designs involving stratification and clustering of units. Faculty in Biostatistics are engaged in the design of such surveys, and in developing methods of analysis. Topics include designs that combine information from samples and administrative sources, and robust model-based analysis methods that improve on standard design-based inferences and deal with problems of missing data.
With the advent of Markov chain Monte Carlo methods in the late 1990s, Bayesian modeling and analysis have played an ever-increasing role in the health sciences and public health. Several researchers in the Department of Biostatistics are contributing to this growth, with important innovations in automated image analysis, in the analysis of ordinal and rank data, and in core statistical methodologies like statistical computing and model assessment. In image analysis, new Bayesian models for spatial processes enable researchers to match anatomically similar regions across image data sets. These methods facilitate statistical analyses of image data across patients. They also promise important diagnostic benefit through quantitative measurements of disease progression over longitudinally matched image data. Ordinal and rank data are common in public health, and Bayesian methods allow such data collected from multiple raters to be combined, and permit the study of rater attributes. Current applications of this methodology include the study of physicians' ability to assign images consistently to disease classes, and the extent to which they agree on the thresholds used in class definitions. Methodological work includes new diagnostics to assess the convergence of numerical algorithms and tools to assess the adequacy of Bayesian models in describing population variability.
Today, nearly every statistical analysis is performed on a computer. Some methods are particularly dependent on intensive computing or custom software. Several of our faculty are involved with this specialty, known as computational statistics. Some faculty analyze massive datasets. For example, in functional magnetic resonance imaging (fMRI) data, a single dataset consists of 100 million elements. Many faculty create software that is used throughout the world, including tools for the analysis of genetic data (e.g. for genotype error detection, and for linkage and association analysis in pedigrees) and brain imaging data (e.g. for nonparametric analysis of PET and fMRI data). Custom software is often needed to handle complex data structures or to provide graphical methods for exploring data. Another area of interest to our faculty is permutation or resampling methods, which allow inferences under weak assumptions, but require analyzing variations on the data thousands of times over. An essential tool for Bayesian modeling is Markov chain Monte Carlo (MCMC). This computationally intensive simulation procedure is used to characterize complex high-dimensional posterior distributions.
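To make the idea of a permutation method concrete, the following is a minimal sketch of a two-sample permutation test for a difference in means. It is an illustration only, not software produced by the department; the function name `permutation_p_value`, its parameters, and the choice of test statistic are all assumptions made for this example.

```python
import random


def permutation_p_value(x, y, n_permutations=5000, seed=0):
    """Two-sided Monte Carlo permutation p-value for a difference in means.

    Hypothetical helper for illustration: group labels are shuffled
    repeatedly and the statistic is recomputed on each shuffled dataset.
    """
    rng = random.Random(seed)
    n_x = len(x)
    observed = abs(sum(x) / n_x - sum(y) / len(y))
    pooled = list(x) + list(y)

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random reassignment of group labels
        perm_x, perm_y = pooled[:n_x], pooled[n_x:]
        stat = abs(sum(perm_x) / n_x - sum(perm_y) / len(perm_y))
        if stat >= observed:
            extreme += 1

    # Add-one correction keeps the Monte Carlo p-value strictly positive.
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    control = [1, 2, 3, 4, 5, 6]
    treated = [11, 12, 13, 14, 15, 16]
    print(permutation_p_value(control, treated))
```

The only assumption behind the test is that the group labels are exchangeable under the null hypothesis, which is why such methods are described as requiring weak assumptions. The cost is the loop: the statistic is recomputed thousands of times.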
Correlated data are common in many health sciences studies, where clustered, hierarchical and spatial data are frequently observed. A common feature of such data is that observations are correlated and statistical analysis requires taking such correlation into account. Examples of clustered data include longitudinal data, familial data, and analysis of multiple outcomes or recurrent events. Hierarchical data are common in multi-center clinical trials and community/school-based intervention studies, where correlation is due to several levels of clustering, such as schools and classes. Spatial data arise in disease mapping, ecology, environmental health and brain imaging, where data are correlated due to spatial proximity. Faculty in Biostatistics are engaged in the design and the development of statistical methodology for such correlated data. Examples of research areas include random effects models, estimating equations, missing data, multiple outcomes, nonparametric/semiparametric regression, measurement error models, and joint modeling of survival and longitudinal outcomes.
In contrast to parametric modeling, where the distribution of the data is assumed known up to a finite-dimensional parameter, nonparametric methods involve an infinite-dimensional parameter. Nonparametric methods are widely used in biomedical research. For example, logrank tests and Kaplan-Meier estimates are standard tools in analyzing censored survival data. A semiparametric model is intermediate between parametric and nonparametric models, and contains finite-dimensional and infinite-dimensional parameters. For example, the widely used Cox model for survival data is semiparametric. Research in semiparametric models has been intense over the past two decades. In both nonparametric and semiparametric modeling, empirical methods and smoothing are two major ways to deal with the infinite-dimensional parameter. Faculty in biostatistics are developing new methodology and applying nonparametric and semiparametric techniques in clinical trials, survival analyses, recurrent events, longitudinal studies, and missing data problems.
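As an illustration of a nonparametric estimate for censored survival data, here is a minimal sketch of the Kaplan-Meier (product-limit) estimator. The function name `kaplan_meier` and the input layout (a list of times and a parallel list of 0/1 event indicators, with 0 meaning censored) are assumptions made for this example.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event occurred.

    times  : observed follow-up times
    events : 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []

    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after t.
        at_risk -= removed
    return curve


if __name__ == "__main__":
    for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
        print(t, round(s, 3))
```

No distributional form is assumed for the survival curve: the estimate is a step function that drops only at observed event times, and censored subjects contribute by remaining in the risk set until they are censored.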
Empirical studies in the social, behavioral, economic, and medical sciences frequently suffer from missing data. For instance, sample surveys often have some individuals who either refuse to participate or do not supply answers to certain questions, and panel surveys or longitudinal studies often have incomplete data due to attrition. Simple approaches to handling the missing data, such as discarding incomplete cases or filling in estimates of the missing values, often yield biased or inefficient statistical inferences. Faculty in Biostatistics work on developing better methods for analyzing missing data, using models for the data and the missing-data mechanism, and computational tools such as the EM algorithm and the Gibbs sampler.
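A small sketch can show one way that filling in estimates of the missing values distorts inference: replacing each missing value with the observed mean leaves the mean unchanged but shrinks the sample variance. The data and the helper name `mean_impute` below are invented for illustration.

```python
import statistics


def mean_impute(values):
    """Replace each None with the mean of the observed values (hypothetical helper)."""
    observed = [v for v in values if v is not None]
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in values]


if __name__ == "__main__":
    # Made-up sample with two missing responses.
    sample = [2, 4, 6, 8, None, None]

    observed = [v for v in sample if v is not None]
    completed = mean_impute(sample)

    print("variance, observed cases:", statistics.variance(observed))
    print("variance, after mean imputation:", statistics.variance(completed))
```

The imputed values sit exactly at the center of the data, so the completed dataset looks less variable than the observed cases suggest, and standard errors computed from it would be too small. Model-based approaches aim to avoid this by accounting for the uncertainty in the missing values.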
In many medical and scientific studies, investigators are interested in analyzing data on time to an event. Applications of this work arise in areas as diverse as medicine, epidemiology, demography and engineering. In such event history data, interest centers on the timing and occurrence of various kinds of events such as repeated infections, recurrences of disease, or sequences of events that occur through the study period. Further generalizations of these problems include issues of competing risks, complex sampling and censoring mechanisms, and incorporation of time-dependent or longitudinal covariates. The analysis of survival data is an area of great strength in this department. Several of our faculty and students are working in this general area and have made important and fundamental contributions through many research articles, books, and applications. A variety of approaches for the analysis of survival data, including frequentist and Bayesian methods, are being developed at Michigan.
Biomedical research, an information-based discipline, is undergoing a major revolution as novel experimental approaches are yielding unprecedented amounts of data. Automation and robotics are becoming integral parts of experimental processes, impacting the way academic and industrial research is carried out. Experimental biology and medicine are becoming increasingly dependent on the extensive application of statistics and information sciences. Bioinformatics, the interdisciplinary field at the intersection of life and quantitative sciences, provides the necessary tools and resources for this endeavor. Modern fundamental and applied research in the life sciences is critically dependent on this relatively new discipline. Faculty in the Department of Biostatistics at the University of Michigan are playing a major role in the development of statistical methods in bioinformatics. In collaboration with medical and scientific researchers at the University of Michigan as well as at other national and international institutions, faculty are developing procedures for the analysis of data such as single nucleotide polymorphism (SNP), gene and protein expression data, and modeling techniques for systems biology.
Recent advances in medical imaging technology allow the measurement of activity in the intact, living human brain. Faculty at UM Biostatistics work closely with researchers throughout the university to study normal brain function and how patients with disease differ from normal controls. For example, investigators in UM Psychology use Functional Magnetic Resonance Imaging (fMRI) to identify brain regions responsible for working memory, the short-term memory used to retain, for example, a grocery list. Investigators in UM Psychiatry use Positron Emission Tomography (PET) and fMRI to study schizophrenic patients, to understand how their reactions to emotionally provocative images differ from those of normal controls. The statistical methods applied in this area are computationally intensive and include Bayesian and massively univariate classical approaches.
Many faculty and students are actively involved in a broad spectrum of cancer research projects and in developing statistical methodology motivated by cancer research. The department has close links with the University of Michigan Comprehensive Cancer Center. Professor Jeremy Taylor is director of the Cancer Center Biostatistics Unit and oversees many of these research activities. Examples of specific projects include the analysis of gene expression microarray data to profile lung, ovarian, and prostate cancer; design and analysis of clinical trials to test new therapeutic agents; analysis of epidemiologic data from a population-based study of African-American men; analysis of animal and human brain magnetic resonance imaging data to obtain early indications of the response to chemotherapy; and analysis of biomarkers for the early detection of cancer. Statistical methodology development is an integral part of these projects. Examples include the development of methods for the analysis of microarray data, methods to combine biomarkers, more efficient designs for phase 1 clinical trials, methods for evaluating surrogate endpoints, methods for missing data problems, and joint models for longitudinal and survival data.
Clinical trial research involves the study of novel therapies in patients with the purpose of identifying the best possible treatment for future use. Our faculty are highly involved in the design, conduct, and analysis of single and multi-center clinical trials in cancer, heart disease, diabetes, hepatitis and pulmonary fibrosis as well as trials in sleep disorders, women's and neonatal health and in the treatment of drug abuse. Our proximity to the excellent University of Michigan Medical School and Comprehensive Cancer Center allows high-quality learning experiences for graduate students interested in clinical research. Our faculty and students are developing statistical methodologies that identify promising therapies more quickly and less expensively. Other research interests include developing strategies for: reducing or eliminating bias due to informative censoring, gaining information from auxiliary variables, incorporating information about quality of life, group sequential monitoring of trials in non-standard situations, flexibly accounting for measurement error in assessing treatment effects, validating use of surrogate endpoints, conducting cross-over trials subject to censoring and determining the maximum tolerated dose while considering both toxicity and efficacy outcomes.
Endocrinology is loosely defined as the study of hormone secretion and the action of hormones on their target cells. Hormones are secreted by specialized cells and concentration levels are controlled by complex feedback mechanisms, some of which are understood and many of which are still a mystery. The secretion of hormones comes in many different "flavors". Some are oscillatory in nature; others are pulsatile or follow a diurnal rhythm. Yet others are controlled through a menstrual, seasonal, or developmental rhythm. Concentration levels can be assayed through blood or urine samples. Investigators at the University of Michigan are interested in many aspects of normal and abnormal control of hormone concentration levels and the complex feedback mechanisms that control these levels. Statistical methods that have been used to study hormone secretion include time series (classical and dynamic models), Bayesian statistics, biomathematical models and non-parametric statistics.
Faculty and students are actively involved in a broad spectrum of methodological and collaborative research in Epidemiology, Health Behavior and Health Education, Environmental Health, and Health Policy and Management. Collaborative projects with Epidemiology faculty include gene microarray data from epidemiological studies, studies of social inequality and psychosocial/economic factors in disease prevention, a national longitudinal study of women's health (SWAN), a cohort study of coronary artery calcification, and studies of reproduction. Collaborative projects include studies of the effects of air pollution and interventions on children with asthma, school-based intervention studies on children with asthma, intervention studies on women with heart disease, longitudinal studies on school dropout and substance abuse, a national drug abuse treatment survey, and national cost analysis of end-stage renal disease. Many areas of methodological research have application to problems in public health, including case-control and two-stage sampling, survival analysis, disease mapping, group randomization trials and spatial analysis.
Driven by advances in genome technology, the Human Genome Project, International HapMap Project, and 1000 Genomes Project, genetics is taking an ever more central role in all the biomedical sciences. These advances have in turn resulted in an explosive increase in the quantity and variety of genetic data. Faculty and students at UM Biostatistics play a leading role in a wide range of genetic studies, often in collaboration with other investigators at the UM Center for Statistical Genetics and around the world. Specific studies seek to identify genes that play a role in human diseases such as diabetes, asthma, psoriasis, cancer, bipolar disorder, and macular degeneration, or that allow discrimination of different disease or tumor subtypes, and to explore human genetic variation. Faculty and students are also working to develop new statistical designs and analytic and computational methods to ensure the efficient generation and use of genetic data from a wide range of genetic studies, including genome-wide association studies; targeted, whole exome, and whole genome sequencing studies; and expression studies. The statistical approaches used in this area include likelihood-based and Bayesian methods, and are often computationally intensive.
There are more than 300,000 people alive today in the United States because they are receiving ongoing replacement treatment for their failed kidneys, hearts, livers, or other organs. Dialysis is the most common treatment for kidney failure, while transplantation is a common treatment for liver, heart, and also kidney failure. Data are available for nearly all patients for many aspects of these diseases due to the availability of federal health insurance for kidney failure patients in the U.S. and the use of a national organ sharing system for all transplanted organs in the U.S. These data systems track many aspects of patient condition, treatment methods, outcomes, and costs through the course of these diseases. Several national studies of organ failure are based at the University of Michigan, with collaborative work carried out by the Departments of Biostatistics, Internal Medicine, Surgery, Epidemiology, and Health Management and Planning. The study of population-based data, rather than controlled experimental outcomes, requires careful attention to research design and control of bias. Many opportunities exist for collaborative efforts to develop statistical methods to deal with these issues in longitudinal data analyses.