Justine Shults, PhD
Professor of Biostatistics
Dr. Shults is currently the principal investigator of the NIH-funded project “Longitudinal Analysis for Diverse Populations” (R01CA096885). Her work on this project centers on quasi-least squares (QLS), which is based on Liang and Zeger’s (1986) generalized estimating equation (GEE) approach. She participated in the three papers in which QLS* was developed.
Shults (Doctoral Dissertation, “The Analysis of Unbalanced and Unequally Spaced Data using Quasi-Least Squares”, 1996) and Shults and Chaganty (Biometrics, 1998) extended stage one by obtaining estimates that yield a positive definite correlation matrix for several correlation structures (including the AR(1)) and for unbalanced data. For example, the closed-form estimate for the equicorrelated structure in Chaganty (JSPI, 1997) holds only for balanced data. Shults (1996) obtained an estimating equation for unbalanced data and proved that this equation always has a unique solution in the feasible region (the interval on which the matrix is positive definite). For unbalanced data, this estimate can be obtained using the bisection method.
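The bisection step can be illustrated with a minimal sketch. It assumes that the stage-one estimate minimizes the sum over subjects of Z_i' R_i(alpha)^{-1} Z_i, where Z_i holds the standardized residuals for subject i and R_i is the equicorrelated matrix; differentiating that sum gives the equation coded below. The equation is derived here for illustration and is not quoted from the cited papers, and the function names and residual values are hypothetical.

```python
def stage_one_equation(alpha, clusters):
    """Estimating function for the equicorrelated structure.

    Assumption (not quoted from the cited papers): this is (1 - alpha)^2 times
    the derivative of sum_i Z_i' R_i(alpha)^{-1} Z_i with respect to alpha.
    """
    total = 0.0
    for z in clusters:
        n = len(z)
        sum_sq = sum(v * v for v in z)  # Z_i' Z_i
        sq_sum = sum(z) ** 2            # (sum of the residuals)^2
        denom = (1.0 + (n - 1) * alpha) ** 2
        total += sum_sq - sq_sum * (1.0 + (n - 1) * alpha ** 2) / denom
    return total


def solve_stage_one(clusters, tol=1e-10):
    """Bisection over the feasible interval (-1/(n_max - 1), 1)."""
    n_max = max(len(z) for z in clusters)
    if n_max < 2:
        raise ValueError("need a subject with two or more observations")
    eps = 1e-8  # stay strictly inside the open interval
    lo, hi = -1.0 / (n_max - 1) + eps, 1.0 - eps
    if stage_one_equation(lo, clusters) >= 0.0:
        raise ValueError("no sign change on the feasible interval")
    if stage_one_equation(hi, clusters) <= 0.0:
        raise ValueError("no sign change on the feasible interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if stage_one_equation(mid, clusters) < 0.0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies at or to the left of mid
    return 0.5 * (lo + hi)


# Hypothetical standardized residuals for three subjects of unequal size.
residuals = [[0.5, 1.2, -0.3], [1.0, 0.8], [-0.7, -1.1, -0.2, -0.9]]
alpha_hat = solve_stage_one(residuals)
print(alpha_hat)
```

Under this derivation the function is increasing in alpha on the feasible interval, so the bracket holds a single root, which is consistent with the uniqueness result described above.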
However, the stage one QLS estimate of the correlation parameter is typically not consistent, even if the correlation structure is correctly specified. Chaganty and Shults (JSPI, 1999) therefore introduced a second stage of QLS that provides a solution to an unbiased estimating equation for the correlation parameter. For further discussion of QLS, including a comparison of stage one with stage two and a comparison with other approaches, see Sun et al. (2006).
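For the AR(1) structure the two stages can be sketched in closed form. The example assumes equally spaced measurements, standardized residuals that are already computed, and a stage one that minimizes the sum over subjects of Z_i' R_i(alpha)^{-1} Z_i. The formulas are worked out here from that assumption for illustration and are not quoted from the cited papers; the function name and residual values are hypothetical.

```python
import math


def qls_ar1(clusters):
    """Return (stage-one alpha, stage-two rho) for an AR(1) structure.

    Assumptions (not quoted from the cited papers): measurements are equally
    spaced and each inner list holds one subject's standardized residuals in
    time order.
    """
    f_a = 0.0  # sum of z_j^2 + z_{j+1}^2 over adjacent pairs
    f_b = 0.0  # twice the sum of z_j * z_{j+1} over adjacent pairs
    for z in clusters:
        for left, right in zip(z, z[1:]):
            f_a += left * left + right * right
            f_b += 2.0 * left * right
    if f_a == 0.0:
        raise ValueError("no adjacent pairs with nonzero residuals")
    if f_b == 0.0:
        return 0.0, 0.0
    # Stage one: the root of f_b*a^2 - 2*f_a*a + f_b = 0 with |a| <= 1.
    disc = max(0.0, f_a * f_a - f_b * f_b)
    alpha = (f_a - math.sqrt(disc)) / f_b
    # Stage two: solve the trace-based equation evaluated at the stage-one
    # value, which for AR(1) reduces to this expression.
    rho = 2.0 * alpha / (1.0 + alpha * alpha)
    return alpha, rho


# Hypothetical standardized residuals for three subjects of unequal size.
residuals = [[1.0, 0.8, 0.5], [-0.6, -0.9], [0.3, 0.7, 0.4, 0.1]]
alpha_hat, rho_hat = qls_ar1(residuals)
print(alpha_hat, rho_hat)
```

In this sketch the stage-two value is larger in magnitude than the stage-one value, which illustrates the kind of adjustment the second stage makes.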
Dr. Shults’ current focus is on extending QLS to allow implementation of relatively complex patterned correlation structures that have not previously been applied with GEE. For example, in a recent paper with Ardythe Morrow, PhD (Biometrics 58: 521-530, 2002), the authors considered an international trial to promote exclusive breast-feeding in San-Pedro Martir, Mexico City. This study employed cluster randomization and followed mothers over time to determine whether they were exclusively breast-feeding their infants. As a result of the study design, there were two potential sources of correlation among the binary outcomes (breast-feeding yes/no) from this trial: two measurements could be more similar if they were collected within the same randomization group or on the same subject. The authors adjusted for these two sources of correlation by directly modeling the association via a patterned correlation matrix constructed as the Kronecker product of two correlation matrices. Dr. Shults recently extended this approach to data with more than two sources of correlation in a manuscript with Melicia Whitt, PhD, and Shiriki Kumanyika, PhD (Analysis of data with multiple sources of correlation in the framework of generalized estimating equations. Statistics in Medicine 23: 3209-3226, 2004). Dr. Shults also recently worked with Carissa Mazurick, MS, and Richard Landis, PhD, on implementation of a correlation structure appropriate for analysis of multiple bouts of repeated measurements when the separation in time between bouts is large relative to the within-bout separation, in a manuscript that is in press in Statistics in Medicine (2006).
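A Kronecker product structure of this kind can be sketched as follows. The text does not say which two correlation matrices were combined, so the choice here of an equicorrelated matrix for one source and an AR(1) matrix for the other is an assumption made for illustration, as are the parameter values and function names.

```python
def exchangeable(n, alpha):
    """n x n equicorrelated matrix: 1 on the diagonal, alpha elsewhere."""
    return [[1.0 if i == j else alpha for j in range(n)] for i in range(n)]


def ar1(n, rho):
    """n x n AR(1) matrix: entry (i, j) is rho raised to |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


def kronecker(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


# Hypothetical example: 2 units sharing one source of correlation,
# each measured at 3 occasions.
first_source = exchangeable(2, 0.2)
second_source = ar1(3, 0.5)
combined = kronecker(first_source, second_source)

for row in combined:
    print([round(value, 3) for value in row])
```

Each entry of the combined matrix is the product of one entry from each component, so two measurements that differ on both sources are correlated at the product of the two component correlations.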
*A brief description of QLS: The method of quasi-least squares (QLS) is a two-stage approach in the framework of generalized estimating equations (GEE). Chaganty (JSPI, 1997) described stage one of QLS for an equal number of observations per subject (balanced data) and noted that “extensions of our results for unbalanced data… will appear elsewhere”. For example, the proof of feasibility for the first-order autoregressive AR(1) correlation structure (Appendix A) is based on the Sturm sequence for tridiagonal matrices and does not generalize readily to unbalanced data.