Abstract (English) |
This paper uses the receiver operating characteristic (ROC) curve to determine which biomarker predicts disease better. We consider data in which both a patient's covariates and disease status are time dependent. Predictive accuracy for such data is usually assessed by the area under the ROC curve (AUC). However, because the covariates are time dependent, the AUC may vary across time points, which makes it difficult to draw inferences or to decide which biomarker predicts better. We therefore adopt the volume under the ROC surface (VUS) instead: the larger the volume, the better the disease prediction. We estimate the ROC curve using nearest neighbor estimation for a bivariate distribution. In the simulation study, we generate two biomarkers and ask which one predicts better. The AUC values show that biomarker one is better than biomarker two. We then compare biomarker one with a linear combination of biomarkers one and two; the AUC values show that the linear combination predicts better, and the VUS leads to the same conclusion. In the real data analysis, two examples are given. First, we ask which of two biomarkers, CD4 count and viral load, better predicts AIDS; the AUC values show that CD4 count is better than viral load. Second, we ask which of three biomarkers, the total number of eggs laid during the lifetime, the time of maximum egg laying, and the number of eggs laid daily, has more influence on medfly lifetime. The AUC values favor the total number of eggs laid during the lifetime and the number of eggs laid daily, whereas by the VUS the number of eggs laid daily has more influence on medfly lifetime. |
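The contrast between a time-varying AUC and a single VUS summary can be illustrated with a minimal sketch. It is not the estimator used in this work: it ignores censoring and replaces the nearest neighbor estimator with a plain empirical (Mann-Whitney) AUC at each time point. The function names, the toy data, the convention that larger marker values indicate disease, and the normalization of the volume by the length of the time window are all assumptions made for illustration.

```python
def empirical_auc(cases, controls):
    """Mann-Whitney estimate of P(case marker > control marker); ties count 1/2."""
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def volume_under_surface(times, aucs):
    """Trapezoidal integral of AUC(t) over time, divided by the time span.

    Dividing by the span (an assumption here) keeps the summary in [0, 1].
    """
    area = 0.0
    for i in range(1, len(times)):
        area += 0.5 * (aucs[i] + aucs[i - 1]) * (times[i] - times[i - 1])
    return area / (times[-1] - times[0])


def summarize(marker_by_time, status_by_time, times):
    """Return the AUC at each time point and the resulting volume."""
    aucs = []
    for markers, status in zip(marker_by_time, status_by_time):
        cases = [m for m, d in zip(markers, status) if d == 1]
        controls = [m for m, d in zip(markers, status) if d == 0]
        aucs.append(empirical_auc(cases, controls))
    return aucs, volume_under_surface(times, aucs)


# Hypothetical toy data: 4 subjects observed at 3 time points.
times = [1.0, 2.0, 3.0]
status = [[1, 0, 0, 0], [1, 0, 1, 0], [1, 0, 1, 1]]  # 1 = diseased at that time
marker_a = [[3.0, 2.5, 1.0, 0.5], [3.2, 2.0, 2.8, 0.4], [3.5, 1.0, 3.0, 2.9]]
marker_b = [[1.0, 2.0, 0.5, 3.0], [2.5, 1.0, 0.2, 3.0], [2.0, 1.5, 1.0, 3.0]]

aucs_a, vus_a = summarize(marker_a, status, times)
aucs_b, vus_b = summarize(marker_b, status, times)
print("marker A:", aucs_a, vus_a)
print("marker B:", aucs_b, vus_b)
```

In this toy example the AUC of marker B changes from one time point to the next, so a comparison at a single time point depends on which time is chosen, while the volume gives one number per marker. For a marker such as CD4 count, where lower values indicate worse prognosis, the marker would need to be negated before applying this sketch.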
References |
Akritas, M. G. (1994). “Nearest neighbor estimation of a bivariate distribution under random censoring.” Annals of Statistics, 22, 1299-1327.
Carey, J. R., Liedo, P., Müller, H. G., Wang, J. L. and Chiou, J. M. (1998). “Relationship of age patterns of fecundity to mortality, longevity, and lifetime reproduction in a large cohort of Mediterranean fruit fly females.” Journal of Gerontology: Biological Sciences, 53, 245-251.
Cleveland, W. S. (1979). “Robust Locally Weighted Regression and Smoothing Scatterplots.” Journal of the American Statistical Association, 74, 829-836.
Cox, D. R. and Oakes, D. (1984). Analysis of Survival Data, Chapman and Hall, London, New York.
Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). “Maximum Likelihood from Incomplete Data via the EM Algorithm.” Journal of the Royal Statistical Society, Series B (Methodological), 39, 1-38.
Satten, G. A., Datta, S. and Robins, J. (2001). “Estimating the marginal survival function in the presence of time dependent covariates.” Statistics and Probability Letters, 54, 397-403.
Hanley, J. A. (1989). “Receiver operating characteristic (ROC) methodology: the state of the art.” Critical Reviews in Diagnostic Imaging, 29, 307-335.
Heagerty, P. J., Lumley, T. and Pepe, M. S. (2000). “Time-dependent ROC curves for censored survival data and a diagnostic marker.” Biometrics, 56, 337-344.
Henderson, R., Diggle, P. and Dobson, A. (2000). “Joint modeling of longitudinal measurements and event time data.” Biostatistics, 1, 465-480.
Jones, M. C. (1990). “The performance of kernel density functions in kernel distribution function estimation.” Statistics and Probability Letters, 9, 129-132.
Jones, M. C. and Sheather, S. J. (1991). “Using non-stochastic terms to advantage in kernel-based estimation of integrated squared density derivatives.” Statistics and Probability Letters, 11, 511-514.
Hsieh, F., Tseng, Y. K. and Wang, J. L. (2006). “Joint Modeling of Survival and Longitudinal Data: Likelihood Approach Revisited.” Biometrics, 62, 1037-1043.
Tseng, Y. K., Hsieh, F. and Wang, J. L. (2005). “Joint modeling of accelerated failure time and longitudinal data.” Biometrika, 92, 587-603.
Tsiatis, A. A., DeGruttola, V. and Wulfsohn, M. S. (1995). “Modeling the Relationship of Survival to Longitudinal Data Measured with Error. Applications to Survival and CD4 Counts in Patients with AIDS.” Journal of the American Statistical Association, 90, 27-37.
Wulfsohn, M. S. and Tsiatis, A. A. (1997). “A Joint Model for Survival and Longitudinal Data Measured with Error.” Biometrics, 53, 330-339.
Zeng, D. and Lin, D. Y. (2007a). “Maximum Likelihood Estimation in Semiparametric Regression Models with Censored Data (with Discussion).” Journal of the Royal Statistical Society, Series B, 69, 507-564.
Zweig, M. H. and Campbell, G. (1993). “Receiver-operator characteristic plots: a fundamental evaluation tool in clinical medicine.” Clinical Chemistry, 39, 561-577. |