Date of Completion

10-21-2015

Embargo Period

12-23-2016

Keywords

C-log-log link, Cure rate models, Maximum likelihood estimate (MLE), Logit, Probit, SEER breast cancer data, t-link, Detection Limit, Misclassification, Negative Predictive Value (NPV), Positive Predictive Values (PPV), Hepatitis C Virus data

Major Advisor

Ming-Hui Chen

Associate Advisor

Lynn Kuo

Associate Advisor

Jun Yan

Field of Study

Statistics

Degree

Doctor of Philosophy

Open Access

Open Access

Abstract

Discrete survival data are routinely encountered in many fields of study. There are two common types of discrete survival data. The first type is derived discrete data, in which survival times are originally continuous but are recorded in discrete form by grouping or rounding. The second type is intrinsically discrete data. This dissertation research is motivated by these two types of discrete survival data in clinical trials.

We develop a class of proportional exponentiated link transformed hazards (ELTH) models and a class of proportional exponentiated link transformed survival (ELTS) models. We examine the role of links in fitting discrete survival data and estimating regression coefficients. We also characterize the conditions for improper survival functions and the conditions for the existence of the maximum likelihood estimates under the proposed ELTH models. An extensive simulation study is conducted to examine the empirical performance of the parameter estimates under the Cox proportional hazards model when discrete survival times are treated as continuous, and the performance of the model comparison criteria AIC and BIC in determining links and baseline hazards. A SEER breast cancer dataset is analyzed in detail to further demonstrate the proposed methodology.
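As a rough illustration of how the choice of link enters a discrete-time survival model, the sketch below maps a linear predictor to a discrete hazard through the inverse logit, probit, or complementary log-log link and accumulates the survival function as a product of one minus the hazards. This is the standard discrete-hazard construction, not the ELTH or ELTS models themselves, whose exact form is not given here. The function names, the baseline values, and the linear predictor value are hypothetical.

```python
import math

def inv_logit(eta):
    """Inverse logit link: maps a linear predictor to a hazard in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def inv_probit(eta):
    """Inverse probit link: standard normal CDF evaluated at eta."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def inv_cloglog(eta):
    """Inverse complementary log-log link."""
    return 1.0 - math.exp(-math.exp(eta))

def discrete_survival(baseline, x_beta, inv_link):
    """Survival S(t_j) = prod_{k <= j} (1 - h_k) for a discrete-time model.

    baseline : per-interval baseline terms on the link scale (hypothetical values)
    x_beta   : covariate effect x'beta, assumed constant over time
    inv_link : one of the inverse link functions above
    """
    surv = []
    s = 1.0
    for alpha in baseline:
        hazard = inv_link(alpha + x_beta)  # conditional probability of the event in this interval
        s *= 1.0 - hazard
        surv.append(s)
    return surv

# Hypothetical baseline terms for three time intervals and one covariate effect.
baseline = [-2.0, -1.5, -1.0]
for name, link in [("logit", inv_logit), ("probit", inv_probit), ("cloglog", inv_cloglog)]:
    print(name, [round(v, 4) for v in discrete_survival(baseline, 0.3, link)])
```

With the same baseline terms and covariate effect, the three links give different hazards and therefore different survival curves, which is why link selection matters when fitting discrete survival data.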

Previous research has shown that outcome misclassification can bias estimation of the survival function under standard survival methods. We develop methods to accurately estimate the survival function when the diagnostic tool used to measure the disease outcome is not perfectly sensitive and specific. Since this diagnostic tool is not the gold standard, the true outcomes cannot be observed. Our method uses the negative predictive value (NPV) and the positive predictive value (PPV) to construct a bridge between the mismeasured outcomes and the true outcomes. We formulate an exact relationship between the true and the observed survival functions in terms of time-varying NPV and PPV. We specify models for the NPV and PPV that depend only on parameters that can be easily estimated from a fraction of the observed data. Furthermore, we extend the approach and conduct an extensive study to accurately estimate the latent survival function under the assumption that the underlying disease process follows a stochastic process. We further examine the performance of our method by applying it to the VIRAHEP-C data.
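To make the idea of bridging observed and true outcomes concrete, here is a minimal sketch that assumes NPV(t) is the probability of being truly event-free at time t given an observed event-free status, and PPV(t) is the probability of a true event by time t given an observed event. Under those assumed definitions, the law of total probability gives S_true(t) = NPV(t) S_obs(t) + (1 - PPV(t)) (1 - S_obs(t)). This identity is an illustration and may differ from the exact formulation in the dissertation. The function name and all numbers are hypothetical.

```python
def true_survival(s_obs, npv, ppv):
    """Recover the true survival curve from the observed one.

    s_obs : observed survival probabilities at each time point
    npv   : time-varying negative predictive values (assumed definition, see text)
    ppv   : time-varying positive predictive values (assumed definition, see text)
    """
    if not (len(s_obs) == len(npv) == len(ppv)):
        raise ValueError("s_obs, npv and ppv must have the same length")
    # Truly event-free = observed event-free and correct, or observed event and wrong.
    return [n * s + (1.0 - p) * (1.0 - s) for s, n, p in zip(s_obs, npv, ppv)]

# Hypothetical observed survival and predictive values at three time points.
s_obs = [0.90, 0.80, 0.65]
npv = [0.95, 0.90, 0.88]
ppv = [0.75, 0.70, 0.72]
print([round(v, 4) for v in true_survival(s_obs, npv, ppv)])
```

When the diagnostic tool is perfect (NPV and PPV both equal to 1 at every time point), the function returns the observed survival curve unchanged.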
