Title

Variable window scan statistics

Date of Completion

January 2005

Keywords

Statistics

Degree

Ph.D.

Abstract

In the first part of this thesis, variable window scan statistics are derived for independent and identically distributed (i.i.d.) 0-1 Bernoulli trials. Both one- and two-dimensional cases, as well as conditional and unconditional cases, are treated. The advantage of using a variable window scan statistic, as opposed to a single fixed window scan statistic, is that it is more sensitive in detecting a change in the underlying distribution of the observed data. We show how to derive simple approximations for the significance level of these testing procedures and present numerical results to evaluate their performance.

We also introduce a maximum scan score-type statistic, based on several windows, for testing the null hypothesis that the observations are i.i.d. according to a specified distribution against the alternative that the observations cluster within a window of unknown length. This statistic can be considered a variable window scan statistic. Approximations for the significance level of this statistic are derived for i.i.d. 0-1 Bernoulli trials and for i.i.d. uniform observations on the interval [0, 1). The advantage of using a maximum scan score-type statistic, rather than a single fixed window scan statistic, is that it is more effective in detecting window-type clustering of observations.

For detecting a local clustering of events generated by a discrete nonhomogeneous process in a two-dimensional rectangular region, we propose to employ Bayesian-type scan statistics. The data are modeled via two-stage hierarchical Bayesian models. Two Bayesian variable window scan statistics are investigated to test the null hypothesis that the observed events follow a specified two-stage hierarchical model against an alternative that indicates a local increase in the mean (clustering) of events in a sub-region. Both procedures are based on a sequence of Bayes factors and their p-values, which are generated by simulating posterior samples of the parameters under the null and alternative hypotheses. The posterior samples are generated by collapsed Gibbs sampling in conjunction with a localized Metropolis-Hastings algorithm. Numerical results are presented to evaluate the performance of these variable window scan statistics.
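
To make the idea of a variable window scan statistic concrete, the following is a minimal illustrative sketch in Python for the one-dimensional, unconditional Bernoulli case. It is not the method of the thesis: the thesis derives analytic approximations for the significance level, whereas this sketch assumes a simple Monte Carlo calibration and assumes that the per-window results are combined by taking the minimum p-value. The names `scan_statistic` and `min_pvalue_scan_test`, and the parameters `windows`, `p0`, `n_sim`, and `seed`, are hypothetical and chosen only for illustration.

```python
import random


def scan_statistic(x, m):
    """Largest number of ones in any window of m consecutive trials."""
    window = sum(x[:m])
    best = window
    for i in range(m, len(x)):
        # Slide the window one step: add the new trial, drop the oldest.
        window += x[i] - x[i - m]
        best = max(best, window)
    return best


def min_pvalue_scan_test(x, windows, p0, n_sim=500, seed=0):
    """Monte Carlo test combining several window sizes (illustrative only).

    Assumption: per-window p-values are combined by their minimum, and the
    null distribution of that minimum is estimated from the same simulated
    i.i.d. Bernoulli(p0) sequences.
    """
    rng = random.Random(seed)
    n = len(x)

    # Scan statistics of simulated null sequences, one row per simulation.
    sims = []
    for _ in range(n_sim):
        seq = [1 if rng.random() < p0 else 0 for _ in range(n)]
        sims.append([scan_statistic(seq, m) for m in windows])

    def pvalues(stats):
        # Empirical upper-tail probability for each window size.
        return [
            sum(s[j] >= stats[j] for s in sims) / n_sim
            for j in range(len(windows))
        ]

    observed = [scan_statistic(x, m) for m in windows]
    observed_min_p = min(pvalues(observed))

    # Calibrate the minimum p-value against its simulated null distribution.
    null_min_p = [min(pvalues(s)) for s in sims]
    overall_p = sum(q <= observed_min_p for q in null_min_p) / n_sim
    return observed, overall_p
```

For example, under these assumptions, calling `min_pvalue_scan_test([0] * 40 + [1] * 10 + [0] * 50, windows=[5, 10, 20], p0=0.1)` scans a sequence of 100 trials containing a run of ten successes with windows of length 5, 10, and 20, and returns the observed scan statistics together with an overall Monte Carlo p-value.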