Title

Approximations and inequalities for discrete scan statistics

Date of Completion

January 1998

Keywords

Statistics

Degree

Ph.D.

Abstract

Let $X_{1},\ldots,X_{n}$ be a sequence of independent and identically distributed nonnegative integer-valued random variables. For $2 \le m \le n$, consider the moving sums of $m$ consecutive observations. The discrete scan statistic is defined as the maximum value of these moving sums. When its distribution is taken conditional on the sum of all the observations, we refer to it as the conditional scan statistic.

Discrete scan statistics are used for testing uniformity against a clustering alternative that specifies an increased incidence of events in a connected sub-region. They have applications in many areas of science, including biology, epidemiology, minefield detection and reliability. To implement a testing procedure based on scan statistics, accurate approximations or inequalities are needed for their distributions, since exact results are usually not available.

In this thesis, accurate product-type approximations, Bonferroni-type inequalities, and Poisson-type and compound Poisson approximations are derived for the distribution of the conditional and unconditional discrete scan statistics. From these, accurate approximations for the expected size and the standard deviation of the scan statistics are also derived. Moreover, these results for conditional and unconditional discrete scan statistics are extended to the circular case. Numerical results and simulation studies are presented to evaluate the performance of these approximations.

Let $Y_{i,j}$, $1 \le i \le n_{1}$, $1 \le j \le n_{2}$, be an array of independent and identically distributed nonnegative integer-valued random variables. The observation $Y_{i,j}$ denotes the number of events that have occurred at location $(i, j)$ in a two-dimensional rectangular region. For $2 \le m_{i} \le n_{i}$, $i = 1, 2$, the two-dimensional discrete scan statistic is defined as the maximum number of events in any rectangular subregion of $m_{1} \times m_{2}$ adjacent locations.

In this thesis, product-type approximations, Bonferroni-type inequalities, and Poisson-type and compound Poisson approximations are derived for the distribution of the two-dimensional discrete scan statistic. Moreover, approximations for the expected size of the scan statistic and its standard deviation are derived from these approximations. Numerical results and a simulation study are presented to evaluate the performance of these approximations.
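
As an illustration of the definitions only, the following minimal Python sketch computes the observed value of the linear, circular and two-dimensional discrete scan statistics from data. It does not implement any of the approximations or inequalities derived in the thesis. The function names `scan_statistic` and `scan_statistic_2d` are hypothetical, and the circular case is assumed to mean that a window of $m$ observations may wrap from $X_{n}$ back to $X_{1}$.

```python
from itertools import accumulate


def scan_statistic(x, m, circular=False):
    """Maximum sum of m consecutive observations in x (hypothetical helper)."""
    n = len(x)
    if not 2 <= m <= n:
        raise ValueError("need 2 <= m <= n")
    # Assumption: in the circular case a window may wrap around the end,
    # so the first m - 1 observations are appended to the sequence.
    data = list(x) + (list(x[:m - 1]) if circular else [])
    # prefix[k] holds the sum of the first k entries of data.
    prefix = [0] + list(accumulate(data))
    return max(prefix[i + m] - prefix[i] for i in range(len(data) - m + 1))


def scan_statistic_2d(y, m1, m2):
    """Maximum number of events in any m1 x m2 block of adjacent cells of y."""
    n1, n2 = len(y), len(y[0])
    if not (2 <= m1 <= n1 and 2 <= m2 <= n2):
        raise ValueError("need 2 <= m_i <= n_i for i = 1, 2")
    # p[i][j] holds the sum of y over rows 0..i-1 and columns 0..j-1.
    p = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for i in range(n1):
        for j in range(n2):
            p[i + 1][j + 1] = y[i][j] + p[i][j + 1] + p[i + 1][j] - p[i][j]
    return max(
        p[i + m1][j + m2] - p[i][j + m2] - p[i + m1][j] + p[i][j]
        for i in range(n1 - m1 + 1)
        for j in range(n2 - m2 + 1)
    )
```

For example, with the observations `[3, 0, 0, 1, 2]` and `m = 2`, the linear windows never contain both the first and the last observation, whereas the circular statistic also considers the wrapped window made of the last and first observations.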