Date of Completion

9-16-2016

Embargo Period

9-16-2016

Keywords

extreme value analysis, goodness-of-fit testing, regional frequency analysis, r largest order statistics, generalized extreme value, generalized pareto, sequential testing, threshold selection, spatial extremes, R package eva

Major Advisor

Jun Yan

Associate Advisor

Kun Chen

Associate Advisor

Dipak K. Dey

Associate Advisor

Xuebin Zhang

Field of Study

Statistics

Degree

Doctor of Philosophy

Open Access

Open Access

Abstract

Although the fundamental probabilistic theory of extremes is well developed, many practical considerations must be addressed in application. The contribution of this thesis is four-fold. The first concerns the choice of r in the r largest order statistics modeling of extremes. A larger value of r reduces the variance of the estimates, but it may also introduce bias. Current model diagnostics are somewhat restrictive, either requiring prior knowledge about the domain of the distribution or relying on visual tools. We propose a pair of formal goodness-of-fit tests, which can be carried out sequentially to select r. A recently developed adjustment for multiplicity in the ordered, sequential setting is applied to provide error control. Simulation shows that both tests hold their size and have adequate power to detect deviations from the null model.

The second contribution pertains to threshold selection in the peaks-over-threshold approach. Existing methods for threshold selection are informal (visual diagnostics or rules of thumb), computationally expensive, or fail to account for multiple testing. We take a methodological approach, modifying existing goodness-of-fit tests and combining them with appropriate error control for multiplicity to provide an efficient, automated procedure for threshold selection in large-scale problems.
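To illustrate how sequential goodness-of-fit tests can be combined with error control for ordered hypotheses, the following minimal sketch applies the ForwardStop rule to a sequence of p-values. The abstract does not name the adjustment used, so ForwardStop is an assumption here, chosen as one known procedure for this setting. The function name `forward_stop`, the candidate thresholds, and the p-values are all hypothetical, and the sketch assumes thresholds are tested from lowest to highest.

```python
import math


def forward_stop(p_values, alpha=0.05):
    """Number of leading hypotheses rejected by the ForwardStop rule.

    Returns the largest k such that the running mean of -log(1 - p_i)
    over the first k ordered p-values is at most alpha, or 0 if no
    such k exists.
    """
    k_hat = 0
    running = 0.0
    for k, p in enumerate(p_values, start=1):
        # -log(1 - p) is standard exponential when p is uniform on (0, 1);
        # a p-value of exactly 1 contributes an infinite term.
        running += math.inf if p >= 1.0 else -math.log1p(-p)
        if running / k <= alpha:
            k_hat = k
    return k_hat


# Hypothetical inputs: candidate thresholds in increasing order, each with
# the p-value of a goodness-of-fit test for the exceedances above it.
thresholds = [10.0, 12.0, 14.0, 16.0, 18.0]
p_values = [0.001, 0.004, 0.020, 0.450, 0.700]

k_hat = forward_stop(p_values, alpha=0.05)

# The first k_hat thresholds are rejected; the next one is selected.
# If every threshold is rejected, no candidate is adequate.
selected = thresholds[k_hat] if k_hat < len(thresholds) else None
print(k_hat, selected)
```

With these made-up p-values the first three thresholds are rejected and the fourth is selected. The same stopping rule could in principle be applied to any ordered sequence of tests; only the ordering of the hypotheses changes.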

The third combines a theoretical and methodological approach to improve estimation within non-stationary regional frequency models of extremal data. Two alternatives to maximum likelihood (ML) estimation, maximum product spacing (MPS) and a hybrid L-moment/likelihood approach, are incorporated into this framework. In addition to having desirable theoretical properties compared to ML, these alternative estimators are shown through simulation to be more efficient for short record lengths.
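For reference, the standard univariate definition of the MPS estimator is sketched below. The notation is an assumption introduced for illustration and is not taken from the thesis: $F(\cdot;\theta)$ is the model distribution function and $x_{(1)} \le \dots \le x_{(n)}$ are the ordered observations. The regional, non-stationary version used in the thesis may differ in detail.

```latex
\[
  \hat{\theta}_{\mathrm{MPS}}
    = \arg\max_{\theta}\;
      \frac{1}{n+1} \sum_{i=1}^{n+1}
      \log\!\bigl\{ F(x_{(i)};\theta) - F(x_{(i-1)};\theta) \bigr\},
  \qquad
  F(x_{(0)};\theta) \equiv 0, \quad F(x_{(n+1)};\theta) \equiv 1 .
\]
```

Because each spacing is a difference of probabilities and therefore bounded by one, the objective stays bounded above even when the density is unbounded, which is one reason MPS is considered where ML can fail.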

The methodology developed is demonstrated with climate-based applications. Last, an overview of computational issues for extremes is provided, along with a brief tutorial on the R package eva, which improves on the functionality of existing extreme value software and contributes new implementations.