dc.contributor.author: Dmitrii, Zholud
dc.date.accessioned: 2011-10-13T12:54:33Z
dc.date.available: 2011-10-13T12:54:33Z
dc.date.issued: 2011-10-13
dc.identifier.isbn: 978-91-628-8354-6
dc.identifier.uri: http://hdl.handle.net/2077/27833
dc.description.abstract: This thesis presents results in Extreme Value Theory with applications to High-Throughput Screening and Bioinformatics. The methods described here, however, are applicable to statistical analysis of huge datasets in general. The main results are covered in four papers. The first paper develops novel methods to handle false rejections in High-Throughput Screening experiments where testing is done at extreme significance levels, with low degrees of freedom, and when the true null distribution may differ from the theoretical one. We introduce efficient and accurate estimators of False Discovery Rate and related quantities, and provide methods of estimation of the true null distribution resulting from data preprocessing, as well as techniques to compare it with the theoretical null distribution. Extreme Value Statistics provides a natural analysis tool: a simple polynomial model for the tail of the distribution of p-values. We exhibit the properties of the estimators of the parameters of the model, and point to model checking tools, both for independent and dependent data. The methods are tried out on two large scale genomic studies and on an fMRI brain scan experiment. The second paper gives a strict mathematical basis for the above methods. We present asymptotic formulas for the distribution tails of probably the most commonly used statistical tests under non-normality, dependence, and non-homogeneity, and derive bounds on the absolute and relative errors of the approximations. In papers three and four we study high-level excursions of the Shepp statistic for the Wiener process and for a Gaussian random walk. The application areas include finance and insurance, and sequence alignment scoring and database searches in Bioinformatics. [sv]
dc.language.iso: eng [sv]
dc.relation.haspart: I. Rootzén, H. and Zholud, D.S. (2011). Tail estimation methods for the number of false positives in high-throughput testing. Submitted. [sv]
dc.relation.haspart: II. Zholud, D.S. (2011). Tail approximations for the Student t-, F-, and Welch statistics for non-normal and not necessarily i.i.d. random variables. Submitted. [sv]
dc.relation.haspart: III. Zholud, D.S. (2009). Extremes of the Shepp statistic for a Gaussian random walk. Extremes, 12(1):1-17. DOI: 10.1007/s10687-008-0065-3 [sv]
dc.relation.haspart: IV. Zholud, D.S. (2008). Extremes of the Shepp statistic for the Wiener process. Extremes, 11(4):339-351. DOI: 10.1007/s10687-008-0061-7 [sv]
dc.subject: Extreme Value Statistics, High-Throughput Screening, HTS, Bioinformatics, analysis of huge datasets, quality control, correction of theoretical p-values, comparison of pre-processing methods, SmartTail, estimation of False Discovery Rates, test power, distribution tail, high level excursions, quantile estimation, multiple testing, Student t-test, Welch statistic, small sample sizes, F-test, Wiener process, Gaussian random walk, Shepp statistic, limit theorems, exotic options. [sv]
dc.title: Extreme Value Analysis of Huge Datasets: Tail Estimation Methods in High-Throughput Screening and Bioinformatics [sv]
dc.type: Text
dc.type.svep: Doctoral thesis [eng]
dc.gup.mail: dmitrii@zholud.com [sv]
dc.type.degree: Doctor of Philosophy [sv]
dc.gup.origin: Göteborgs universitet. Naturvetenskapliga fakulteten [sv]
dc.gup.department: Department of Mathematical Sciences ; Institutionen för matematiska vetenskaper [sv]
dc.gup.defenceplace: Thursday, 3 November 2011, 10:15, Lecture hall Pascal, Matematiska Vetenskaper, Chalmers Tvärgata 3 [sv]
dc.gup.defencedate: 2011-11-03
dc.gup.dissdb-fakultet: MNF

