
Resampling in network modeling of high-dimensional genomic data

Abstract
Network modeling is an effective approach for interpreting high-dimensional data sets for which a sparse dependence structure can be assumed. Genomic data is a challenging and important example. In genomics, network modeling aids the discovery of biological mechanistic relationships and therapeutic targets. Network modeling methods are more useful when the networks they produce are accompanied by a reliability estimate. Furthermore, to produce reliable networks, methods need to have low sensitivity to occasional outlier observations. In this thesis, the problem of robust network modeling with error control in terms of the false discovery rate (FDR) of edges is studied. As background, existing types of genomic data are described and the challenges of high-dimensional statistics and multiple hypothesis testing are explained. Methods for estimating sparse dependency structures in single samples of genomic data are reviewed. Such methods have a regularization parameter that controls the sparsity of the estimates. Methods that are based on a single sample are highly sensitive to outlier observations and to the value of the regularization parameter. We introduce the method ROPE (resampling of penalized estimates), which makes robust network estimates by using many data subsamples and several levels of regularization. ROPE controls edge FDR at a specified level by modeling edge selection counts as coming from an overdispersed beta-binomial mixture distribution. Previously existing resampling-based methods for network modeling are reviewed. ROPE was evaluated on simulated data and on gene expression data from cancer patients. The evaluation shows that ROPE outperforms state-of-the-art methods in terms of accuracy of FDR control and robustness. Robust FDR control makes it possible to make a principled decision about how many network links to use in subsequent analysis steps.
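The resampling idea described in the abstract (many data subsamples, several regularization levels, and a count of how often each edge is selected) can be illustrated with a small sketch. This is not the ROPE implementation: the abstract does not name the penalized estimator, so the sketch assumes a simple stand-in in which an edge is selected when its absolute sample correlation exceeds the level `lam` (equivalently, when the soft-thresholded correlation is nonzero). The function names, the half-size subsamples, and the toy data are all assumptions made for illustration, and the beta-binomial mixture step used for FDR control is not shown.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def edge_selection_counts(rows, lambdas, n_subsamples=50, seed=0):
    """Count, per regularization level, how often each edge is selected.

    Stand-in estimator (an assumption, not the thesis's method): edge (i, j)
    is selected at level lam when |correlation| > lam, i.e. when the
    soft-thresholded correlation is nonzero.
    """
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    pairs = [(i, j) for i in range(p) for j in range(i + 1, p)]
    counts = {lam: {e: 0 for e in pairs} for lam in lambdas}
    for _ in range(n_subsamples):
        # Draw half of the observations without replacement.
        sub = rng.sample(rows, n // 2)
        cols = list(zip(*sub))
        for i, j in pairs:
            r = abs(pearson(cols[i], cols[j]))
            for lam in lambdas:
                if r > lam:
                    counts[lam][(i, j)] += 1
    return counts


# Toy data: variables 0 and 1 are dependent, variable 2 is independent noise.
data_rng = random.Random(1)
rows = []
for _ in range(60):
    x0 = data_rng.gauss(0, 1)
    rows.append((x0, x0 + data_rng.gauss(0, 0.5), data_rng.gauss(0, 1)))

counts = edge_selection_counts(rows, lambdas=[0.3, 0.6])
for lam, per_edge in counts.items():
    print(lam, per_edge)
```

In this toy setting the true edge between variables 0 and 1 should be selected in nearly every subsample, while the spurious edges are selected rarely and less often as `lam` grows. Selection counts of this kind are the quantity the abstract says ROPE models with a beta-binomial mixture.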
Publisher
University of Gothenburg and Chalmers University of Technology
URL:
http://hdl.handle.net/2077/52101
Collections
  • Licentiate Thesis / Licentiatuppsatser Institutionen för matematiska vetenskaper
File(s)
Licentiate thesis (2.717 MB)
Date
2017
Author
Kallus, Jonatan
Keywords
high-dimensional data
sparsity
model selection
bootstrap
genomics
graphical modeling
Publication type
licentiate thesis
Language
eng

DSpace software copyright © 2002-2016  DuraSpace
gup@ub.gu.se | Technical support