Thao Tri Nguyen - Academia.edu

Papers by Thao Tri Nguyen

PRS-on-Spark (PRSoS): a novel, efficient and flexible approach for generating polygenic risk scores

BMC Bioinformatics, Jan 8, 2018

Polygenic risk scores (PRS) describe the genomic contribution to complex phenotypes and consistently account for a larger proportion of variance in outcome than single nucleotide polymorphisms (SNPs) alone. However, there is little consensus on the optimal data input for generating PRS, and existing approaches largely preclude the use of imputed posterior probabilities and strand-ambiguous SNPs i.e., A/T or C/G polymorphisms. Our ability to predict complex traits that arise from the additive effects of a large number of SNPs would likely benefit from a more inclusive approach. We developed PRS-on-Spark (PRSoS), a software implemented in Apache Spark and Python that accommodates different data inputs and strand-ambiguous SNPs to calculate PRS. We compared performance between PRSoS and an existing software (PRSice v1.25) for generating PRS for major depressive disorder using a community cohort (N = 264). We found PRSoS to perform faster than PRSice v1.25 when PRS were generated for a ...
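
The two ideas the abstract leans on, strand-ambiguous SNPs and an additive score built from imputed posterior probabilities, can be illustrated with a minimal Python sketch. This is not the PRSoS implementation: the function names and the `(effect_size, p_het, p_hom_effect)` tuple layout are hypothetical, and the score shown is simply the common weighted sum of expected effect-allele dosages.

```python
# Illustrative sketch only; names and data layout are assumptions, not PRSoS code.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_strand_ambiguous(allele1, allele2):
    """Return True for A/T or C/G SNPs, whose two alleles are complements."""
    return COMPLEMENT.get(allele1.upper()) == allele2.upper()


def expected_dosage(p_het, p_hom_effect):
    """Expected effect-allele count from imputed genotype posterior probabilities."""
    return p_het + 2.0 * p_hom_effect


def additive_score(snps):
    """Sum effect size times expected dosage over the SNPs of one individual.

    Each SNP is a hypothetical tuple: (effect_size, p_het, p_hom_effect).
    """
    return sum(beta * expected_dosage(p_het, p_hom) for beta, p_het, p_hom in snps)
```

A/T and C/G SNPs are flagged because the alleles read the same on either DNA strand, so the strand of the reported effect allele cannot be inferred from the allele letters alone.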
