High performance pipeline for the calculation of Polygenic Risk Scores

Arina Nostaeva1, Tatiana Shashkova2, Sodbo Sharapov3, Yakov Tsepilov4, Yurii Aulchenko5, Lennart C. Karssen6

1 Laboratory of Recombination and Segregation Analysis, Institute of Cytology and Genetics, Novosibirsk, Russia, avnostaeva@gmail.com
2 Laboratory of Recombination and Segregation Analysis, Institute of Cytology and Genetics, Novosibirsk, Russia, shashkova@phystech.edu
3 Laboratory of Recombination and Segregation Analysis, Institute of Cytology and Genetics, Novosibirsk, Russia, sharapovsodbo@gmail.com
4 Laboratory of Recombination and Segregation Analysis, Institute of Cytology and Genetics, Novosibirsk, Russia, drosophila.simulans@gmail.com
5 Laboratory of Recombination and Segregation Analysis, Institute of Cytology and Genetics, Novosibirsk, Russia, yurii.aulchenko@gmail.com
6 PolyKnomics, ’s-Hertogenbosch, The Netherlands, l.c.karssen@polyknomics.com

A polygenic risk score (PRS) is a value that reflects a person’s predisposition to a disease or to any other trait that can be (partly) explained by genetic inheritance. PRSs are often used in reports provided by genetic testing companies such as 23andMe and Genotek. PRSs can also be used at the group level: the distributions of PRS values are compared between groups of people, for example between cases and controls in a case-control study, to find case-dependent traits. PRS models are usually based on summary statistics from genome-wide association studies (GWAS) and take the linkage disequilibrium (LD) structure into account.

We have created a pipeline for high performance PRS calculations across many traits present in the GWAS-MAP platform. The pipeline requires only individual-level data as input and lets the user select the list of traits to score. It will be helpful for scientific groups working with large amounts of individual genotype data, as well as for individuals who have their own personal genotype data.
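To make the notion of a PRS concrete, the sketch below computes the conventional additive score: for one individual, the sum over variants of the effect-allele dosage (0, 1 or 2) multiplied by a per-variant weight. It illustrates the general idea only and is not the pipeline's implementation. The function name, the variant identifiers and the weights are hypothetical, and the weights are assumed to have already been derived from GWAS summary statistics with LD taken into account.

```python
def polygenic_risk_score(dosages, weights):
    """Return the additive PRS for one individual.

    dosages: dict mapping variant ID -> effect-allele dosage (0, 1 or 2).
    weights: dict mapping variant ID -> per-variant weight (assumed to be
             LD-adjusted effect sizes; hypothetical values in this sketch).
    Variants missing from the individual's genotype data are skipped.
    """
    score = 0.0
    for variant_id, weight in weights.items():
        dosage = dosages.get(variant_id)
        if dosage is None:
            continue  # variant not genotyped for this individual
        score += weight * dosage
    return score


# Hypothetical weights for one trait and genotypes for two individuals.
trait_weights = {"rs0001": 0.5, "rs0002": -0.25, "rs0003": 1.0}
genotypes = {
    "person_a": {"rs0001": 2, "rs0002": 1},               # rs0003 missing
    "person_b": {"rs0001": 0, "rs0002": 2, "rs0003": 1},
}

scores = {
    person: polygenic_risk_score(dosages, trait_weights)
    for person, dosages in genotypes.items()
}
```

Scoring many traits then amounts to repeating the same weighted sum with one weight table per selected trait. Comparing the resulting score distributions between groups is the case-control use described in the abstract.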
