How We Build a Polygenic Risk Score

greg@genomisaur.com | 2 min read

When you get your Genomisaur report, each condition comes with a polygenic risk score and a percentile ranking. What actually goes into calculating that number? Here's how we turn raw genetic data into a personalized risk score, step by step.

Step 1: Genome-Wide Association Studies

Everything starts with published research. Genome-wide association studies (GWAS) scan the genomes of hundreds of thousands — sometimes millions — of people to find genetic variants associated with a specific condition or trait.

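For each variant identified, the study reports an effect size: a number describing how strongly that variant is associated with the condition, typically a log odds ratio for a disease or a regression coefficient for a continuous trait. Most individual variants have tiny effects (shifting risk by a fraction of a percent), but there are thousands of them, and their contributions add up.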
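To make "tiny effects that add up" concrete, here is a minimal sketch. It assumes the study reports per-allele odds ratios, which is common for disease GWAS but not universal, and the four odds ratios are made up for illustration. The function name `odds_ratio_to_weight` is hypothetical.

```python
import math

def odds_ratio_to_weight(odds_ratio: float) -> float:
    """Convert a reported odds ratio into an additive per-allele weight (log odds)."""
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return math.log(odds_ratio)

# Hypothetical odds ratios for four variants; real models use thousands.
example_odds_ratios = [1.02, 1.01, 0.99, 1.03]
example_weights = [odds_ratio_to_weight(o) for o in example_odds_ratios]

# Summing log odds ratios is equivalent to multiplying the odds ratios,
# which is how many individually small effects combine into a larger one.
combined_odds_ratio = math.exp(sum(example_weights))
print(f"combined odds ratio: {combined_odds_ratio:.4f}")
```

Working on the log scale is what lets the later scoring step be a plain weighted sum.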

We only use peer-reviewed, well-replicated GWAS results. The quality of a PRS is bounded by the quality of the research it's built from.

Step 2: Your Genetic Data

When you upload your DNA file — from 23andMe, AncestryDNA, MyHeritage, or a whole genome sequencing provider — we extract the variants relevant to each PRS model.
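As a rough illustration of the extraction step, the sketch below pulls model-relevant variants out of a tab-separated raw data file. It assumes a 23andMe-style layout (comment lines starting with `#`, then rsid, chromosome, position, genotype, with `--` marking a no-call). Other providers use different layouts, and the rsids shown are invented. The function name `extract_model_variants` is hypothetical.

```python
from typing import Dict, Iterable, Set

def extract_model_variants(lines: Iterable[str], model_rsids: Set[str]) -> Dict[str, str]:
    """Return {rsid: genotype} for variants that the model needs and the file contains."""
    genotypes: Dict[str, str] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip malformed rows
        rsid, _chromosome, _position, genotype = fields[:4]
        # Keep only variants in the model, and drop no-calls such as "--".
        if rsid in model_rsids and "-" not in genotype:
            genotypes[rsid] = genotype
    return genotypes

# Invented file contents and model variant list, for illustration only.
sample_file = [
    "# rsid\tchromosome\tposition\tgenotype",
    "rs0000001\t1\t1000\tAG",
    "rs0000002\t1\t2000\t--",
    "rs0000003\t2\t3000\tCC",
    "rs0000004\t2\t4000\tTT",
]
model_rsids = {"rs0000001", "rs0000002", "rs0000004", "rs0000009"}
extracted = extract_model_variants(sample_file, model_rsids)
print(extracted)
```

Variants the model needs but the file lacks (here `rs0000002` and `rs0000009`) are the ones that would need to be filled in some other way.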

Different testing platforms genotype different sets of variants. Consumer arrays like 23andMe typically measure 600,000–700,000 variants, while whole genome sequencing captures essentially all of the variants in your genome. To bridge the gap, we use a technique called imputation: a statistical method that infers unmeasured variants from the ones that were measured, using patterns of linkage disequilibrium (the tendency of nearby variants to be inherited together) observed in reference populations.

Imputation increases the number of variants we can use in the calculation, which improves the accuracy of the final score.
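The intuition behind imputation can be shown with a deliberately simplified toy. Real imputation tools use far more sophisticated haplotype models across many variants at once, so this is not how production imputation works. The sketch assumes a single measured "tag" variant and a single unmeasured "target" variant, alleles coded 0 or 1, and an invented reference panel. All names here are hypothetical.

```python
from typing import List, Optional, Tuple

Haplotype = Tuple[int, int]  # (tag allele, target allele), each coded 0 or 1

def conditional_frequency(reference: List[Haplotype], tag_allele: int) -> Optional[float]:
    """Fraction of reference haplotypes carrying target allele 1, given the tag allele."""
    targets = [target for tag, target in reference if tag == tag_allele]
    if not targets:
        return None  # this tag allele was never observed in the reference
    return sum(targets) / len(targets)

def impute_dosage(reference: List[Haplotype], tag_genotype: int) -> Optional[float]:
    """Expected copies of target allele 1 (0.0 to 2.0) given 0, 1, or 2 copies of tag allele 1."""
    if tag_genotype not in (0, 1, 2):
        raise ValueError("tag genotype must be 0, 1, or 2")
    p_given_1 = conditional_frequency(reference, 1)
    p_given_0 = conditional_frequency(reference, 0)
    copies_of_1 = tag_genotype
    copies_of_0 = 2 - tag_genotype
    if (copies_of_1 and p_given_1 is None) or (copies_of_0 and p_given_0 is None):
        return None  # not enough reference information to infer anything
    return copies_of_1 * (p_given_1 or 0.0) + copies_of_0 * (p_given_0 or 0.0)

# Invented reference panel in which the two variants usually travel together.
toy_reference = [(1, 1)] * 45 + [(1, 0)] * 5 + [(0, 0)] * 40 + [(0, 1)] * 10
for tag_genotype in (0, 1, 2):
    print(tag_genotype, impute_dosage(toy_reference, tag_genotype))
```

Note that the result is a fractional "dosage" rather than a hard call, which reflects the uncertainty of inferring a variant that was never measured.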

Step 3: Scoring

With your complete set of variants in hand, calculating the PRS is conceptually straightforward. For each variant in the model, we count how many copies of the risk allele you carry (0, 1, or 2) and multiply that count by the effect size from the GWAS. Then we sum those weighted contributions across every variant in the model, often thousands of them.

The formula is: PRS = sum of (effect size × number of risk alleles) for every variant in the model. The result is a single number, your raw polygenic risk score for that condition.
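The weighted sum described above can be sketched in a few lines. This is an illustration under simplifying assumptions: genotypes are two-letter strings on the same strand as the model's risk allele, missing variants are simply skipped, and the rsids and effect sizes are invented. The function name `raw_prs` is hypothetical.

```python
from typing import Dict, Tuple

def raw_prs(model: Dict[str, Tuple[str, float]], genotypes: Dict[str, str]) -> Tuple[float, int]:
    """Sum effect size x risk-allele count over model variants present in the genotypes.

    model maps rsid -> (risk allele, effect size); genotypes maps rsid -> e.g. "AG".
    Returns the raw score and the number of model variants that contributed.
    """
    score = 0.0
    variants_used = 0
    for rsid, (risk_allele, effect_size) in model.items():
        genotype = genotypes.get(rsid)
        if genotype is None:
            continue  # variant not measured or imputed for this person
        risk_allele_count = genotype.count(risk_allele)  # 0, 1, or 2
        score += effect_size * risk_allele_count
        variants_used += 1
    return score, variants_used

# Invented three-variant model; real models contain thousands of variants.
toy_model = {
    "rs0000001": ("A", 0.10),
    "rs0000002": ("T", -0.05),
    "rs0000003": ("G", 0.02),
}
toy_genotypes = {"rs0000001": "AG", "rs0000002": "CT"}
toy_score, toy_used = raw_prs(toy_model, toy_genotypes)
print(f"raw PRS = {toy_score:.2f} from {toy_used} of {len(toy_model)} variants")
```

Returning the count of contributing variants alongside the score makes it easy to check later whether enough of the model was covered.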

Step 4: Percentile Ranking

A raw PRS isn't very interpretable on its own. Is 0.73 a high score or a low one? It depends on the distribution. To make scores meaningful, we compare your raw score against a reference panel — a pre-calculated set of scores from a large, well-characterized population.

Your percentile tells you where you fall in that distribution. A score higher than 85% of the reference panel puts you in the 85th percentile. We use ancestry-aware reference panels where possible to make the comparison more accurate for your genetic background.
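A percentile lookup of this kind might look like the sketch below. It assumes the reference panel is just a pre-sorted list of raw scores and defines the percentile as the share of reference scores strictly below yours. The ten panel values are invented, and the function name `percentile_rank` is hypothetical.

```python
from bisect import bisect_left
from typing import Sequence

def percentile_rank(raw_score: float, sorted_reference_scores: Sequence[float]) -> float:
    """Percentage of reference scores strictly below raw_score (panel must be sorted)."""
    if not sorted_reference_scores:
        raise ValueError("reference panel is empty")
    below = bisect_left(sorted_reference_scores, raw_score)
    return 100.0 * below / len(sorted_reference_scores)

# Invented ten-person panel; a real panel would contain many thousands of scores.
toy_panel = sorted([0.10, 0.25, 0.40, 0.55, 0.61, 0.68, 0.70, 0.79, 0.88, 0.95])
example_percentile = percentile_rank(0.73, toy_panel)
print(f"a raw score of 0.73 sits at percentile {example_percentile:.0f} of this toy panel")
```

An ancestry-aware version could keep one sorted panel per ancestry group and pick the appropriate list before calling the same function.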

Step 5: Quality Control

Before any score lands in your report, it goes through quality checks. We verify that enough variants were measured or imputed, that the input data passes format and integrity checks, and that the scores fall within expected ranges. If a particular score can't be calculated reliably for your data, we flag it rather than show a misleading result.
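Two of these checks, variant coverage and score range, can be sketched as follows. The 90% coverage threshold and the expected range are placeholder assumptions chosen for illustration, not the actual cutoffs, and the function name `quality_flags` is hypothetical. File format and integrity checks are left out.

```python
from typing import List, Tuple

def quality_flags(
    variants_in_model: int,
    variants_available: int,
    raw_score: float,
    expected_range: Tuple[float, float],
    min_coverage: float = 0.9,  # placeholder threshold, an assumption
) -> List[str]:
    """Return a list of human-readable problems; an empty list means the score passed."""
    if variants_in_model <= 0:
        return ["model has no variants"]
    flags: List[str] = []
    coverage = variants_available / variants_in_model
    if coverage < min_coverage:
        flags.append(f"low variant coverage: {coverage:.0%}")
    low, high = expected_range
    # Written as a negated range test so that NaN scores are flagged too.
    if not (low <= raw_score <= high):
        flags.append("score outside expected range")
    return flags

print(quality_flags(1000, 950, 0.73, (-2.0, 2.0)))  # passes both checks
print(quality_flags(1000, 500, 5.0, (-2.0, 2.0)))   # fails both checks
```

A score with any flags would be withheld or marked in the report rather than shown as if it were reliable.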

Why This Matters

The PRS pipeline is a chain of evidence-based steps, each one grounded in published research. It is not a black box; every step has a paper trail. Genomisaur calculates 71 polygenic risk scores across 14 health categories using this pipeline, drawing on studies that involved millions of participants.