
How to Find Pathogenic Variants in Your Raw DNA Data

What 'pathogenic' really means, how ClinVar classifies variants, and how to check your raw DNA for flagged markers without over-interpreting the result.

"Pathogenic" is a heavy word, and seeing it next to one of your own markers can be alarming. It shouldn't be — not on its own. Understanding how that label is assigned, and how much weight a raw-data match really carries, is the difference between a useful starting point and an unnecessary scare. Here's how to check your raw DNA for flagged variants and read the result honestly.

What "pathogenic" actually means

A variant's classification doesn't describe you; it describes how much evidence links that variant to disease in the scientific literature. The standard framework comes from a 2015 joint guideline by the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP). It sorts variants into five tiers:

  • Pathogenic — strong evidence the variant causes or strongly contributes to disease
  • Likely pathogenic — good but not definitive evidence
  • Uncertain significance (VUS) — not enough evidence to call either way
  • Likely benign — probably harmless
  • Benign — harmless

Two things follow from this. First, a VUS is genuinely unknown, not a quiet "maybe bad": many VUS entries are never reclassified, and of those that are, most move toward benign as evidence accumulates. Second, even "pathogenic" describes the variant's general association with a condition, not your personal certainty of developing it. That depends on the gene, the inheritance pattern, other variants, and factors no single marker captures.
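As a rough illustration of treating the five tiers as distinct categories, here is a minimal Python sketch. The names `Tier`, `parse_tier`, and `is_reportable` are hypothetical and exist only for this example. The sketch also assumes labels may arrive with underscores in place of spaces, and it deliberately returns nothing for any label outside the five tiers instead of guessing.

```python
from enum import Enum
from typing import Optional


class Tier(Enum):
    """The five classification tiers listed above."""
    PATHOGENIC = "Pathogenic"
    LIKELY_PATHOGENIC = "Likely pathogenic"
    VUS = "Uncertain significance"
    LIKELY_BENIGN = "Likely benign"
    BENIGN = "Benign"


def parse_tier(label: str) -> Optional[Tier]:
    """Map a classification label to one of the five tiers, or None.

    Labels outside the five tiers (combined or conflicting calls, for
    example) return None so they are handled explicitly, not guessed at.
    """
    normalized = label.strip().replace("_", " ").lower()
    for tier in Tier:
        if tier.value.lower() == normalized:
            return tier
    return None


def is_reportable(tier: Optional[Tier]) -> bool:
    """Treat only the top two tiers as candidates for follow-up."""
    return tier in (Tier.PATHOGENIC, Tier.LIKELY_PATHOGENIC)
```

Note that a VUS is not reportable here: it is kept as its own category, not folded into "probably bad".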

Where the labels come from: ClinVar

These classifications are aggregated in ClinVar, a free public archive run by the NIH's National Center for Biotechnology Information. Labs, researchers, and expert panels submit their interpretations of specific variants, and ClinVar records them along with a review status, shown as zero to four stars, that indicates how much review and agreement backs each classification. A multi-star, expert-reviewed pathogenic entry is on far firmer footing than a single unreviewed submission. When you check your DNA, you're really asking: do any of my markers appear in ClinVar, and what's the quality of the evidence behind them?
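To make the star system concrete, the sketch below maps ClinVar review-status strings to a star count. The mapping reflects ClinVar's published review-status levels as I understand them, and the exact wording has changed over time, so check it against the current ClinVar documentation before relying on it. The names `REVIEW_STARS` and `stars_for` are hypothetical.

```python
# Review-status text -> star count. Wording has varied across ClinVar
# releases, so both the older and newer "conflicting" phrasings appear.
REVIEW_STARS = {
    "practice guideline": 4,
    "reviewed by expert panel": 3,
    "criteria provided, multiple submitters, no conflicts": 2,
    "criteria provided, conflicting classifications": 1,
    "criteria provided, conflicting interpretations": 1,
    "criteria provided, single submitter": 1,
    "no assertion criteria provided": 0,
    "no classification provided": 0,
}


def stars_for(review_status: str) -> int:
    """Return the star count for a review status.

    Underscores are treated as spaces and whitespace is collapsed, so
    underscore-separated forms are accepted too. Unrecognized statuses
    get 0 stars, the most cautious reading.
    """
    normalized = " ".join(review_status.replace("_", " ").lower().split())
    return REVIEW_STARS.get(normalized, 0)
```

Defaulting unknown statuses to zero stars is a design choice: it means a status the table does not recognize is never treated as stronger evidence than it might be.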

Checking your own file

Your raw file — whether a 23andMe or AncestryDNA text export or a sequencing VCF — is just positions and the bases you carry. Finding flagged variants means matching every one of those positions against ClinVar. That's not a manual job; it's hundreds of thousands of lookups for an array file and potentially millions for a VCF.
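The matching step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular tool works: it assumes a tab-separated export with four columns (rsid, chromosome, position, genotype) and `#` comment lines, a small in-memory ClinVar lookup keyed by chromosome and position, and that both sides use the same reference genome build and strand. All names and sample rows below are made up for the example.

```python
import io
from typing import Dict, Iterable, Iterator, List, NamedTuple, TextIO, Tuple


class Call(NamedTuple):
    rsid: str
    chrom: str
    pos: int
    genotype: str  # e.g. "AG"; "--" marks a no-call


class ClinVarEntry(NamedTuple):
    ref: str
    alt: str
    classification: str


def read_array_export(handle: TextIO) -> Iterator[Call]:
    """Yield calls from a four-column, tab-separated array export."""
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 4 or not fields[2].isdigit():
            continue  # skip rows in another layout instead of guessing
        rsid, chrom, pos, genotype = fields
        yield Call(rsid, chrom, int(pos), genotype)


def find_matches(
    calls: Iterable[Call],
    clinvar: Dict[Tuple[str, int], List[ClinVarEntry]],
) -> List[Tuple[Call, ClinVarEntry]]:
    """Return (call, entry) pairs where the genotype carries the alt allele."""
    hits = []
    for call in calls:
        for entry in clinvar.get((call.chrom, call.pos), []):
            # A shared position is not enough: the genotype must contain
            # the alternate allele. Only single-base changes are handled.
            is_snv = len(entry.ref) == 1 and len(entry.alt) == 1
            if is_snv and entry.alt in call.genotype:
                hits.append((call, entry))
    return hits


# Made-up rows and lookup entries, for illustration only.
sample = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAG\n"
    "rs0000002\t2\t2000\tCC\n"
    "rs0000003\t3\t3000\t--\n"
)
clinvar = {
    ("1", 1000): [ClinVarEntry("A", "G", "Pathogenic")],
    ("2", 2000): [ClinVarEntry("C", "T", "Pathogenic")],
}
hits = find_matches(read_array_export(sample), clinvar)
```

In this toy data only the first row is a hit. The second row sits at a position ClinVar lists, but the genotype carries two copies of the reference base, which is why matching on position alone would overcount.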

This is the job BioDecode does. It reads your file, matches each position against a local copy of ClinVar on your own machine, and shows you the variants with a recorded classification, along with the review status, so you can weigh the evidence and not just the label. Nothing is uploaded. If you want to run that check yourself, you can download BioDecode. New to your file format? Start with the 23andMe, AncestryDNA, or VCF walkthrough.

The part that keeps you out of trouble

Finding a pathogenic-labeled marker in raw data is the beginning of a question, not the end of one. Two facts keep it in perspective.

First, consumer raw data isn't clinically validated. 23andMe itself says its raw data is for "research, educational, and informational use." A 2018 study in Genetics in Medicine found that roughly 40% of variants flagged as concerning in direct-to-consumer raw data turned out to be false positives when re-tested with clinical-grade methods. The genotyping chip can misread a position, and that misread can look exactly like a real pathogenic variant.

Second, classification and review status matter as much as the label. A pathogenic call with one star and no expert review is weaker evidence than the same call reviewed by an expert panel. ClinVar shows you that context; use it. And if a marker you expected isn't listed at all, that's a different situation with its own checklist — see what to do when a variant isn't found in ClinVar.

So the responsible workflow is: use your raw data to flag candidates, note the classification and how well-reviewed it is, and treat anything potentially serious as something to confirm in a CLIA-certified clinical laboratory and discuss with a genetic counselor or physician. It is never a diagnosis on its own. Read that way, checking your raw DNA is a genuinely useful, private first look, and keeping the file off the cloud keeps that look entirely yours.
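That workflow can be expressed as a small triage step. The sketch below is a hypothetical illustration: `Hit`, `FOLLOW_UP`, and `triage` are names invented for this example, and the variant identifiers are placeholders. It keeps only pathogenic and likely pathogenic hits, orders them by review stars, and attaches the same reminder to every one of them.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    variant: str         # any identifier for the matched variant
    classification: str  # e.g. "Pathogenic"
    stars: int           # ClinVar review stars, 0 to 4


# Classifications worth carrying forward to clinical confirmation.
FOLLOW_UP = {"pathogenic", "likely pathogenic"}


def triage(hits: List[Hit]) -> List[str]:
    """Return follow-up notes, best-reviewed candidates first."""
    candidates = [h for h in hits if h.classification.lower() in FOLLOW_UP]
    candidates.sort(key=lambda h: h.stars, reverse=True)
    return [
        f"{h.variant}: {h.classification}, {h.stars}-star review; "
        "unconfirmed array result, confirm in a clinical lab"
        for h in candidates
    ]


# Placeholder hits, for illustration only.
notes = triage([
    Hit("variant-A", "Pathogenic", 1),
    Hit("variant-B", "Uncertain significance", 2),
    Hit("variant-C", "Likely pathogenic", 3),
])
```

Every note carries the confirmation reminder regardless of star count, since a well-reviewed ClinVar entry says nothing about whether the chip read your position correctly.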

Frequently asked questions

Does a pathogenic variant in my raw data mean I have the disease?

No. The label describes the variant's evidence-based association with a condition, not your personal outcome. Many factors determine actual risk, and consumer raw data can produce false positives, so a flag should be confirmed clinically before it means anything for your health.

What's the difference between pathogenic and a VUS?

A pathogenic classification reflects strong evidence linking the variant to disease. A variant of uncertain significance (VUS) means there isn't enough evidence to classify it either way. It is unknown, not secretly harmful, and when a VUS is reclassified it more often moves toward benign than toward pathogenic.

How reliable is a ClinVar match from raw data?

It depends on two things: ClinVar's review status (how much expert review and agreement backs the classification) and whether the chip read your position correctly. A well-reviewed, multi-star entry confirmed by a clinical lab is reliable; a single-submission entry from an unconfirmed array read is not.

Can I check for pathogenic variants without uploading my DNA?

Yes. BioDecode matches your file against a local copy of ClinVar entirely on your own computer, so no upload is needed.

Next step

See how BioDecode keeps genome analysis on your own machine.


This article is educational and is not medical advice.