
How to Analyze Your 23andMe Raw Data (Privately, on Your Own Computer)

Download your 23andMe raw data, understand what the file really contains, and check it for health-relevant variants — without uploading it to anyone.

You paid for an ancestry test, read the curated reports, and then noticed the line in your account about raw data. That file holds far more than the reports 23andMe chooses to show you — and you can analyze it yourself without handing it to another website. Here is what the file is, how to get it, and how to read it without drawing conclusions it can't support.

What the 23andMe raw data file actually is

Your results come from a genotyping array, not whole-genome sequencing. The chip (currently the Illumina Global Screening Array, in use since 2017) reads roughly 640,000 pre-selected positions. That is about 0.02% of your ~3 billion base pairs. The rest of your genome is never read.

The download is a plain-text, tab-separated file, delivered zipped with a name that starts with genome. Past the header comments, every line is one marker with four columns:

  • rsid — the marker's identifier, usually a dbSNP rs number
  • chromosome
  • position — the base-pair location, on the GRCh37 / build 37 reference
  • genotype — the two bases you carry there, such as AG

So rs429358 19 45411941 CT means "at marker rs429358 on chromosome 19, you carry one C and one T." That is the entire vocabulary of the file. What a marker means comes from cross-referencing it against a database — the file itself only records letters and positions.
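The four-column layout is simple enough to parse in a few lines. Below is a minimal Python sketch, assuming comment lines start with # and data lines are tab-separated as described above. The names Marker and read_markers are illustrative and not part of any 23andMe tooling.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class Marker(NamedTuple):
    rsid: str
    chromosome: str
    position: int
    genotype: str


def read_markers(handle: TextIO) -> Iterator[Marker]:
    """Yield one Marker per data line, skipping comments and blank lines."""
    for line in handle:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # header comment or empty line (assumed '#' prefix)
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # skip anything that is not a four-column row
        rsid, chromosome, position, genotype = fields
        yield Marker(rsid, chromosome, int(position), genotype)


# A two-line stand-in for a real file: one comment, one marker.
sample = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs429358\t19\t45411941\tCT\n"
)
markers = list(read_markers(sample))
print(markers[0])
# Marker(rsid='rs429358', chromosome='19', position=45411941, genotype='CT')
```

For a real download, you would pass an open file handle to read_markers instead of the in-memory sample.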

How to download your raw data

  1. Sign in at 23andme.com and click your profile name (top right).
  2. Open Resources → Browse Raw Genotyping Data, then choose Download. The alternate path is Account Settings → 23andMe Data.
  3. Confirm the date of birth on the profile, read the notice, tick the box, and submit the request.
  4. 23andMe emails you when the file is ready — often within a day, sometimes longer.
  5. Find the genome*.zip in your downloads and unzip it. Keep it somewhere you control.

That last step matters more than it sounds. The moment the file leaves 23andMe's servers, protecting it is on you. Before you hand it to the next service, it's worth reading whether it's safe to upload your DNA at all.
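If you would rather script step 5 than unzip by hand, the sketch below uses Python's standard zipfile module to pull the text file out of the archive into a folder you choose. The function name extract_raw_data and the paths in the usage comment are hypothetical. The real archive name varies by profile.

```python
import zipfile
from pathlib import Path


def extract_raw_data(zip_path: Path, dest_dir: Path) -> list[Path]:
    """Extract every .txt member of the archive into dest_dir; return their paths."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".txt"):
                # ZipFile.extract returns the path of the file it wrote
                extracted.append(Path(archive.extract(name, dest_dir)))
    return extracted


# Hypothetical usage; substitute your own download and destination folder:
# files = extract_raw_data(Path("Downloads/genome_example.zip"), Path("dna"))
```

Extracting into a dedicated folder makes it easier to keep track of where copies of the file live.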

Reading it for health-relevant variants

The question most people actually have is whether any of their markers are flagged in ClinVar, the NIH's public archive of variant–condition relationships. ClinVar records whether a variant has been classified as benign, likely benign, of uncertain significance, likely pathogenic, or pathogenic, and how much expert review stands behind that call.

Matching your file against ClinVar by hand is impractical: it means looking up hundreds of thousands of positions one at a time. Software does it in seconds. BioDecode does this on your own machine: it reads the unzipped genome*.txt file, matches each position against a local copy of ClinVar, and shows you only the entries that line up, with no upload and no account. If that's the workflow you want, you can download BioDecode and run it offline.
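To make "matching each position" concrete, here is a minimal Python sketch of the general idea. It is not BioDecode's implementation. It assumes a hypothetical, pre-built lookup table keyed by (chromosome, position) on GRCh37, whereas real ClinVar releases need parsing first. Every row below is invented for illustration. A shared position alone is not a hit: the sketch also checks that your genotype actually contains the variant base.

```python
from typing import NamedTuple


class ClinVarEntry(NamedTuple):
    ref: str             # reference base at this position
    alt: str             # variant base the record describes
    classification: str  # e.g. "pathogenic", "benign"


def match_markers(markers, clinvar):
    """Return (rsid, genotype, alt_copies, classification) for carried variants.

    Handles single-base variants only; insertions and deletions need more care.
    """
    hits = []
    for rsid, chromosome, position, genotype in markers:
        entry = clinvar.get((chromosome, position))
        if entry is None:
            continue  # no record for this position in the local table
        alt_copies = genotype.count(entry.alt)
        if alt_copies:  # 1 = one copy carried, 2 = two copies
            hits.append((rsid, genotype, alt_copies, entry.classification))
    return hits


# Invented example rows; none of these describe real variants.
markers = [
    ("example_1", "1", 1000, "AG"),  # carries one copy of the G variant
    ("example_2", "2", 2000, "CC"),  # position is listed, variant not carried
    ("example_3", "3", 3000, "TT"),  # position not in the table at all
]
clinvar = {
    ("1", 1000): ClinVarEntry("A", "G", "pathogenic"),
    ("2", 2000): ClinVarEntry("C", "T", "likely pathogenic"),
}
hits = match_markers(markers, clinvar)
print(hits)
# [('example_1', 'AG', 1, 'pathogenic')]
```

Keying on position rather than rsid is a design choice in this sketch. Either way, the allele check is what separates "the chip looked here" from "you carry the flagged variant".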

However you do the matching, two cautions are not optional.

The two things that trip people up

Absence is not reassurance. The array only tests positions it was designed for. If a variant isn't in your file, that says nothing about whether you carry it — the chip simply may not look there. "Nothing found" is not a clean bill of health.
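When you look up a specific marker, it helps to separate three outcomes rather than two. The Python sketch below does that. It assumes you have already loaded the file into a dictionary from rsid to genotype, and that an unreadable marker is recorded with dashes (-- in 23andMe files). The function name lookup_status and the sample rows are illustrative.

```python
def lookup_status(genotypes: dict[str, str], rsid: str) -> str:
    """Describe what the file can and cannot say about one marker."""
    genotype = genotypes.get(rsid)
    if genotype is None:
        # The chip never looked here, so the file says nothing either way.
        return "not tested"
    if "-" in genotype:
        # The marker is on the chip, but the read failed for this sample.
        return "no call"
    return f"genotyped: {genotype}"


# Invented sample data for illustration.
genotypes = {"rs429358": "CT", "example_nocall": "--"}

print(lookup_status(genotypes, "rs429358"))        # genotyped: CT
print(lookup_status(genotypes, "example_nocall"))  # no call
print(lookup_status(genotypes, "example_absent"))  # not tested
```

Only the first outcome tells you anything about what you carry. The other two mean the question is still open.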

A match is not a diagnosis. Consumer raw data is uncurated; 23andMe states it is "suitable only for research, educational, and informational use." A 2018 study in Genetics in Medicine found that about 40% of variants flagged as concerning in direct-to-consumer raw data were false positives when checked against clinical-grade testing. A flag in your file is a lead to verify, not a verdict. Anything that looks medically significant should be confirmed by a CLIA-certified clinical laboratory and discussed with a genetics professional.

Held to those limits, your raw data is genuinely useful — a starting point for understanding your own genome on your own terms. For how the classifications themselves work, see how to find pathogenic variants in your raw DNA. If your data came from Ancestry instead, the AncestryDNA walkthrough covers the differences.

Frequently asked questions

Is the 23andMe raw data file the same as having my genome sequenced?

No. It is genotyping-array data covering roughly 640,000 chosen positions, not a sequence of your whole genome. It cannot report variants at positions the chip does not test.

What format is the file, and what opens it?

A tab-separated text file (genome*.txt) inside a zip. Any text editor can open it, but it is large and not meant to be read line by line — a tool that matches it against a variant database is far more useful.

Can I analyze it without uploading it anywhere?

Yes. Because it is just a local text file, desktop software can match it against a local copy of ClinVar entirely offline. BioDecode is built for exactly that.

A variant I care about isn't in my file — does that mean I don't have it?

No. The array only tests pre-selected positions. A missing marker means the chip didn't look there, not that the variant is absent.

Next step

See how BioDecode keeps genome analysis on your own machine.

Explore BioDecode

This article is educational and is not medical advice.