GRCh37 vs GRCh38: Why genome build matters before checking a variant

Learn how GRCh37 vs GRCh38 affects variant lookup, where to find build clues, and why CHROM, POS, REF, and ALT should be checked before using ClinVar context.

You paste chr13:32316461 from a VCF into ClinVar or a genome browser. One resource returns nothing. Another seems to show a nearby position. A third shows a record, but the allele does not look like your file. Before assuming the file is wrong, ask a narrower question: are you comparing the same address system and the same allele spelling? In GRCh37 vs GRCh38 lookups, coordinate matching comes before interpretation.

The failed lookup: one fictional VCF row

Use this fictional row only as a lookup example, not as a disease example:

CHROM=chr13, POS=32316461, ID=., REF=G, ALT=A

VCF (Variant Call Format) is a standard text format for representing genetic variation. Each data row stores the coordinate (CHROM and POS), an optional identifier (ID), and the alleles (REF and ALT) in separate columns. That separation is useful, but it also invites a common shortcut: copying only the chromosome and position.

| Piece | In the sample row | How to use it in a lookup |
|---|---|---|
| CHROM | chr13 | The chromosome label used by the file or tool |
| POS | 32316461 | The position number on a particular reference build |
| ID | . | No identifier is being used in this sample field |
| REF | G | The reference allele spelling to compare |
| ALT | A | The alternate allele spelling to compare |
| Reference build | Not shown in this row | The coordinate system that makes the position meaningful |

A better working note is not just chr13:32316461. Write: original row chr13, 32316461, REF=G, ALT=A, build not yet documented. That keeps the file evidence separate from whatever a search tool returns later.
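The working note can be sketched as a small record type. This is a minimal illustration, not part of any tool mentioned here: `LookupNote` and `parse_row` are hypothetical names, and the input is assumed to be a tab-separated VCF data line.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LookupNote:
    """Original VCF fields plus an explicit build status (hypothetical helper)."""
    chrom: str
    pos: int
    vcf_id: str
    ref: str
    alt: str
    build: Optional[str] = None  # None means "build not yet documented"

    def describe(self) -> str:
        build = self.build if self.build else "build not yet documented"
        return (f"original row {self.chrom}, {self.pos}, "
                f"REF={self.ref}, ALT={self.alt}, {build}")


def parse_row(line: str) -> LookupNote:
    """Read the first five tab-separated VCF columns without altering them."""
    chrom, pos, vcf_id, ref, alt = line.rstrip("\n").split("\t")[:5]
    return LookupNote(chrom, int(pos), vcf_id, ref, alt)


# The fictional row from this article, written as a tab-separated line.
note = parse_row("chr13\t32316461\t.\tG\tA\t.\t.\t.")
print(note.describe())
```

Keeping `build` as an explicit field that starts out empty makes it harder to forget that the row itself never stated a build.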

What GRCh37 vs GRCh38 changes

The Genome Reference Consortium maintains the human reference assembly. Labels such as GRCh37 and GRCh38 identify different releases of that assembly. The practical point is simple: a position number is an address on one specific reference assembly. The same chromosome-and-position text can point to a different base in another build, so do not assume it returns the same lookup result under every build setting.

That does not mean every failed search is a build mismatch. The problem could be the file metadata, the lookup resource setting, the way the coordinate was copied, or an allele mismatch. You may also see the labels hg19 and hg38 in tools; these are the UCSC names that generally correspond to GRCh37 and GRCh38. Treat them as build clues to verify in the specific tool or documentation, not as something to infer from the coordinate alone.

Why chr, rsID, and gene name are not build proof

Three search aids can be useful without solving the build question.

  • A chr prefix can help with chromosome naming style, but it is not a reference genome build label.
  • An rsID can help you navigate to a record, but it should not replace the CHROM, POS, REF, ALT, and build checks.
  • A gene name can lead to many records. It is not one exact coordinate-and-allele statement.

This is workflow guidance, not a claim that these labels are useless. Use them after you have preserved the original row, and keep checking whether the returned record actually matches the file fields.
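As a small illustration of the first point, chromosome labels can be compared without reading anything about the build into the prefix. The helper names below are hypothetical, and the sketch only handles the plain `chr` prefix; other naming differences, such as mitochondrial labels, are deliberately out of scope.

```python
def strip_chr_prefix(chrom: str) -> str:
    """Return the label without a leading 'chr', for comparison only."""
    return chrom[3:] if chrom.lower().startswith("chr") else chrom


def same_chromosome(label_a: str, label_b: str) -> bool:
    """Compare two labels while ignoring the naming style.

    This says nothing about the reference build of either label.
    """
    return strip_chr_prefix(label_a).upper() == strip_chr_prefix(label_b).upper()


print(same_chromosome("chr13", "13"))
print(same_chromosome("chr13", "14"))
```

The original label stays in your notes unchanged; the stripped form is used only at comparison time.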

A build-check workflow before variant lookup

Use this order before deciding that a lookup has failed:

  1. Preserve the original row. Copy CHROM, POS, ID, REF, and ALT exactly as written.
  2. Inspect the VCF header. Look for metadata that mentions reference, assembly, contigs, source, or export pipeline.
  3. Check provider documentation and download notes. The build may be described outside the data rows.
  4. Check the lookup tool setting. A browser or database may have a selected genome, assembly, or build option.
  5. Search in the same build context when you can. Then compare the returned CHROM, POS, REF, and ALT.
  6. If the build is not documented, label it unknown. Do not treat a missing reference header as proof that the file uses GRCh37.
  7. If a coordinate was converted or remapped, keep the original coordinate, original build if known, REF, and ALT beside the converted value.

This workflow is meant to prevent silent substitutions. A converted coordinate may be useful for navigation, but it is not stronger evidence than the original file supports.
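Steps 2 and 6 can be sketched as a header scan that collects clues and reports "unknown" when none are found. This is a rough sketch under assumptions: the keyword list is taken loosely from step 2, the function names are hypothetical, and the example header is made up, with a placeholder reference path. A clue still has to be read and verified by a person.

```python
import re
from typing import Iterable, List

# Keywords loosely based on step 2; this list is an assumption for the sketch.
CLUE_PATTERN = re.compile(r"reference|assembly|contig|source", re.IGNORECASE)


def header_build_clues(lines: Iterable[str]) -> List[str]:
    """Collect '##' meta lines that mention reference-related keywords."""
    clues = []
    for line in lines:
        if not line.startswith("##"):
            break  # meta lines come first; stop at '#CHROM' or data rows
        if CLUE_PATTERN.search(line):
            clues.append(line.rstrip("\n"))
    return clues


def build_status(clues: List[str]) -> str:
    """Report 'unknown' when nothing is documented; never default to a build."""
    return "clues found, verify manually" if clues else "unknown"


# Made-up header for illustration only.
example_header = [
    "##fileformat=VCFv4.2",
    "##reference=file:///path/to/reference.fa",
    "##contig=<ID=chr13>",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
]
clues = header_build_clues(example_header)
print(build_status(clues))
for clue in clues:
    print(clue)
```

Note that the function never returns a build name. It only surfaces lines worth reading, which keeps the decision with the documentation rather than with a guess.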

Common trap: treating the first match as the answer

The most practical mistakes are shortcuts that make a lookup look cleaner than it really is:

  • Assuming GRCh37, GRCh38, hg19, or hg38 positions are interchangeable across resources.
  • Searching an rsID and skipping the build and REF/ALT comparison.
  • Treating a missing reference or build header as proof of GRCh37 instead of marking the build unknown.
  • Looking only at chromosome and position while ignoring REF and ALT.
  • Reading a converted coordinate as if conversion added clinical certainty.
  • Treating a ClinVar search result as a personal medical answer rather than database context that still needs careful review.

When in doubt, return to the original row. If your notes no longer show the original CHROM, POS, REF, ALT, and build status, the lookup has become harder to audit.

Where ClinVar fits after sanity checking

A clean coordinate-and-allele match answers a limited lookup question: the file row and the resource result appear to be describing the same address and allele in the same build context. It does not diagnose a condition, predict risk, or validate that the file is suitable for clinical use.
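A field-by-field comparison of this kind might look like the sketch below. The function name and dictionary keys are hypothetical, and the "returned" record is invented purely to show a mismatch; it does not describe any real database entry. An empty result means only that the fields agree, not that anything has been validated.

```python
from typing import Dict, List


def compare_record(file_row: Dict[str, object],
                   returned: Dict[str, object]) -> List[str]:
    """List the ways a returned record fails to match the original row."""
    issues = []
    if file_row.get("build") is None or returned.get("build") is None:
        issues.append("build unknown on at least one side")
    elif file_row["build"] != returned["build"]:
        issues.append("build differs")
    for field in ("chrom", "pos", "ref", "alt"):
        if file_row.get(field) != returned.get(field):
            issues.append(f"{field} differs")
    return issues


# Original fictional row; the build was never documented.
file_row = {"chrom": "chr13", "pos": 32316461, "ref": "G", "alt": "A",
            "build": None}
# Invented lookup result, for illustration only.
returned = {"chrom": "chr13", "pos": 32316461, "ref": "G", "alt": "T",
            "build": "GRCh38"}

for issue in compare_record(file_row, returned):
    print(issue)
```

Here the position matches, yet the comparison still flags two problems: the file's build is undocumented and the alternate allele differs.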

ClinVar fits after these sanity checks. ClinVar is a public NCBI archive of submitted variant interpretations, and its help pages explain how to read those records. If you find a ClinVar page, read it as database context linked to a record. Do not treat the page as a personal medical conclusion, and do not assume it applies to your file unless the build and allele comparison still make sense.

Decision aid: when a build mismatch may explain a failed variant lookup

| Lookup symptom | First check | Why it matters | Safe next step |
|---|---|---|---|
| No result for chr13:32316461 | File build and resource assembly | The coordinate is an address on a reference assembly | Confirm the build before changing the coordinate |
| Result appears at a different position | Build labels, then REF/ALT | Different build contexts can make comparison misleading | Do not assume the file is wrong yet |
| chr prefix but no build | Header, documentation, export settings | Naming style is not a build label | Treat the build as unknown until documented |
| rsID search returns a record | Original CHROM/POS/REF/ALT | An identifier should not replace allele checks | Keep the original VCF row in your notes |
| Gene search finds many records | Exact row fields and build | A gene name is not one variant statement | Use the gene only for navigation |
| No obvious reference header | Provider or export documentation | Missing metadata is not evidence for GRCh37 | Avoid defaulting to a build |
| Coordinate was converted | Original and converted coordinates | Conversion is a lookup step, not validation | Use the converted value cautiously |
| ClinVar page is found | Displayed context and alleles | ClinVar context follows sanity checks | Do not make a personal medical conclusion |

Next step: keep the review local and educational

If you are learning how to inspect supported DNA or genome files with ClinVar-linked context, BioDecode's guide explains a local educational review workflow. It does not diagnose, predict disease risk, or resolve every metadata gap. Start with the same discipline: original row, documented build when available, allele comparison, then context.

This article is educational and is not medical advice.