Genome in a Bottle
Authoritative Characterization of Benchmark Human Genomes
The Genome in a Bottle (GIAB) Consortium is a public-private-academic consortium hosted by NIST to develop the technical infrastructure (reference standards, reference methods, and reference data) to enable translation of whole human genome sequencing to clinical practice. The priority of GIAB is authoritative characterization of human genomes for use in analytical validation and technology development, optimization, and demonstration. In 2015, NIST released the pilot genome Reference Material 8398, which is genomic DNA (NA12878) derived from a large batch of the Coriell cell line GM12878, characterized for high-confidence SNPs, indels, and homozygous reference regions (Zook, et al., Nature Biotechnology 2014 and Zook, et al., bioRxiv 2018).
There are four new GIAB reference materials available. With the addition of these new reference materials (RMs) to a growing collection of “measuring sticks” for gene sequencing, we can now provide laboratories with even more capability to accurately “map” DNA for genetic testing, medical diagnoses and future customized drug therapies. The new tools feature sequenced genes from individuals in two genetically diverse groups, Asians and Ashkenazi Jews; a father-mother-child trio set from Ashkenazi Jews; and four microbes commonly used in research. For more information click here. To purchase them, visit:
Data and analyses are publicly available (GIAB GitHub). A description of data generated by GIAB is published here. To establish best practices for using GIAB genomes for benchmarking, we have worked with the Global Alliance for Genomics and Health Benchmarking Team (benchmarking tools and manuscript).
High-confidence small variant and homozygous reference calls are available for NA12878, the Ashkenazim trio, and the Chinese son with respect to GRCh37 and GRCh38 (Zook, et al., bioRxiv 2018). The latest version of these calls is under the latest directory for each genome on the GIAB FTP. Current work in the GIAB Analysis Team is focused on establishing benchmark large indel and structural variant calls, as well as calls in difficult genomic contexts (e.g., difficult-to-map regions, tandem repeats).
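As a rough illustration of how high-confidence calls and regions can be used for benchmarking, the Python sketch below scores a query call set against a benchmark call set, counting only variants that fall inside the high-confidence regions. Inside those regions, any position without a benchmark variant is treated as homozygous reference, so an extra query call there counts as a false positive. The names `Variant`, `Region`, `in_regions`, and `benchmark` are hypothetical and chosen for this example; they are not part of any GIAB or GA4GH tool. The comparison is a simplified exact match on chromosome, position, and alleles. It ignores genotypes and differences in variant representation, which real benchmarking tools must handle.

```python
from typing import Dict, NamedTuple, Sequence


class Variant(NamedTuple):
    chrom: str
    pos: int  # 1-based position, as in VCF
    ref: str
    alt: str


class Region(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # 0-based, exclusive (BED convention)


def in_regions(v: Variant, regions: Sequence[Region]) -> bool:
    """Return True if the variant's position lies inside any region.

    A 1-based position p corresponds to 0-based p - 1, so the test
    start <= p - 1 < end is written as start < p <= end.
    """
    return any(r.chrom == v.chrom and r.start < v.pos <= r.end for r in regions)


def benchmark(
    truth: Sequence[Variant],
    query: Sequence[Variant],
    regions: Sequence[Region],
) -> Dict[str, float]:
    """Count TP/FP/FN inside the regions and derive precision and recall."""
    # Variants outside the high-confidence regions are not assessed at all.
    truth_in = {v for v in truth if in_regions(v, regions)}
    query_in = {v for v in query if in_regions(v, regions)}

    tp = len(truth_in & query_in)   # called and in the benchmark
    fp = len(query_in - truth_in)   # called, but benchmark says hom-ref
    fn = len(truth_in - query_in)   # in the benchmark, but not called

    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    regions = [Region("chr1", 0, 1000)]
    truth = [Variant("chr1", 100, "A", "G"), Variant("chr1", 200, "C", "T")]
    query = [Variant("chr1", 100, "A", "G"), Variant("chr1", 300, "G", "A")]
    print(benchmark(truth, query, regions))
```

In practice the benchmark calls and regions would be read from the VCF and BED files distributed for each genome rather than typed in by hand; file parsing is left out to keep the sketch short.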
The consortium was initiated in a set of meetings in 2011 and 2012. It holds open, public workshops, typically once a year, at Stanford University in Palo Alto, CA, or at NIST in Gaithersburg, MD. Slides from workshops and conferences are available here. The consortium is open and welcomes new participants.