Science Progress | Where science, technology, and progressive policy meet

New Transparency for Genomic Data

The National Human Genome Research Institute recently posted a searchable database and spreadsheet of genome-wide association studies, or GWAS. The catalog includes data on 1309 single nucleotide polymorphisms, called SNPs, from articles in 296 publications. The table lists the traits studied in each paper, the sample size, the relevant genes, and the statistical significance. All of this contributes to greater transparency and cross-pollination in the field of personalized medicine.

According to the website, the data comes from literature searches, media reports, and “occasional comparisons with an existing database of GWAS literature,” the Human Genome Epidemiology Navigator. HuGE Navigator is a comprehensive site that connects genotypes and phenotypes to the research that demonstrates the links between them. HuGE Navigator also connects to GeneTests, which lists the labs that test for the genes and how effective the tests are. It also links to the Online Mendelian Inheritance in Man, or OMIM, database, which catalogs human genes and genetic disorders, and the Pharmacogenomics Knowledge Base, which links genes to drug interactions.

Of course, all of this data takes some sifting to make sense of. That is why the new GWAS list put up by NHGRI is useful. Not only does it establish connections between some of the genome research databases and interfaces, it also distills and clarifies the most robust and statistically significant data.

As we move into an era of personalized medicine, one of the most crucial challenges for the scientific community will be not simply to collect more data but to devise better ways of disseminating it and making it accessible. It will be interesting to see how the federal government’s different research institutes, as well as private research entities, choose to distill, repackage, and repurpose their data for different audiences. Researchers will need to watch out for pitfalls such as inaccurate oversimplifications and misinterpretations. Nevertheless, experiments with data dissemination should continue to evolve so that the research community and, increasingly, the clinical community can use genomic data easily, accurately, and appropriately.

