The growing field of genomics may produce more data in 10 years than huge services like YouTube and Twitter, according to a team of scientists. In a paper published in the open-access journal PLoS Biology, Zachary Stephens of the University of Illinois at Urbana-Champaign and others argue that as mapping genomes gets easier and faster, the amount and variety of data produced will exceed the ability to handle it. "Now is the time for concerted, community-wide planning for the 'genomical' challenges of the next decade," they write in the paper's abstract.
In addition to the many plants and animals having their genetic codes scrutinized, millions of humans, perhaps hundreds of millions, will also have their DNA sequenced. Not only will those billions of base pairs have to be stored, but they'll also need to be analyzed, compared with others, and transferred between the hospitals and research institutions where they're needed.
The data could reach into the exabytes — millions of times the storage space of ordinary computers — putting genomics in league with YouTube, Twitter and ultra-high-resolution astronomy sites.
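To put "millions of times" in concrete terms, here is a rough back-of-the-envelope sketch. It assumes an "ordinary computer" holds about 1 terabyte, a figure chosen for illustration and not taken from the paper, and it uses decimal (SI) units. The function name is likewise just an illustrative choice.

```python
# Back-of-the-envelope comparison of an exabyte with an ordinary computer.
# Assumption (not from the paper): an ordinary computer stores about 1 TB.

EXABYTE = 10**18   # bytes, decimal (SI) definition
TERABYTE = 10**12  # bytes, decimal (SI) definition


def times_larger(total_bytes: int, unit_bytes: int) -> float:
    """Return how many units of size unit_bytes fit into total_bytes."""
    return total_bytes / unit_bytes


if __name__ == "__main__":
    ratio = times_larger(EXABYTE, TERABYTE)
    print(f"One exabyte is about {ratio:,.0f} times a 1 TB computer")
```

Under that assumption, a single exabyte works out to roughly a million such machines, which matches the article's "millions of times" once the data reaches several exabytes.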
The infrastructure for that just doesn't exist right now, the team warns — even basic standards need hammering out, so competing companies and sequencers can work seamlessly together. In a Nature editorial, others dissented, saying the estimates may be high — but that the problem is still very real.