This may be the world's only modern biology book that costs thousands of dollars to read as well as write.
The new bio-digital book, coded from a Harvard University researcher's writings on synthetic biology, represents the largest amount of data ever written into DNA. Because of how costly and complex it is to read and write genetic material, DNA is still far from a practical storage drive. Yet as the price of synthesizing and sequencing DNA continues to drop, it may become an interesting way of storing data for the very long term, said Sriram Kosuri, a Harvard bioengineer who was one of the bio-digital book's creators.
"It brings a different perspective into the storage field," Kosuri told InnovationNewsDaily.
"At this point, it's very premature to hope that it would actually become something practical," said Stefano Lonardi, a computational biologist at the University of California, Riverside, who was not part of the Harvard effort. Nevertheless, Lonardi said, the work is a step toward DNA storage in the future. "These are things that people have to do first in order to get to something practical," he said.
Turning an e-book into DNA
To turn text and pictures into a double helix, the book had to undergo several translations. First, Kosuri and his colleagues wrote an HTML file of a draft of the book that Harvard bioengineer George Church was writing at the time. HTML is the language Web developers use to write websites.
The biologists then turned the HTML into binary, the 1s and 0s that computers read. They used the individual building blocks of DNA, commonly referred to by their one-letter initials, to represent those bits: A and C would each stand for a 0, while G and T would each stand for a 1. They then assembled strands of DNA representing their binary code.
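The bit-to-base mapping described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' actual software. The article does not say how the team picked between the two bases available for each bit, so the rule used here (switch to the alternate base whenever the first choice would repeat the previous base) is an assumption, and all function names are hypothetical.

```python
# Minimal sketch of a one-bit-per-base code: A or C means 0, G or T means 1.
ZERO_BASES = "AC"  # either base decodes to bit 0
ONE_BASES = "GT"   # either base decodes to bit 1


def bytes_to_bits(data: bytes) -> str:
    """Turn raw bytes (for example, an HTML file) into a string of 0s and 1s."""
    return "".join(f"{byte:08b}" for byte in data)


def bits_to_dna(bits: str) -> str:
    """Map each bit to a base.

    Assumption (not from the article): take the first base for the bit,
    but use the alternate one if that would repeat the previous base.
    """
    strand = []
    for bit in bits:
        options = ONE_BASES if bit == "1" else ZERO_BASES
        base = options[0]
        if strand and strand[-1] == base:
            base = options[1]
        strand.append(base)
    return "".join(strand)


def dna_to_bits(strand: str) -> str:
    """Read a strand back into bits; G and T are 1, A and C are 0."""
    return "".join("1" if base in ONE_BASES else "0" for base in strand)


def bits_to_bytes(bits: str) -> bytes:
    """Regroup a bit string (length a multiple of 8) into bytes."""
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    html = b"<p>DNA</p>"
    strand = bits_to_dna(bytes_to_bits(html))
    print(strand)
    print(bits_to_bytes(dna_to_bits(strand)))
```

Because two bases stand for each bit, the same binary file can be written as many different strands, and any of them decodes to the same data.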
One of the greatest challenges of building DNA from scratch is that it's expensive and difficult to create long, unbroken strings of the stuff. So Kosuri and his teammates decided to make a great many short pieces instead, tagging each piece with an address so that someone trying to read the strands would be able to put them in the correct order. Such pieces are easy for the latest DNA-reading technology, called next-generation sequencing, to process.
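The idea of short, addressed pieces can also be sketched in Python, working directly on a string of bits. The address width and payload size below are illustrative values chosen for the example, not figures from the article, and the function names are hypothetical.

```python
import random

ADDRESS_BITS = 8   # hypothetical address width, chosen for illustration
PAYLOAD_BITS = 16  # hypothetical number of data bits carried by each piece


def split_into_pieces(bits: str) -> list[str]:
    """Cut a long bit string into short pieces, each prefixed with its address."""
    pieces = []
    for index, start in enumerate(range(0, len(bits), PAYLOAD_BITS)):
        if index >= 2 ** ADDRESS_BITS:
            raise ValueError("message too long for the address width")
        address = format(index, f"0{ADDRESS_BITS}b")
        pieces.append(address + bits[start:start + PAYLOAD_BITS])
    return pieces


def reassemble(pieces: list[str]) -> str:
    """Sort pieces by address, strip the addresses, and join the payloads."""
    ordered = sorted(pieces, key=lambda piece: int(piece[:ADDRESS_BITS], 2))
    return "".join(piece[ADDRESS_BITS:] for piece in ordered)


if __name__ == "__main__":
    message = "".join(f"{byte:08b}" for byte in b"synthetic biology")
    pieces = split_into_pieces(message)
    random.shuffle(pieces)  # a sequencer returns pieces in no particular order
    print(reassemble(pieces) == message)
```

The address is what lets a reader ignore the order in which pieces come off the sequencer: sorting by address restores the original sequence of bits.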
"The novelty of their approach is that instead of generating one DNA string to store the whole book, they broke the book essentially into chunks," said Lonardi, who was part of a project in 2008 that encoded 12 bits of binary into DNA. "From my point of view, that's the major insight."
DNA as an ideal storage medium
While Kosuri's project is decidedly modern, taking advantage of a DNA-reading technique that has been available for only a few years, trying to put messages into DNA is a longtime pet project for biologists. In 1999, one of the first efforts encoded "JUNE 6 INVASION: NORMANDY," an actual spy message sent during World War II.
Biologists think DNA is a promising storage medium because it can theoretically hold so much data in a compact, durable form. "The density of the information is ridiculously high, much higher than a lot of experimental storage being discussed, mostly because it's 3D," Kosuri explained. DNA is able to store more data in a smaller space than hard disks and flash memory, and even more than storage methods under research such as quantum holography and 12-atom memory.
DNA can last a long time. Both Kosuri and Lonardi mentioned archaeological work that has yielded genetic material thousands of years old. "You could ship these on a spaceship and hopefully someone will find it," Lonardi said.
Another point many researchers have made is that people are likely to continue to work on DNA-reading technologies, so reading mechanisms may never become obsolete the way cassette decks and VCRs have.
The continued work may bring down the price of reading DNA dramatically. Over the past eight years, DNA sequencing has dropped in price tenfold, Kosuri said. It's difficult to predict if prices will continue to drop at the same rate in the future, but if they do, his technique could be affordable in another decade, he said.
Drawbacks to a DNA hard drive
DNA data storage still has a long way to go before it makes an appearance in the local Best Buy, however. Storing even a small amount of data is still costly. The book cost Kosuri and his colleagues thousands of dollars to synthesize and sequence, he said, even though it was less than a megabyte in size. Larger works would probably cost proportionately more to make, Lonardi said. Meanwhile, a $10 flash drive can store 16 gigabytes of data.
Kosuri's method is also not rewritable, so once some data has been stored, it can't be altered.
Computer scientists who work on improving data storage for the future generally don't look at DNA, said Darrell Long, director of the Storage Systems Research Center at the University of California, Santa Cruz.
Lonardi said biologists usually take on writing messages in DNA as a side project because the idea is cool and fun. Yet such just-for-kicks projects are important, he added. "We need this sort of pioneering work, and we'll need a lot of this before we can get to something practical."
Kosuri and his colleagues published a paper on their DNA book today (Aug. 16) in the journal Science.