June 25, 2007 at 9:25 PM ET
Computer cases are lined up in CERN's Computing Center with room to spare.
The World Wide Web was born in 1990 to manage the billions of bytes of data from experiments at CERN, Europe's particle-physics laboratory. Now the same laboratory is gearing up for a new round of experiments that could generate more than a quadrillion bytes of data every month - data that will have to be processed and delivered to researchers around the world. Is there anything in sight that could outdo the Web? Say hello to the Grid.
CERN has been working for a decade on the foundations for the Grid, which is a next-generation network that draws upon storage space as well as processing power from linked computers. That's about as long as it's been preparing for the Large Hadron Collider, or LHC.
The LHC’s experiments will be taking in the details from millions of proton collisions every second. A lot of the less interesting data will be filtered out almost immediately by the "trigger" computer programs overseeing each experiment. Nevertheless, hundreds of megabytes of data will be dumped into CERN's central computer system every second.
"That means you fill a DVD in a few seconds," Francois Grey of CERN’s Computing Center told me. Over the course of a year, the center is due to store 15 billion megabytes of data – which you can also think of as 15 petabytes, 15 quadrillion bytes or (according to CERN) roughly 1 percent of all the information generated by humanity in the course of a year.
All those data will have to be available on demand for the 7,000 researchers around the globe who are slaving away on the LHC experiments. The information flow is expected to rise to 1.6 gigabytes per second, or roughly 1,000 times faster than your typical high-end cable Internet connection.
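For readers who like to check the math, here is a rough back-of-the-envelope sketch of the figures quoted above. It uses decimal units throughout. The DVD capacity (4.7 GB, single layer), the 300-megabyte-per-second ingest rate (one reading of "hundreds of megabytes") and the 10-megabit-per-second cable speed are assumptions picked for illustration, not numbers supplied by CERN.

```python
# Back-of-the-envelope check of the data figures (decimal units).
MB = 10**6
GB = 10**9
PB = 10**15

# From the article: 15 billion megabytes stored per year.
yearly_bytes = 15 * 10**9 * MB
yearly_petabytes = yearly_bytes / PB
monthly_bytes = yearly_bytes / 12

# Assumptions for illustration only (not from the article):
DVD_BYTES = 4_700_000_000          # single-layer DVD, about 4.7 GB
INGEST_BYTES_PER_S = 300 * MB      # one reading of "hundreds of megabytes"
CABLE_BITS_PER_S = 10 * 10**6      # a fast 2007-era cable line, 10 Mbit/s


def seconds_to_fill(capacity_bytes, rate_bytes_per_s):
    """Time needed to write capacity_bytes at a steady rate."""
    return capacity_bytes / rate_bytes_per_s


def speedup_over(link_bits_per_s, flow_bytes_per_s):
    """How many times faster a byte flow is than a link rated in bits/s."""
    return (flow_bytes_per_s * 8) / link_bits_per_s


dvd_seconds = seconds_to_fill(DVD_BYTES, INGEST_BYTES_PER_S)
cable_ratio = speedup_over(CABLE_BITS_PER_S, 1_600_000_000)  # 1.6 GB/s

print(f"{yearly_petabytes:.0f} PB per year")
print(f"{monthly_bytes / PB:.2f} PB per month")
print(f"One DVD every {dvd_seconds:.1f} seconds")
print(f"About {cable_ratio:.0f} times a 10 Mbit/s cable line")
```

Under those assumptions, the yearly total works out to 15 petabytes, the monthly total clears a quadrillion bytes, and the 1.6-gigabyte-per-second flow lands in the neighborhood of 1,000 times a cable connection. A different assumed cable speed or ingest rate would shift the last two results accordingly.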
Fortunately, information technology has come a long way since the invention of the Internet (in the 1960s) and the Web (in 1990). You can tell by all the empty space you see in the Computing Center's air-conditioned hive, in the heart of CERN's campus on the French-Swiss border.
Thousands of computers and robotically controlled tape drives are humming right along, processing simulated data to prepare for the real thing. But there’s still enough empty floor space available to start up a bowling alley.
"We have this oversized computer center because it's from the '70s, when computers were huge," Grey explained.
Even while information is coming in from the experiments for storage, Grey said, it will be going out via a 10-gigabit-per-second optical-fiber network to 11 Tier 1 computers around the globe. Like a multilevel-marketing pyramid, the network is structured so that Tier 1 computers feed the data to more than 50 Tier 2 computers in various regions. Those computers, in turn, distribute the data to the home institutions for all of the 7,000-plus collaborators in the LHC experiments.
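To picture the pyramid, here is a toy sketch of the fan-out in Python. The site names ("T1-00", "T2-13" and so on) are made up, and pairing each Tier 2 site with a Tier 1 site in round-robin order is purely an assumption for illustration. The article gives only the counts, 11 Tier 1 and more than 50 Tier 2, and says nothing about how the real sites are matched up.

```python
from collections import defaultdict


def build_tiers(n_tier1=11, n_tier2=50):
    """Build a toy two-level hierarchy below CERN.

    Tier 2 sites are assigned to Tier 1 sites round-robin. That pairing
    rule is an assumption; the real assignment is not described here.
    """
    tier1 = [f"T1-{i:02d}" for i in range(n_tier1)]
    children = defaultdict(list)
    for j in range(n_tier2):
        parent = tier1[j % n_tier1]
        children[parent].append(f"T2-{j:02d}")
    return tier1, dict(children)


def route(tier2_site, children):
    """Return the path data takes from CERN down to one Tier 2 site."""
    for parent, kids in children.items():
        if tier2_site in kids:
            return ["CERN", parent, tier2_site]
    raise KeyError(tier2_site)


tier1, children = build_tiers()
print(route("T2-13", children))
for site in tier1:
    print(site, "feeds", len(children[site]), "Tier 2 sites")
```

The point of the layered layout is that CERN only has to push data to 11 destinations. In this toy version, each Tier 1 site then serves four or five Tier 2 sites rather than CERN serving all 50 directly.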
"What's new here is that we're getting hundreds of organizations to share resources," Grey said. He added that the "human challenge is in a sense bigger than the technical challenge."
But Grey said the Grid is settling into place as CERN prepares to move from merely simulating the data load to sending out the real stuff. "The peak grumbling phase is over," he joked.
There are already moves afoot to expand Grid technology to other applications. The Open Science Grid and EGEE (Enabling Grids for E-sciencE) are among the first initiatives going beyond the LHC, Grey said. He foresees a day when climate modelers, genetics researchers, oil and gas prospectors and others who have to deal with large, dynamic data sets will get into Grids as well. "It'll be behind the scenes in Web services who use them," Grey said.
Many everyday Web users are already using different breeds of Grids, such as SETI@home, Einstein@Home or Stardust@home. CERN itself is joining the club by offering LHC@home as a public project.
Just as the Web addressed the challenges of the 1990s, interlocking Grids will be called upon to address the challenges of the 21st century. "It's not about information management now. It's data processing and storage," Grey said.
But when you consider Grids for the common computer user, you have to remember the potential downside. Like the Web, the Grid can be used for good or evil: One could easily imagine the rise of a malevolent Grid - in fact, the zombies may already be taking over.
Add your own thoughts about the rise of the Grid in the comments section below - and for more information about the Grid's past, present and future, just drop in at CERN's Grid Cafe. As for me, I spent the weekend dropping in on cafes and tourist sites in Paris, but I'm due to get back on the road on Tuesday.