Astronomy Overload: Scientists Shifting From Stargazing to Data Mining

SAN FRANCISCO - A tidal wave of data has begun crashing over astronomers' heads, and they'll have to up their game to avoid being swamped.

Astronomers have already shifted to a more passive role, said astronomer Joshua Bloom of the University of California, Berkeley. Ever since digital photography came on the astronomical scene a few decades ago, they've been spending less time gazing at the heavens and more time combing through databases, he added.

And that trend is only going to accelerate, as more advanced telescopes haul in ever-increasing mountains of data.

The future lies in training computers to recognize when a telescope has picked up something new and interesting, Bloom said during an Oct. 4 lecture here at the California Academy of Sciences.

"The data rates are going to preclude human involvement," he said. "There's just too much of the stuff."

Changing how we observe the sky

For hundreds of years, Bloom said, the most difficult and time-consuming part of astronomy was "discovery": finding new cosmic phenomena.

In the past, astronomers could only gather so much data. They looked through small, crude telescopes and scribbled notes, like Galileo. Or, like the skywatchers of the early 20th century, they pored over photographic plates that captured what telescopes saw.

But that began to change in the mid-1980s with the rise of digital photography. Astronomers gained the ability to gather and store huge piles of data. These piles continue to grow as telescopes get more advanced, more autonomous and more sensitive, Bloom said.

As a result, the role of the astronomer has changed. The bottleneck is shifting from "discovery" to data management. There is no shortage of intriguing, possibly new phenomena being discovered, but astronomers must find a way to follow up on each potentially interesting observation.

1.5 million new observations every night

As an example, Bloom discussed his own work with the Palomar Transient Factory, a project that's mapping the sky using a telescope at the Palomar Observatory in southern California.

Every night, Bloom said, this one telescope picks up 1.5 million candidate transients, fleeting astronomical phenomena in the sky. Ten thousand or so of these are bona fide objects, and about 10 turn out to be new. Finding a few needles in a giant haystack night after night is a relatively new challenge for astronomers, Bloom said.
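
To put the haystack in proportion, the nightly figures Bloom quoted can be turned into rough ratios. This is only a back-of-the-envelope sketch in Python; the variable names are illustrative, and the inputs are the approximate numbers given above rather than measured survey statistics.

```python
# Approximate nightly figures quoted by Bloom for the Palomar Transient Factory.
candidates_per_night = 1_500_000   # candidate transients flagged
real_objects_per_night = 10_000    # "ten thousand or so" bona fide objects
new_objects_per_night = 10         # "about 10" genuinely new objects

# Share of candidates that are real, and share that are new.
real_fraction = real_objects_per_night / candidates_per_night
new_fraction = new_objects_per_night / candidates_per_night

# How many candidates must be sifted, on average, per new object.
candidates_per_new = candidates_per_night // new_objects_per_night

print(f"Real objects: {real_fraction:.2%} of candidates")
print(f"New objects:  {new_fraction:.5%} of candidates")
print(f"Candidates per new object: {candidates_per_new:,}")
```

By these numbers, fewer than one candidate in a hundred is a real object, and roughly one in 150,000 is new.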

"How do we go through and just find new things?" he said. "This gets into very interesting realms of computer science and statistics."

Bloom and his colleagues have developed intelligent algorithms to do the job. Using a pool of 30 or so experts, they created baseline "good images" of actual transients to train the computers. The algorithms work off these baseline criteria as they crawl through the data pile every night.
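
The general idea of training on expert-labeled examples can be illustrated with a toy classifier. The sketch below is not the Palomar Transient Factory's actual algorithm, which the lecture did not describe in detail. It is a simple nearest-centroid classifier in Python, and the two features ("sharpness" and "symmetry"), the labels and every number in it are made up for illustration.

```python
from statistics import mean

# Hypothetical expert-labeled examples: (sharpness, symmetry) -> label.
# Both the features and the values are invented for this illustration.
labeled = [
    ((0.90, 0.80), "real"),
    ((0.80, 0.90), "real"),
    ((0.85, 0.75), "real"),
    ((0.20, 0.30), "bogus"),
    ((0.10, 0.40), "bogus"),
    ((0.30, 0.20), "bogus"),
]


def train(examples):
    """Return the mean feature vector (centroid) for each label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(axis) for axis in zip(*rows))
        for label, rows in by_label.items()
    }


def classify(centroids, features):
    """Assign the label whose centroid is nearest (squared Euclidean distance)."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: dist2(centroids[label]))


centroids = train(labeled)

# Two unlabeled candidates from a hypothetical night of observing.
tonight = [(0.88, 0.82), (0.15, 0.35)]
labels = [classify(centroids, f) for f in tonight]
print(labels)
```

A real pipeline would use many more features, far more training examples and a more capable model, but the workflow is the same: experts label a sample, the computer learns from it, and the learned rule is applied to each night's candidates.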

"This has allowed us to drill down into the data in a new way," Bloom said.

Other researchers are taking a similar machine-learning tack. Astronomers in the United Kingdom, for instance, developed algorithms that can classify galaxies as spiral or elliptical by using Galaxy Zoo judgments as a guide.

Galaxy Zoo is an online project that enlists the public to classify millions of galaxies imaged by the Hubble Space Telescope. So far, more than 250,000 people have taken part.

The future

Bloom said he and his team are confident in their algorithms' ability to handle the Palomar Transient Factory's nightly data deluge. But such machine-learning efforts are just a prelude to what will be necessary in the near future, when even more powerful instruments come online.

Bloom cited the Large Synoptic Survey Telescope, which should start scanning the sky from atop a Chilean mountain in 2018 or so.

The LSST will collect one gigabyte of data every two seconds, Bloom said. It has the potential to spot one million supernovae and 10 million asteroids every year, in addition to all sorts of other, more spectacular phenomena.
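
For a sense of scale, the quoted rate of one gigabyte every two seconds can be converted into hourly and nightly volumes. This is a rough Python sketch; the 10-hour observing night is an assumption made here for illustration, not a figure from Bloom's lecture, and decimal units (1 terabyte = 1,000 gigabytes) are used.

```python
# Rate quoted by Bloom: one gigabyte every two seconds.
GB_PER_INTERVAL = 1
INTERVAL_SECONDS = 2

# Assumption for illustration only: about 10 hours of observing per night.
HOURS_PER_NIGHT = 10

gb_per_second = GB_PER_INTERVAL / INTERVAL_SECONDS
gb_per_hour = gb_per_second * 3600
tb_per_night = gb_per_hour * HOURS_PER_NIGHT / 1000  # 1 TB = 1,000 GB

print(f"{gb_per_second} GB per second")
print(f"{gb_per_hour:,.0f} GB per hour")
print(f"{tb_per_night:.0f} TB per assumed {HOURS_PER_NIGHT}-hour night")
```

Under that assumption, a single night would yield on the order of 18 terabytes.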

For example, the instrument could pick up evidence of colliding neutron stars, which would likely give astronomers the first direct confirmation of the existence of ripples in space-time called gravitational waves, Bloom said. It could also help researchers better understand mysterious dark matter and dark energy.

The challenge will be for astronomers to sift through and follow up on the instrument's observations. Researchers are just now figuring out how they might be able to tackle such a job, according to Bloom.

"We're now in a very fun stage in this process," he said. "We're playing around with data, seeing which algorithms give us the best description of the data."

Over the long haul, universities and colleges need to train future astronomers more fully in computer science and data management, Bloom said. And in the short term, astronomers should team up with computer scientists. Bloom's team has opened a dialogue with Google, for example, in the hope of learning from the company's search expertise.

While computer scientists have migrated into other fields that have experienced data deluges recently, such as molecular biology, Bloom said that movement hasn't happened on a large scale in astronomy yet.

But he's confident that astronomers will gain the knowledge through training and collaboration to handle the huge streams of data that have already begun pouring down on them. And it's important to remember that having too much data is a good problem to have.

"It's a very good thing," Bloom said. "It's just new to astronomers. We're not used to it."