Essentially assumption driven AI is Changing Science
The approach is related to traditional simulation, but with critical differences. A simulation is “essentially assumption-driven,” Schawinski said. “The approach is to say, ‘I think I know what the underlying physical laws are that give rise to everything that I see in the system.’ So I have a recipe for star formation, I have a recipe for how dark matter behaves, and so on. I put all of my hypotheses in there, and I let the simulation run. And then I ask: Does that look like reality?” What he’s done with generative modeling, he said, is “in some sense, exactly the opposite of a simulation. We don’t know anything; we don’t want to assume anything. We want the data itself to tell us what might be going on.”
The apparent success of generative modeling in a study like this obviously doesn’t mean that astronomers and graduate students have been made redundant but it appears to represent a shift in the degree to which learning about astrophysical objects and processes can be achieved by an artificial system that has little more at its electronic fingertips than a vast pool of data. “It’s not fully automated science but it demonstrates that we’re capable of at least in part building the tools that make the process of science automatic,” Schawinski said.
Generative modeling is clearly powerful, but whether it truly represents a new approach to science is open to debate. For David Hogg, a cosmologist at New York University and the Flatiron Institute (which, like Quanta, is funded by the Simons Foundation), the technique is impressive but ultimately just a very sophisticated way of extracting patterns from data which is what astronomers have been doing for centuries. In other words, it’s an advanced form of observation plus analysis. Hogg’s own work, like Schawinski’s, leans heavily on AI; he’s been using neural networks to classify stars according to their spectra and to infer other physical attributes of stars using data-driven models. But he sees his work, as well as Schawinski’s, as tried-and-true science. “I don’t think it’s a third way,” he said recently. “I just think we as a community are becoming far more sophisticated about how we use the data. In particular, we are getting much better at comparing data to data. But in my view, my work is still squarely in the observational mode.”
Whether they’re conceptually novel or not, it’s clear that AI and neural networks have come to play a critical role in contemporary astronomy and physics research. At the Heidelberg Institute for Theoretical Studies, the physicist Kai Polsterer heads the astroinformatics group — a team of researchers focused on new, data-centered methods of doing astrophysics. Recently, they’ve been using a machine-learning algorithm to extract redshift information from galaxy data sets, a previously arduous task.
Polsterer sees these new AI-based systems as “hardworking assistants” that can comb through data for hours on end without getting bored or complaining about the working conditions. These systems can do all the tedious grunt work, he said, leaving you “to do the cool, interesting science on your own.”
But they’re not perfect. In particular, Polsterer cautions, the algorithms can only do what they’ve been trained to do. The system is “agnostic” regarding the input. Give it a galaxy, and the software can estimate its redshift and its age but feed that same system a selfie, or a picture of a rotting fish, and it will output a (very wrong) age for that, too. In the end, oversight by a human scientist remains essential, he said. “It comes back to you, the researcher. You’re the one in charge of doing the interpretation.”
For his part, Nord, at Fermilab, cautions that it’s crucial that neural networks deliver not only results, but also error bars to go along with them, as every undergraduate is trained to do. In science, if you make a measurement and don’t report an estimate of the associated error, no one will take the results seriously, he said.
Like many AI researchers, Nord is also concerned about the impenetrability of results produced by neural networks; often, a system delivers an answer without offering a clear picture of how that result was obtained.
Yet not everyone feels that a lack of transparency is necessarily a problem. Lenka Zdeborová, a researcher at the Institute of Theoretical Physics at CEA Saclay in France, points out that human intuitions are often equally impenetrable. You look at a photograph and instantly recognize a cat “but you don’t know how you know,” she said. “Your own brain is in some sense a black box.”
It’s not only astrophysicists and cosmologists who are migrating toward AI-fueled, data-driven science. Quantum physicists like Roger Melko of the Perimeter Institute for Theoretical Physics and the University of Waterloo in Ontario have used neural networks to solve some of the toughest and most important problems in that field, such as how to represent the mathematical “wave function” describing a many-particle system. AI is essential because of what Melko calls “the exponential curse of dimensionality.” That is, the possibilities for the form of a wave function grow exponentially with the number of particles in the system it describes. The difficulty is similar to trying to work out the best move in a game like chess or Go: You try to peer ahead to the next move, imagining what your opponent will play, and then choose the best response, but with each move, the number of possibilities proliferates.
Of course, AI systems have mastered both of these games chess, decades ago, and Go in 2016, when an AI system called AlphaGo defeated a top human player. They are similarly suited to problems in quantum physics, Melko says.
The Mind of the Machine
Whether Schawinski is right in claiming that he’s found a “third way” of doing science, or whether, as Hogg says, it’s merely traditional observation and data analysis “on steroids,” it’s clear AI is changing the flavor of scientific discovery, and it’s certainly accelerating it. How far will the AI revolution go in science?
Occasionally, grand claims are made regarding the achievements of a “robo-scientist.” A decade ago, an AI robot chemist named Adam investigated the genome of baker’s yeast and worked out which genes are responsible for making certain amino acids. (Adam did this by observing strains of yeast that had certain genes missing, and comparing the results to the behavior of strains that had the genes.) Wired’s headline read, “Robot Makes Scientific Discovery All by Itself.”
More recently, Lee Cronin, a chemist at the University of Glasgow, has been using a robot to randomly mix chemicals, to see what sorts of new compounds are formed. Monitoring the reactions in real-time with a mass spectrometer, a nuclear magnetic resonance machine, and an infrared spectrometer, the system eventually learned to predict which combinations would be the most reactive. Even if it doesn’t lead to further discoveries, Cronin has said, the robotic system could allow chemists to speed up their research by about 90 percent.
Last year, another team of scientists at ETH Zurich used neural networks to deduce physical laws from sets of data. Their system, a sort of robo-Kepler, rediscovered the heliocentric model of the solar system from records of the position of the sun and Mars in the sky, as seen from Earth, and figured out the law of conservation of momentum by observing colliding balls. Since physical laws can often be expressed in more than one way, the researchers wonder if the system might offer new ways perhaps simpler ways of thinking about known laws.
These are all examples of AI kick-starting the process of scientific discovery, though in every case, we can debate just how revolutionary the new approach is. Perhaps most controversial is the question of how much information can be gleaned from data alone a pressing question in the age of stupendously large (and growing) piles of it. In The Book of Why (2018), the computer scientist Judea Pearl and the science writer Dana Mackenzie assert that data are “profoundly dumb.” Questions about causality “can never be answered from data alone,” they write. “Anytime you see a paper or a study that analyzes the data in a model-free way, you can be certain that the output of the study will merely summarize, and perhaps transform, but not interpret the data.” Schawinski sympathizes with Pearl’s position, but he described the idea of working with “data alone” as “a bit of a straw man.” He’s never claimed to deduce cause and effect that way, he said. “I’m merely saying we can do more with data than we often conventionally do.”
Another oft-heard argument is that science requires creativity, and that at least so far we have no idea how to program that into a machine. (Simply trying everything, like Cronin’s robo-chemist, doesn’t seem especially creative.) “Coming up with a theory, with reasoning, I think demands creativity,” Polsterer said. “Every time you need creativity, you will need a human.” And where does creativity come from? Polsterer suspects it is related to boredom — something that, he says, a machine cannot experience. “To be creative, you have to dislike being bored. And I don’t think a computer will ever feel bored.” On the other hand, words like “creative” and “inspired” have often been used to describe programs like Deep Blue and AlphaGo. And the struggle to describe what goes on inside the “mind” of a machine is mirrored by the difficulty we have in probing our own thought processes.
Schawinski recently left academia for the private sector; he now runs a startup called Modulos which employs a number of ETH scientists and, according to its website, works “in the eye of the storm of developments in AI and machine learning.” Whatever obstacles may lie between current AI technology and full-fledged artificial minds, he and other experts feel that machines are poised to do more and more of the work of human scientists. Whether there is a limit remains to be seen.
“Will it be possible, in the foreseeable future, to build a machine that can discover physics or mathematics that the brightest humans alive are not able to do on their own, using biological hardware?” Schawinski wonders. “Will the future of science eventually necessarily be driven by machines that operate on a level that we can never reach? I don’t know. It’s a good question.”