The Secret Equation for ‘Weighing’ Galaxy Clusters Discovered by Artificial Intelligence

Astrophysicists from the Institute for Advanced Study, the Flatiron Institute, and their colleagues used artificial intelligence (AI) to discover a more accurate method of estimating the mass of massive clusters of galaxies. The AI found that adding a single simple term to an existing equation yields considerably better mass estimates than scientists previously had.

The improved estimates will allow scientists to determine the fundamental properties of the cosmos more precisely, the astrophysicists reported in the Proceedings of the National Academy of Sciences on March 17, 2023.

“It’s such a simple thing; that’s the beauty of this,” says study co-author Francisco Villaescusa-Navarro, a research scientist at the Flatiron Institute’s Center for Computational Astrophysics (CCA) in New York City. “Even though it’s so simple, nobody before found this term. People have been working on this for decades, and still they were not able to find this.”

The work was led by Digvijay Wadekar of the Institute for Advanced Study in Princeton, New Jersey, along with researchers from the CCA, Princeton University, Cornell University and the Center for Astrophysics | Harvard & Smithsonian.

Understanding the universe requires knowing where and how much stuff there is. Galaxy clusters are the most massive objects in the universe: A single cluster can contain hundreds to thousands of galaxies as well as plasma, hot gas, and dark matter. The gravity of the cluster holds these components together. Understanding such galaxy clusters is critical to determining the origin and evolution of the cosmos.

The total mass of a galaxy cluster is perhaps the most important quantity in defining its properties. However, determining this quantity is difficult because a galaxy cluster cannot be ‘weighed’ by placing it on a scale.

The problem is further complicated because the dark matter that makes up much of a cluster’s mass is invisible. Instead, scientists deduce the mass of a cluster from other observable quantities.

In the early 1970s, Rashid Sunyaev, current distinguished visiting professor at the Institute for Advanced Study’s School of Natural Sciences, and his collaborator Yakov B. Zel’dovich developed a new way to estimate galaxy cluster masses. Their method is based on the notion that as gravity squashes matter together, the electrons in that matter push back.

The electron pressure changes how electrons interact with light particles known as photons. When photons from the Big Bang’s afterglow collide with the squeezed material, the interaction produces new photons.

The properties of the photons are determined by how strongly gravity compresses the material, which is determined by the mass of the galaxy cluster. Astrophysicists can estimate the mass of the cluster by detecting the photons.

However, because the changes in photon properties vary depending on the galaxy cluster, this ‘integrated electron pressure’ is not a perfect proxy for mass. Wadekar and his colleagues thought an artificial intelligence tool called ‘symbolic regression’ might find a better approach.

The program essentially tries out many combinations of variables and mathematical operators, such as addition and subtraction, to find the equation that best matches the data.
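The idea of trying out combinations of operators and variables can be illustrated with a deliberately tiny sketch. It is not the researchers' program: the data are synthetic, the hidden law x * (1 - y) is made up for the illustration, and all names (OPERATORS, TERMINALS, search and so on) are hypothetical. Real symbolic regression tools search far larger spaces with smarter strategies and typically also reward simpler expressions.

```python
import itertools
import random

# Binary operators the toy search is allowed to combine (illustrative set).
OPERATORS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}

# Building blocks: the two input variables and the constant 1.
TERMINALS = {
    "x": lambda x, y: x,
    "y": lambda x, y: y,
    "1": lambda x, y: 1.0,
}

def make_toy_data(n=200, seed=0):
    """Synthetic rows (x, y, target) with the made-up law target = x * (1 - y)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(1.0, 10.0)
        y = rng.uniform(0.0, 0.5)
        rows.append((x, y, x * (1.0 - y)))
    return rows

def candidate_expressions():
    """Yield (label, function) for every expression of the form a op1 (b op2 c)."""
    names = list(TERMINALS)
    ops = list(OPERATORS)
    for a, b, c in itertools.product(names, repeat=3):
        for op1, op2 in itertools.product(ops, repeat=2):
            label = f"{a} {op1} ({b} {op2} {c})"
            # Default arguments freeze the current loop values for this candidate.
            def fn(x, y, a=a, b=b, c=c, op1=op1, op2=op2):
                inner = OPERATORS[op2](TERMINALS[b](x, y), TERMINALS[c](x, y))
                return OPERATORS[op1](TERMINALS[a](x, y), inner)
            yield label, fn

def mean_squared_error(fn, rows):
    """Average squared difference between a candidate and the target column."""
    return sum((fn(x, y) - t) ** 2 for x, y, t in rows) / len(rows)

def search(rows):
    """Return (label, function, error) for the best-fitting candidate."""
    best = None
    for label, fn in candidate_expressions():
        err = mean_squared_error(fn, rows)
        if best is None or err < best[2]:
            best = (label, fn, err)
    return best

rows = make_toy_data()
best_label, best_fn, best_err = search(rows)
print(best_label, best_err)
```

On this toy data the search recovers an expression algebraically equivalent to x * (1 - y). The exhaustive loop only works because the space is tiny (27 terminal triples times 9 operator pairs), which is why practical tools rely on guided search instead.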

Wadekar and his collaborators ‘fed’ their AI program a state-of-the-art universe simulation containing many galaxy clusters. Next, their program, written by CCA research fellow Miles Cranmer, searched for and identified additional variables that might make the mass estimates more accurate.

AI is useful for identifying new parameter combinations that human analysts might overlook. While human analysts can easily identify two significant parameters in a dataset, AI can better sift through large volumes of data, often revealing unexpected influencing factors.

“Right now, a lot of the machine-learning community focuses on deep neural networks,” Wadekar explained. “These are very powerful, but the drawback is that they are almost like a black box. We cannot understand what goes on in them. In physics, if something is giving good results, we want to know why it is doing so. Symbolic regression is beneficial because it searches a given dataset and generates simple mathematical expressions in the form of simple equations that you can understand. It provides an easily interpretable model.”

By adding a single new term to the old equation, the researchers’ symbolic regression program generated a new equation that better predicts the mass of a galaxy cluster.

Wadekar and his collaborators then worked backward from this AI-generated equation and found a physical explanation. They discovered that gas concentration correlates with the parts of galaxy clusters where mass inferences are less trustworthy, such as the centers of galaxies, which contain supermassive black holes. Their new equation improved mass inferences by downplaying the importance of those complex cores in the calculations.

In a sense, the galaxy cluster is like a spherical doughnut. The new equation removes the jelly at the doughnut’s center, which can introduce larger errors, and instead focuses on the doughy outskirts for more reliable mass inferences.
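In schematic form, the change can be written as follows. The notation is chosen here for illustration and is not quoted from this article: M is the cluster mass, Y the integrated electron pressure, c_gas a measure of how concentrated the gas is toward the core, and A a fitted constant. The first line is the standard self-similar power law; the second is a sketch of what one added term that downweights concentrated cores can look like. The exact form and coefficient should be taken from the published paper.

```latex
% Baseline two-parameter power law (self-similar expectation)
\[ M \propto Y^{3/5} \]

% Sketch with one added term; A and c_gas are assumed notation
\[ M \propto Y^{3/5} \left( 1 - A \, c_{\mathrm{gas}} \right) \]
```

With A positive, a cluster whose gas is strongly concentrated in its core receives a smaller weight, which matches the description of downplaying the complex cores.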

The researchers tested the AI-discovered equation on thousands of simulated universes from the CCA’s CAMELS suite. Compared with the currently used equation, the new one reduced the variability in galaxy cluster mass estimates by around 20 to 30 percent for large clusters.
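To make "reduced the variability" concrete, the sketch below shows one common way such a comparison can be set up: compute the scatter of ln(estimated mass / true mass) for each estimator and compare. Everything in it is an assumption for illustration. The mock clusters are random numbers rather than CAMELS data, the coefficient A is hypothetical, and the mock observable is built so that the added term helps by construction, so the printed numbers say nothing about the real result.

```python
import math
import random
import statistics

A = 0.4  # hypothetical coefficient of the added term, not the paper's value

def make_mock_clusters(n=2000, seed=1):
    """Return mock (true_mass, c_gas, y_signal) tuples in arbitrary units."""
    rng = random.Random(seed)
    clusters = []
    for _ in range(n):
        mass = 10 ** rng.uniform(14.0, 15.0)    # mock true mass
        c_gas = rng.uniform(0.1, 0.5)            # mock gas concentration
        noise = math.exp(rng.gauss(0.0, 0.05))   # scatter unrelated to c_gas
        # Construct the observable so concentrated clusters look "heavier".
        y_signal = (mass * noise / (1.0 - A * c_gas)) ** (5.0 / 3.0)
        clusters.append((mass, c_gas, y_signal))
    return clusters

def log_scatter(estimates, truths):
    """Standard deviation of ln(estimate / truth) across clusters."""
    ratios = [math.log(e / t) for e, t in zip(estimates, truths)]
    return statistics.stdev(ratios)

clusters = make_mock_clusters()
truths = [mass for mass, _, _ in clusters]

# Baseline: power law in the observable alone.
old_estimates = [y ** 0.6 for _, _, y in clusters]
# Sketch of the modified estimator with the concentration term.
new_estimates = [y ** 0.6 * (1.0 - A * c) for _, c, y in clusters]

old_scatter = log_scatter(old_estimates, truths)
new_scatter = log_scatter(new_estimates, truths)
reduction = 1.0 - new_scatter / old_scatter

print(f"old scatter: {old_scatter:.4f}")
print(f"new scatter: {new_scatter:.4f}")
print(f"fractional reduction: {reduction:.2%}")
```

Measuring scatter in the logarithm of the ratio keeps the comparison independent of the overall mass scale, which matters when clusters span a wide range of masses.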

The new equation will help observational astronomers taking part in upcoming galaxy cluster surveys better understand the mass of the objects they observe.

“There are quite a few surveys targeting galaxy clusters that are planned in the near future,” Wadekar noted. “Examples include the Simons Observatory, the Stage 4 CMB experiment and an X-ray survey called eROSITA. The new equations can help us in maximizing the scientific return from these surveys.”

Wadekar also hopes that this publication will be just the tip of the iceberg when it comes to using symbolic regression in astrophysics.

“We think that symbolic regression is highly applicable to answering many astrophysical questions,” he said. “In a lot of cases in astronomy, people make a linear fit between two parameters and ignore everything else. But nowadays, with these tools, you can go further. Symbolic regression and other artificial intelligence tools can help us go beyond existing two-parameter power laws in a variety of different ways, ranging from investigating small astrophysical systems like exoplanets, to galaxy clusters, the biggest things in the universe.”