How the Pursuit of Mathematical Accuracy and Intricate Models Might Result in Fruitless Scientific Hypotheses
The idea that the cosmos is structured by mathematical laws is a prevalent one in science. The scientist's role, on this view, is to decipher these mathematical relations, which can then be turned into mathematical models. Running the resulting "silicon reality" on a computer could then give us helpful information about how the world functions.

Models keep expanding as science uncovers new phenomena. To reflect the world around us more accurately, they incorporate new discoveries and newly described mechanisms. Many academics believe that because more detailed models are closer to reality, they produce sharper estimates and better predictions.

Yet, according to our recent study, published in Science Advances, they might have the opposite effect.

The assumption that “more detail is better” cuts across disciplinary fields. The ramifications are enormous. Universities purchase increasingly powerful computers because they want to run increasingly complex models, which call for greater processing power.

Recently, the European Commission invested €8bn (£6.9bn) to create a highly detailed simulation of the Earth (including humans), dubbed a "digital twin," hoping to better address current social and ecological challenges.

In our most recent study, we demonstrate that using ever-more sophisticated models as instruments to generate more precise estimates and predictions may not work. Drawing on statistical theory and mathematical experiments, we ran hundreds of thousands of models with various configurations and assessed the uncertainty in their estimates.

We discovered that more complex models tended to produce more uncertain estimates. This is because new parameters and mechanisms are added. Measuring a new parameter, such as the effect of chewing gum on the spread of a disease, inevitably involves measurement errors and uncertainty. Modelers may also use different equations to describe the same phenomenon mathematically.

These new inputs, and the uncertainty they carry, are stacked on top of the existing uncertainties once they are integrated into the model. So even as the model itself becomes more detailed, the uncertainties keep growing with each upgrade, making the model's output fuzzier along the way.
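The way uncertainties stack up can be sketched with a toy Monte Carlo experiment. (The multiplicative model, the parameter counts, and the 10% error level below are illustrative assumptions, not the setup of our study.)

```python
import random

def output_spread(n_params, n_samples=20_000, rel_err=0.1):
    """Standard deviation of a toy model output that multiplies
    n_params factors, each 'measured' with 10% relative error."""
    samples = []
    for _ in range(n_samples):
        y = 1.0
        for _ in range(n_params):
            # each added parameter brings its own measurement noise
            y *= random.gauss(1.0, rel_err)
        samples.append(y)
    mean = sum(samples) / n_samples
    var = sum((s - mean) ** 2 for s in samples) / n_samples
    return var ** 0.5

# The output's spread grows as uncertain parameters are added
for k in (1, 4, 16):
    print(f"{k} parameters -> spread ~ {output_spread(k):.2f}")
```

With one parameter the output's spread sits near the 10% measurement error; with 16 it is roughly four times larger, even though no individual input became less precise.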

This affects any model that lacks adequate training or validation data against which to check the accuracy of its output. That includes all models that project future effects, such as those for global climate change, hydrology (the flow of water), agricultural production, and epidemiology.

Fuzzy results

In 2009, engineers created an algorithm called Google Flu Trends for predicting the proportion of flu-related doctor visits across the US. Despite being based on 50 million queries that people had typed into Google, the model wasn’t able to predict the 2009 swine flu outbreak.

The engineers then made the model, which is no longer operating, even more complex. But it still wasn’t all that accurate. Research led by German psychologist Gerd Gigerenzer showed it consistently overestimated doctor visits in 2011–13, in some cases by more than 50%.

Gigerenzer discovered that a much simpler model could produce better results. His model forecast weekly flu rates from just one tiny piece of information: the number of patients who had seen a doctor the week before.
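A one-input "persistence" rule of this kind is trivial to implement. The weekly rates below are made-up numbers purely for illustration:

```python
def persistence_forecast(weekly_rates):
    """Forecast each week's flu-visit rate as simply the previous
    week's observed rate: a one-input model in the spirit of
    Gigerenzer's approach."""
    return [weekly_rates[i - 1] for i in range(1, len(weekly_rates))]

# Hypothetical weekly rates (% of doctor visits that were flu-related)
observed = [1.2, 1.5, 2.1, 2.0, 1.6]
print(persistence_forecast(observed))  # [1.2, 1.5, 2.1, 2.0]
```

The forecast covers weeks two onward, since the first week has no predecessor to copy from.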

Global hydrological models, which monitor the flow and storage of water, are another illustration. They began simply in the 1960s, based on "evapotranspiration processes" (the volume of water that may evaporate and transpire from a landscape covered in plants), but they quickly expanded to include worldwide water uses for domestic, industrial, and agricultural purposes. The next stage for these models is to represent the water demands of each square kilometer of Earth's surface, each hour.

And yet one wonders whether this extra detail will just make them even more convoluted. We have demonstrated that the estimates of irrigation water use produced by eight global hydrological models can be matched using just one parameter: the extent of the irrigated area.

Ways forward

Why has the fact that more detail can make a model worse been overlooked until now? Modelers rarely use uncertainty and sensitivity analysis, techniques that show how model uncertainties affect the final estimate. Many keep adding detail without determining which model components contribute most to the uncertainty of the output.
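A crude version of such an analysis can be sketched in a few lines: vary one input at a time while holding the others at their nominal value, and compare how much output variance each one drives. (The toy model and its coefficients below are assumptions for illustration; real analyses typically use variance-based Sobol indices computed with dedicated tools.)

```python
import random

def toy_model(a, b, c):
    """Hypothetical model whose output depends strongly on a, weakly on c."""
    return 10 * a + 2 * b + 0.1 * c

def one_at_a_time_variance(param_index, n=10_000):
    """Output variance when only one input varies (the others stay
    at their nominal value of 0)."""
    outs = []
    for _ in range(n):
        x = [0.0, 0.0, 0.0]
        x[param_index] = random.gauss(0.0, 1.0)  # unit-variance perturbation
        outs.append(toy_model(*x))
    m = sum(outs) / n
    return sum((o - m) ** 2 for o in outs) / n

for i, name in enumerate("abc"):
    print(f"parameter {name}: variance contribution ~ {one_at_a_time_variance(i):.2f}")
```

Even this rough screening reveals that refining the measurement of parameter c would barely sharpen the output, while parameter a dominates the uncertainty.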

This is concerning because modelers have incentives to build ever-larger models; indeed, entire careers are built on complicated models. That's because they are harder to falsify: their complexity intimidates outsiders and makes it harder to understand what is going on inside the model.

There are remedies, however. We advise caution: models should not be made bigger and bigger just for the sake of it. Even when scientists conduct a sensitivity and uncertainty analysis, their estimates risk being too uncertain to be useful for science or policy. It makes little sense to spend a lot of money on computational resources only to run models whose estimates are wholly ambiguous.

Instead, modelers should consider how uncertainty grows as more complexity is added to the model and determine the appropriate balance between model detail and estimation uncertainty.

This trade-off can be gauged using the notion of "effective dimensions," which we define in our research as a measure of the number of parameters that add to the output's uncertainty. The notion also accounts for how these parameters interact with one another.
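As a loose illustration of the idea (not the published definition, which treats parameter interactions rigorously), one could count the parameters whose share of output variance exceeds some threshold:

```python
def effective_dimensions(variance_shares, threshold=0.05):
    """Count parameters whose share of output variance exceeds a threshold.
    A toy proxy only: the metric in the paper also accounts for
    interactions between parameters, which this sketch ignores."""
    total = sum(variance_shares.values())
    return sum(1 for v in variance_shares.values() if v / total > threshold)

# Hypothetical variance shares for a five-parameter model
shares = {"a": 0.60, "b": 0.25, "c": 0.10, "d": 0.03, "e": 0.02}
print(effective_dimensions(shares))  # 3
```

Here only three of the five parameters meaningfully drive the output's uncertainty: the model behaves as if it had three dimensions, not five.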

By calculating a model's effective dimensions after each update, modelers can assess whether the increase in uncertainty still leaves the model viable for policy, or whether it renders the model's output so uncertain as to be useless.

This improves transparency and helps scientists build models that better serve science and society. Some modelers maintain that including greater model detail produces more precise estimates. The burden of proof now lies with them.