Cambridge researchers have outlined guidelines for making computational science more environmentally sustainable. Computational science drives discoveries that help us understand the human genome, fight cancer, and unlock the secrets of the universe, but it can also have a significant carbon footprint.
Researchers from the Department of Public Health and Primary Care at the University of Cambridge argue in a paper published in Nature Computational Science that the scientific community must take immediate action to prevent a potentially uncontrolled rise in the carbon footprint of computational science as data science and algorithms become more widely used.
Dr. Loïc Lannelongue, who is a research associate in biomedical data science and a postdoctoral associate at Jesus College, Cambridge, said: “Science has transformed our understanding of the world around us and has led to great benefits to society. But this has come with a not-insignificant and not always well understood impact on the environment.”
“As scientists, as with people working in every sector, it’s important that we do what we can to reduce the carbon footprint of our work to ensure that the benefits of our discoveries are not outweighed by their environmental costs.”
Recent studies have begun to examine the environmental effects of scientific research, initially concentrating on scientific meetings and experimental labs. For example, the 2019 Fall Meeting of the American Geophysical Union was estimated to emit 80,000 tonnes of CO2e (tCO2e), equivalent to the average weekly emissions of the city of Edinburgh, UK. The annual carbon footprint of a typical life science laboratory has been estimated to be around 20 tCO2e.
One aspect of research that is frequently overlooked, however, and that can have a significant environmental impact, is high performance and cloud computing.
In 2020, the Information and Communication Technologies sector was estimated to account for between 1.8% and 2.8% of global greenhouse gas emissions, more than aviation (1.9%). Beyond the environmental consequences of electricity use and of manufacturing and disposing of hardware, data centers’ water use and wider environmental impact are also concerns.
Professor Michael Inouye said: “While the environmental impact of experimental ‘wet’ labs is more immediately obvious, the impact of algorithms is less clear and often underestimated. While new hardware, lower-energy data centres and more efficient high performance computing systems can help reduce their impact, the increasing ubiquity of artificial intelligence and data science more generally means their carbon footprint could grow exponentially in coming years if we don’t act now.”
To help address this issue, the team has developed GREENER (Governance, Responsibility, Estimation, Energy and embodied impacts, New collaborations, Education and Research), a set of principles to allow the computational science community to lead the way in sustainable research practices, maximising computational science’s benefit to both humanity and the environment.
Governance and responsibility
Everyone working in computational science has a part to play in making the field more sustainable; individual and institutional responsibility is a necessary step towards transparency and lower greenhouse gas emissions.
Institutions themselves, for instance, can play a crucial role in the management and growth of centralized data infrastructures as well as in ensuring that decisions on hardware purchases consider both the manufacturing and operational footprint. IT teams in high performance computing (HPC) centres can play a key role, both in terms of training and helping scientists monitor the carbon footprint of their work.
Principal Investigators can provide access to appropriate training and encourage their teams to consider this issue. Funding organizations can influence researchers by requiring estimates of carbon footprints to be included in funding applications.
Estimate and report the energy usage of algorithms
Estimating and tracking the carbon footprint of computations makes it possible to find inefficiencies and areas for improvement.
Metrics at the user level are essential for understanding environmental effects and encouraging personal responsibility. In academia, especially, the financial cost of running computations is frequently low, giving scientists the impression that computing power is unlimited. By measuring each project’s carbon footprint, we can better understand the true costs associated with research.
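As a rough illustration of what a user-level estimate can look like, the sketch below multiplies the energy drawn by a job by the carbon intensity of the electricity that powers it. It is a simplification under stated assumptions, not the method used in the paper: the function name, the example power draw, and the data-centre overhead factor (power usage effectiveness, PUE) are hypothetical placeholders.

```python
def estimate_footprint_gco2e(runtime_hours, power_watts,
                             carbon_intensity_g_per_kwh, pue=1.0):
    """Rough carbon footprint of one compute job, in grams of CO2e.

    All inputs are assumptions supplied by the user:
      runtime_hours              wall-clock duration of the job
      power_watts                average power drawn by the hardware used
      carbon_intensity_g_per_kwh gCO2e emitted per kWh of electricity
      pue                        data-centre overhead factor (>= 1.0)
    """
    if runtime_hours < 0 or power_watts < 0 or carbon_intensity_g_per_kwh < 0:
        raise ValueError("inputs must be non-negative")
    if pue < 1.0:
        raise ValueError("pue must be at least 1.0")
    # Energy in kWh = hours * kilowatts, scaled by facility overhead.
    energy_kwh = runtime_hours * (power_watts / 1000.0) * pue
    return energy_kwh * carbon_intensity_g_per_kwh


# Hypothetical job: 10 hours at 200 W, PUE of 1.5, grid at 300 gCO2e/kWh.
example_g = estimate_footprint_gco2e(10, 200, 300, pue=1.5)
print(f"Estimated footprint: {example_g:.0f} gCO2e")
```

With these made-up inputs the job uses 3 kWh and the estimate comes to 900 gCO2e. Real estimates would need measured power draw and a carbon intensity figure for the actual location.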
Tackling energy and embodied impacts through new collaborations
One of the most effective ways to reduce greenhouse gas emissions quickly is to lower carbon intensity, that is, the carbon footprint of producing electricity. This can mean relocating computations to settings and countries with low-carbon electricity, but it must be done with equity in mind.
Carbon intensities can differ by more than three orders of magnitude between the top and bottom performing high-income countries (from 0.10 gCO2e/kWh in Iceland to 770 gCO2e/kWh in Australia).
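To make the scale of that gap concrete, the sketch below applies the two carbon intensities quoted above to the same job. The 100 kWh energy figure and the function name are hypothetical, chosen only for illustration.

```python
# Carbon intensities quoted in the text, in gCO2e per kWh.
CARBON_INTENSITY = {"Iceland": 0.10, "Australia": 770.0}


def footprint_by_location(energy_kwh, intensities=CARBON_INTENSITY):
    """Return {location: gCO2e} for a job that uses energy_kwh of electricity."""
    return {place: energy_kwh * ci for place, ci in intensities.items()}


# Hypothetical job that draws 100 kWh in total.
footprints = footprint_by_location(100.0)
for place, grams in footprints.items():
    print(f"{place}: {grams / 1000:.2f} kg CO2e")

# Ratio between the two locations for an identical workload.
ratio = footprints["Australia"] / footprints["Iceland"]
print(f"Ratio: about {ratio:.0f}x")
```

For this hypothetical workload the estimate is roughly 10 g CO2e at the lower intensity and 77 kg CO2e at the higher one, a ratio of about 7,700. The comparison ignores embodied emissions and data transfer, which a fuller analysis would need to include.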
The footprint of user devices is also a factor: one estimate found that almost three-quarters (72%) of the energy footprint of streaming a video to a laptop is from the laptop, with 23% used in transmission and a mere 5% at the data centre.
Another key consideration is data storage. Although there are many variables that affect the carbon footprint of data storage, the life cycle footprint of storing one terabyte of data for a year is in the range of 10 kg CO2e.
The problem is exacerbated when datasets are duplicated so that each institution, and occasionally each research group, has its own copy. Large (hyperscale) data centres are expected to be more energy efficient, but they may also encourage unnecessary increases in the scale of computing (the ‘rebound effect’).
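The effect of duplication on storage footprints can be sketched with simple arithmetic. The example below uses the order-of-magnitude figure of 10 kg CO2e per terabyte per year from the text as its default; the dataset size, retention period, and number of copies are hypothetical, and the function name is illustrative only.

```python
def storage_footprint_kg(terabytes, years, copies=1, kg_per_tb_year=10.0):
    """Rough life cycle footprint of stored data, in kg CO2e.

    kg_per_tb_year defaults to the order-of-magnitude figure quoted in
    the text; the true value varies with hardware and location.
    """
    if terabytes < 0 or years < 0 or copies < 1:
        raise ValueError("terabytes and years must be >= 0, copies >= 1")
    return terabytes * years * copies * kg_per_tb_year


# Hypothetical 50 TB dataset kept for 3 years.
single_copy = storage_footprint_kg(50, 3)
four_copies = storage_footprint_kg(50, 3, copies=4)
print(f"One copy:    {single_copy:.0f} kg CO2e")
print(f"Four copies: {four_copies:.0f} kg CO2e")
```

Under these assumptions a single copy comes to about 1,500 kg CO2e, and four copies held by separate groups come to about 6,000 kg CO2e, so sharing one copy would avoid three-quarters of the storage footprint.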
Education and research
Education is essential for raising awareness of these issues among stakeholders. A concrete first step toward lowering carbon footprints is to integrate sustainability into computational training programs. Funders and institutions also have a critical role to play by investing in research that will spark innovation in environmentally sustainable computational science.
Recent studies have found that the programming languages most widely used in research, such as R and Python, tend to be the least energy efficient. This underlines the value of having trained Research Software Engineers within research groups to make sure that algorithms are implemented efficiently. There is also scope to use existing tools more efficiently by better understanding and monitoring how coding choices affect carbon footprints.
Dr. Lannelongue said: “Computational scientists have a real opportunity to lead the way in sustainability, but this is going to involve a change in our culture and the ways we work. There will need to be more transparency, more awareness, better training and resources, and improved policies.”
“Cooperation, open science, and equitable access to low-carbon computing facilities will also be crucial. We need to make sure that sustainable solutions work for everyone, as they frequently have the least benefit for populations, often in low and middle-income countries, who suffer the most from climate change.”
Professor Inouye added: “Everyone in the field, from funders to journals to institutions down to individuals, plays an important role and can, themselves, make a positive impact. We have an immense opportunity to make a change, but the clock is ticking.”
The research was a collaboration with major stakeholders including Health Data Research UK, EMBL-EBI, Wellcome and UK Research and Innovation (UKRI).