Carbon Footprint Quantification

The increasing supply of large datasets and machine-learning models has pushed computational demand beyond Moore's law [1,2]. The success of deep-learning models in image segmentation, classification, and natural language processing (NLP) has enabled novel neuroimaging applications in image processing, clinical diagnosis, and prognosis. So far, research effort in this area has mainly focused on achieving state-of-the-art task accuracy via Monte-Carlo sampling of bigger and more complex model architectures. With the popularization of deep-learning approaches, it is important to account for the compute costs and the consequent environmental impact of such a model-selection strategy [3,4]. The carbon footprint of training a single large AI model has been estimated at 284,000 kg (626,000 pounds) of CO2 (~5x the lifetime emissions of a car, or ~300x round-trip flights for a single passenger between NYC and SF) [2,4]. Moreover, the energy consumption of both deep-learning-based and existing neuroimaging pipelines during deployment needs to be estimated in order to minimize the carbon emissions resulting from processing big datasets, such as UK Biobank.

This working group aims to:


Benchmarks (Preliminary)

Dataset and model sizes over the years

Neuroimaging dataset sizes compiled from review articles by Madan 2021 and Thompson et al. 2020.

Deep-learning model architectures. Compute costs depend on the number of parameters and the floating-point operations (FLOPs). FLOPs are calculated using this PyTorch flop-counter.
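For simple architectures, FLOPs and parameter counts can also be approximated by hand as a sanity check against automated counters. The sketch below is illustrative (the layer sizes are hypothetical, not taken from any model discussed here) and uses the common convention that a dense layer performs one multiply and one add per weight:

```python
# Rough FLOP and parameter estimate for a simple MLP, as a sanity check
# against automated flop-counters. A dense layer with n_in inputs and
# n_out outputs performs ~2 * n_in * n_out FLOPs per sample (one multiply
# and one add per weight) and holds n_in * n_out + n_out parameters.

def dense_layer_cost(n_in: int, n_out: int) -> tuple:
    """Return (flops, params) for one fully connected layer."""
    flops = 2 * n_in * n_out
    params = n_in * n_out + n_out  # weights + biases
    return flops, params

def mlp_cost(layer_sizes: list) -> tuple:
    """Accumulate FLOPs and parameters over consecutive dense layers."""
    total_flops, total_params = 0, 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        f, p = dense_layer_cost(n_in, n_out)
        total_flops += f
        total_params += p
    return total_flops, total_params

# Hypothetical 3-layer network: 256 -> 128 -> 64 -> 10
flops, params = mlp_cost([256, 128, 64, 10])
print(flops, params)
```

Conventions differ between tools (some count a multiply-add as one FLOP, halving the totals), so the convention should be stated alongside any reported numbers.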

[Figure: Dataset sizes over the years]
[Figure: Model sizes over the years]

A conservative estimate of pipeline usage based on citations. The first pipeline is FreeSurfer, one of the most commonly used pipelines for structural MR imaging analysis. The second is FastSurfer, a recent deep-learning alternative for FreeSurfer tasks.

FreeSurfer citation counts are based on the Dale et al. and Fischl et al. papers in Scopus. The ML citation count includes only MR neuroimaging studies in Ovid MEDLINE.

[Figure: FreeSurfer and ML citation counts over the years]

Classification of compute costs

Several user-specific and infrastructure-specific factors contribute to the carbon footprint of neuroimaging pipelines.

[Figure: Classification of compute costs]

Compute costs of FreeSurfer vs FastSurfer

Image processing tasks part of FreeSurfer and FastSurfer pipelines:

[Figure: FreeSurfer and FastSurfer image processing tasks]

Compute cost metrics

  1. Runtime
  2. Power draw
  3. Carbon emissions

Compute cost tracker: experiment-impact-tracker
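The three metrics are linked by a simple relationship: energy is average power draw integrated over runtime, and emissions are energy multiplied by the carbon intensity of the local electricity grid. A minimal sketch, with an illustrative intensity value (carbon intensity varies widely, from roughly ~20-30 gCO2/kWh on hydro-dominated grids to several hundred on coal-heavy ones):

```python
# Convert runtime and average power draw into energy and carbon emissions.
# The carbon-intensity figure below is an assumption for illustration,
# not a measured value; trackers such as experiment-impact-tracker look
# up region-specific intensities.

def energy_wh(avg_power_w: float, runtime_hrs: float) -> float:
    """Energy consumed in watt-hours."""
    return avg_power_w * runtime_hrs

def carbon_g(energy_watt_hrs: float, intensity_g_per_kwh: float) -> float:
    """Carbon emissions in grams of CO2 for a given grid carbon intensity."""
    return energy_watt_hrs / 1000.0 * intensity_g_per_kwh

# Example: an 8.3-hour CPU run averaging ~13 W of attributable power
e = energy_wh(13.0, 8.3)      # watt-hours
g = carbon_g(e, 30.0)         # grams CO2 on an assumed ~30 gCO2/kWh grid
print(round(e, 1), round(g, 2))
```

Because carbon intensity is grid-dependent, identical runs can differ several-fold in emissions depending on where they are executed.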

Note: The values in the table are for processing a single T1w MRI scan. A typical inference/deployment pipeline may incur over 10k such runs for a large dataset, and a model training/development pipeline may incur over 1M runs.

Pipeline (single run)   Runtime (hrs)            Power (W-hrs)             Carbon emissions (grams)
                        CPU         GPU          CPU           GPU         CPU          GPU
FreeSurfer              8.3 (1.03)  N/A          108.5 (19.8)  N/A         3.26 (0.5)   N/A
FastSurfer              9.8 (0.74)  1.6 (0.47)   126.4 (16.1)  26.7 (7.7)  3.79 (0.5)   0.80 (0.2)
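Per the note above, these single-run costs scale linearly with the number of scans processed. A small sketch of that arithmetic, using the per-run emissions from the table and an illustrative run count (the 10,000-scan deployment is a hypothetical, not a measured workload):

```python
# Scale single-run emissions (grams CO2, from the table above) to a
# dataset-level deployment. The run count is illustrative.

SINGLE_RUN_G = {
    "FreeSurfer (CPU)": 3.26,
    "FastSurfer (GPU)": 0.80,
}

def dataset_emissions_kg(per_run_g: float, n_runs: int) -> float:
    """Total emissions in kilograms for n_runs single-scan runs."""
    return per_run_g * n_runs / 1000.0

# Hypothetical 10,000-scan deployment
for name, grams in SINGLE_RUN_G.items():
    print(name, dataset_emissions_kg(grams, 10_000), "kg")
```

Even at this modest scale, the CPU-only pipeline emits roughly 4x more than the GPU alternative, which is why per-run measurements matter before large-batch processing.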

TL;DR


References:

  1. Amodei D, Hernandez D, Sastry G, Clark J, Brockman G, Sutskever I. AI and Compute. Published May 16, 2018.
  2. Hao K. Training a single AI model can emit as much carbon as five cars in their lifetimes. MIT Technology Review. 2019.
  3. Schwartz R, Dodge J, Smith NA, Etzioni O. Green AI. arXiv [cs.CY]. Published online July 22, 2019.
  4. Strubell E, Ganesh A, McCallum A. Energy and policy considerations for deep learning in NLP. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics; 2019. p. 3645-3650.

Resources

Quantifying the carbon footprint of neuroimaging pipelines is one of the key objectives of our working group. Some of these tools could be useful in developing a complete solution.

Further information

For details on benchmarking experiments contact: Nikhil Bhagwat