ArXiv TLDR

Measurement of Generative AI Workload Power Profiles for Whole-Facility Data Center Infrastructure Planning

arXiv:2604.07345

Roberto Vercellino, Jared Willard, Gustavo Campos, Weslley da Silva Pereira, Olivia Hull + 2 more

eess.SY, cs.DC, cs.LG

TLDR

This paper measures high-resolution power profiles of generative AI workloads and scales them to whole-facility data center energy demands for infrastructure planning.

Key contributions

  • Measures power at 0.1-second resolution for AI training, fine-tuning, and inference jobs on NVIDIA H100 GPUs.
  • Uses MLCommons benchmarks (training and fine-tuning) and vLLM benchmarks (inference) for standardized, reproducible workload profiling.
  • Presents a methodology that scales workload power to whole-facility energy demand with a bottom-up, event-driven data center energy model.
  • Makes the dataset of measured AI workload power consumption profiles publicly available.
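
To make the 0.1-second resolution concrete, here is a minimal sketch of how a power trace sampled at that interval could be turned into energy, peak, and mean power. The function names and the sample trace are hypothetical and are not taken from the paper or its dataset; the only assumption carried over from the source is the 0.1 s sampling step.

```python
def energy_wh(power_w, dt_s=0.1):
    """Integrate evenly spaced power samples (W) into energy (Wh).

    Uses the trapezoidal rule; dt_s=0.1 mirrors the 0.1 s resolution
    mentioned in the summary. Hypothetical helper, not the paper's code.
    """
    if len(power_w) < 2:
        return 0.0
    joules = sum((a + b) / 2.0 * dt_s for a, b in zip(power_w, power_w[1:]))
    return joules / 3600.0  # 1 Wh = 3600 J


def peak_and_mean(power_w):
    """Return (peak, mean) power in W for a non-empty trace."""
    return max(power_w), sum(power_w) / len(power_w)


# Illustrative trace only: one second of a GPU drawing a constant 100 W.
trace = [100.0] * 11
print(energy_wh(trace), peak_and_mean(trace))
```

A constant trace is used so the result is easy to check by hand: 100 W over 1 s is 100 J.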

Why it matters

Planning data centers for generative AI is difficult because existing power consumption data is largely proprietary and reported at varying resolutions. This work pairs a public dataset of 0.1-second workload power profiles with a methodology for scaling them to whole-facility energy demand. The resulting profiles can inform infrastructure planning for grid connection, on-site energy generation, and distributed microgrids.

Original Abstract

The rapid growth of generative artificial intelligence (AI) has introduced unprecedented computational demands, driving significant increases in the energy footprint of data centers. However, existing power consumption data is largely proprietary and reported at varying resolutions, creating challenges for estimating whole-facility energy use and planning infrastructure. In this work, we present a methodology that bridges this gap by linking high-resolution workload power measurements to whole-facility energy demand. Using NLR's high-performance computing data center equipped with NVIDIA H100 GPUs, we measure power consumption of AI workloads at 0.1-second resolution for AI training, fine-tuning and inference jobs. Workloads are characterized using MLCommons benchmarks for model training and fine-tuning, and vLLM benchmarks for inference, enabling reproducible and standardized workload profiling. The dataset of power consumption profiles is made publicly available. These power profiles are then scaled to the whole-facility-level using a bottom-up, event-driven, data center energy model. The resulting whole-facility energy profiles capture realistic temporal fluctuations driven by AI workloads and user-behavior, and can be used to inform infrastructure planning for grid connection, on-site energy generation, and distributed microgrids.
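
The abstract does not spell out the facility model, so the following is only a toy illustration of the bottom-up idea: job start events place per-GPU power profiles on a shared timeline, unused GPUs draw idle power, and a constant overhead factor stands in for cooling and power-delivery losses. Every name and number here (`Job`, `facility_profile`, `idle_w`, `overhead`) is a hypothetical placeholder, not the authors' model or a measured value.

```python
from dataclasses import dataclass


@dataclass
class Job:
    start_step: int   # time step at which the job starts (assumed >= 0)
    n_gpus: int       # number of GPUs the job occupies
    profile_w: list   # per-GPU power (W) for each step the job runs


def facility_profile(jobs, n_steps, total_gpus, idle_w=100.0, overhead=1.3):
    """Sum job profiles into a facility-level power series (W).

    idle_w and overhead are placeholder assumptions for illustration.
    """
    busy = [0] * n_steps      # GPUs in use at each step
    it_w = [0.0] * n_steps    # power drawn by running jobs at each step
    for job in jobs:
        for k, p in enumerate(job.profile_w):
            t = job.start_step + k
            if t >= n_steps:
                break         # job runs past the simulated horizon
            busy[t] += job.n_gpus
            it_w[t] += job.n_gpus * p
    out = []
    for t in range(n_steps):
        if busy[t] > total_gpus:
            raise ValueError(f"step {t}: more GPUs requested than available")
        idle = (total_gpus - busy[t]) * idle_w
        out.append((it_w[t] + idle) * overhead)
    return out


# One 2-GPU job starting at step 1 on a hypothetical 4-GPU facility.
jobs = [Job(start_step=1, n_gpus=2, profile_w=[500.0, 700.0])]
print(facility_profile(jobs, n_steps=4, total_gpus=4, overhead=1.0))
```

In this sketch the temporal fluctuations come entirely from when jobs start and how their profiles vary; a more realistic model would replace the constant overhead factor with load-dependent cooling and electrical losses.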