Towards Affordable Energy: A Gymnasium Environment for Electric Utility Demand-Response Programs

May 12, 2026 | arXiv:2605.12462

Jose E. Aguilar Escamilla, Lingdong Zhou, Xiangqi Zhu, Huazheng Wang

cs.AI, cs.CY, cs.GT, cs.LG

TLDR

DR-Gym is a new Gymnasium environment for training RL agents to optimize electric utility demand-response programs, improving grid flexibility and affordability.

Key contributions

Introduces DR-Gym, an open-source, online Gymnasium environment for electric utility demand-response.
Takes the market-level perspective of the electric utility, unlike existing device-level energy simulators.
Features a regime-switching wholesale price model calibrated to real-world extreme events.
Includes physics-based building demand profiles and a configurable multi-objective reward function.
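
To make the idea of a regime-switching price model concrete, here is a minimal sketch of a two-regime Markov-switching price path. It is not the paper's calibrated model: the function name `simulate_prices`, the two-regime structure, the Gaussian draws, and every parameter value (transition probabilities, means, spreads, price cap) are illustrative assumptions.

```python
import random


def simulate_prices(n_steps, seed=0, p_enter_spike=0.02, p_exit_spike=0.3,
                    normal_mean=40.0, normal_sd=8.0,
                    spike_mean=2000.0, spike_sd=600.0, price_cap=9000.0):
    """Simulate a wholesale price path in $/MWh with two regimes.

    All parameter values are hypothetical placeholders, not calibrated ones.
    Returns (prices, regimes), where regimes[i] is 1 during a spike, else 0.
    """
    rng = random.Random(seed)
    in_spike = False
    prices, regimes = [], []
    for _ in range(n_steps):
        # Markov transition between the normal and the spike regime.
        if in_spike:
            if rng.random() < p_exit_spike:
                in_spike = False
        elif rng.random() < p_enter_spike:
            in_spike = True
        # Draw a price from the active regime, clipped to [0, price_cap].
        mean, sd = (spike_mean, spike_sd) if in_spike else (normal_mean, normal_sd)
        price = min(max(rng.gauss(mean, sd), 0.0), price_cap)
        prices.append(price)
        regimes.append(int(in_spike))
    return prices, regimes


if __name__ == "__main__":
    prices, regimes = simulate_prices(24 * 7, seed=42)
    print(f"spike hours: {sum(regimes)}, max price: {max(prices):.0f} $/MWh")
```

Because spikes persist for several steps once entered, a policy sees clustered high-price periods rather than isolated outliers, which is the property that makes issuing credits a sequential decision.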

Why it matters

Demand response is crucial for grid flexibility and energy affordability, but it is hard to optimize with RL because offline historical data cannot capture the feedback loop between a utility's pricing signals and how customers accept and adapt to the program. DR-Gym gives utilities an online simulation environment in which to train and evaluate RL agents against that feedback loop. Better demand-response policies can mean more stable grids and lower financial risk for residential consumers.

Original Abstract

Extreme weather and volatile wholesale electricity markets expose residential consumers to catastrophic financial risks, yet demand response at the distribution level remains an underutilized tool for grid flexibility and energy affordability. While a demand-response program can shield consumers by issuing financial credits during high-price periods, optimizing this sequential decision-making process presents a unique challenge for reinforcement learning despite the plentiful offline historical smart meter and wholesale pricing data available publicly. Offline historical data fails to capture the dynamic, interactive feedback loop between an electric utility's pricing signals and customer acceptance and adaptation to a demand-response program. To address this, we introduce DR-Gym, an open-source, online Gymnasium-compatible environment designed to train and evaluate demand-response from the electric utility's perspective. Unlike existing device-level energy simulators, our environment focuses on the market-level electric utility setting and provides a rich observational space relevant to the electric utility. The simulator additionally features a regime-switching wholesale price model calibrated to real-world extreme events, alongside physics-based building demand profiles. For our learning signal, we use a configurable, multi-objective reward function for specifying diverse learning objectives. We demonstrate through baseline strategies and data snapshots the capability of our simulator to create realistic and learnable environments.
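
The interaction loop described in the abstract can be sketched as a toy environment. This is not DR-Gym's actual API: the class `ToyDemandResponseEnv`, the `RewardWeights` fields, the price process, the sinusoidal load, and the saturating customer-response curve are all hypothetical stand-ins. The class only mirrors Gymnasium's `reset`/`step` return convention and does not import the library, so it runs standalone.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class RewardWeights:
    """Hypothetical weights for the three cost terms in the reward."""
    procurement: float = 1.0   # utility's wholesale purchase cost
    payout: float = 1.0        # credits paid to customers
    bill: float = 0.5          # customers' net bill (affordability)


class ToyDemandResponseEnv:
    """Toy utility-level loop; an illustration, not DR-Gym's actual API."""

    def __init__(self, horizon=24, weights=None, retail_rate=0.12, seed=0):
        self.horizon = horizon
        self.weights = weights or RewardWeights()
        self.retail_rate = retail_rate  # $/kWh, assumed flat
        self.rng = random.Random(seed)
        self.t = 0
        self.price = 0.04

    def _draw_price(self):
        # Placeholder price process: rare scarcity price, otherwise cheap ($/kWh).
        return 3.0 if self.rng.random() < 0.05 else 0.04

    def _baseline_kwh(self):
        # Placeholder aggregate load with an evening peak near hour 18.
        return 1000.0 * (1.0 + 0.5 * math.sin(2 * math.pi * (self.t - 12) / 24))

    def _obs(self):
        return (self.t % 24, self.price, self._baseline_kwh())

    def reset(self, *, seed=None, options=None):
        if seed is not None:
            self.rng = random.Random(seed)
        self.t = 0
        self.price = self._draw_price()
        return self._obs(), {}

    def step(self, action):
        credit = max(0.0, float(action))  # $ per kWh of reduced load
        baseline = self._baseline_kwh()
        # Assumed customer response: saturating, at most 30% load reduction.
        reduced = baseline * 0.3 * (1.0 - math.exp(-credit / 0.5))
        demand = baseline - reduced
        procurement = self.price * demand
        payout = credit * reduced
        bill = self.retail_rate * demand - payout
        w = self.weights
        # Multi-objective reward: negative weighted sum of the cost terms.
        reward = -(w.procurement * procurement + w.payout * payout + w.bill * bill)
        info = {"demand_kwh": demand, "payout": payout, "bill": bill}
        self.t += 1
        self.price = self._draw_price()
        truncated = self.t >= self.horizon
        return self._obs(), reward, False, truncated, info


if __name__ == "__main__":
    env = ToyDemandResponseEnv(seed=7)
    obs, _ = env.reset()
    total, done = 0.0, False
    while not done:
        # Naive baseline: offer a credit only when the price is in scarcity.
        action = 1.0 if obs[1] > 1.0 else 0.0
        obs, reward, terminated, truncated, _ = env.step(action)
        total += reward
        done = terminated or truncated
    print(f"episode return: {total:.2f}")
```

Changing the `RewardWeights` values shifts the objective between the utility's costs and customer affordability, which is one simple way to read "configurable, multi-objective reward" under these assumptions.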

