A self-evolving agent for explainable diagnosis of DFT-experiment band-gap mismatch

April 29, 2026 | arXiv:2604.26703

cond-mat.mtrl-sci, cs.AI, physics.comp-ph

TLDR

XDFT is a self-evolving, closed-loop agent that diagnoses cases where standard DFT predicts a metal but experiments report a semiconductor. For each case it resolves, it returns a corrected calculation protocol and a mechanistic explanation.

Key contributions

Introduces XDFT, an agent for automated diagnosis of DFT band-gap mismatches.
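Draws candidate hypotheses from a curated catalogue, runs the corresponding first-principles tests, and updates a global Bayesian posterior over hypothesis usefulness from each verdict.
Resolves 70 of 90 mismatch cases (78%) on a verified 124-material benchmark, compared with 19% for a uniform-random baseline and 20% for a static LLM ordering.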
Provides corrected DFT protocols and mechanistic explanations for diagnosed materials.

Why it matters

Standard DFT often predicts metallic behaviour for correlated or structurally complex compounds that experiments report as semiconductors. Tracing each mismatch to its cause (magnetic ordering, electron correlation, an alternative polymorph, or a defect) has so far been a manual exercise; XDFT automates it. Each resolved case comes with an explanation and a corrected protocol, and unresolved cases are flagged as targets for experimental re-examination.

Original Abstract

Standard density functional theory (DFT) routinely misclassifies the electronic ground state of correlated and structurally complex compounds, predicting metallic behaviour for materials that experiments report as semiconductors. Each such mismatch encodes a specific non-ideality -- magnetic ordering, electron correlation, an alternative polymorph, or a defect -- that the calculation excluded, but extracting that signal at scale has remained a manual exercise. Here we introduce XDFT, a closed-loop agent that diagnoses the mismatch automatically: it draws candidate hypotheses from a curated catalogue, executes the corresponding first-principles tests, and updates a global Bayesian posterior over hypothesis usefulness from each verdict. On a verified benchmark of 124 materials, XDFT identifies a resolving mechanism for 70 of 90 mismatch cases (78%), an order of magnitude above a uniform-random baseline (19%) and a static LLM ordering (20%). The internal posterior aligns with empirical performance over the benchmark timeline, and resolved cases collapse into a tri-partite element-class taxonomy that we distil into a four-line static rule. Each diagnosed material is returned with a corrected protocol and a mechanistic attribution; failed cases are flagged as evidence-backed targets for experimental re-examination.
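
The closed loop described in the abstract (pick a hypothesis, run its test, update a posterior from the verdict) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, not the paper's method: the abstract does not state the form of the posterior or the selection rule, so the sketch uses a Beta-Bernoulli posterior per hypothesis and Thompson sampling to order the tests. The names `BetaPosterior`, `diagnose`, and `run_test` are hypothetical, and `run_test` is a toy lookup standing in for a real first-principles calculation. The four catalogue entries are the non-idealities named in the abstract.

```python
import random
from dataclasses import dataclass


@dataclass
class BetaPosterior:
    """Beta(alpha, beta) belief that a hypothesis resolves a mismatch (assumed form)."""
    alpha: float = 1.0  # 1 + number of resolving verdicts
    beta: float = 1.0   # 1 + number of non-resolving verdicts

    def update(self, resolved: bool) -> None:
        if resolved:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


def diagnose(material, catalogue, posteriors, run_test, rng, budget=None):
    """Test hypotheses in Thompson-sampled order until one resolves the mismatch.

    Returns the resolving hypothesis, or None if the budget or catalogue is exhausted.
    The shared `posteriors` dict is updated in place after every verdict.
    """
    remaining = list(catalogue)
    tests_run = 0
    while remaining and (budget is None or tests_run < budget):
        # Draw one sample per remaining hypothesis and test the most promising draw.
        pick = max(
            remaining,
            key=lambda h: rng.betavariate(posteriors[h].alpha, posteriors[h].beta),
        )
        remaining.remove(pick)
        resolved = run_test(material, pick)
        posteriors[pick].update(resolved)
        tests_run += 1
        if resolved:
            return pick
    return None


# Toy stand-ins: a hypothetical ground truth replaces the real DFT tests.
catalogue = ["magnetic_ordering", "electron_correlation", "alternative_polymorph", "defect"]
truth = {"mat_A": "electron_correlation", "mat_B": "magnetic_ordering", "mat_C": None}


def run_test(material, hypothesis):
    return truth[material] == hypothesis


posteriors = {h: BetaPosterior() for h in catalogue}
rng = random.Random(0)
results = {m: diagnose(m, catalogue, posteriors, run_test, rng) for m in truth}
```

Because the posterior is shared across materials, hypotheses that resolved earlier cases are more likely to be tried first on later ones, which is the sense in which such a loop improves its own ordering over time. A material with no resolving hypothesis (`mat_C` in the toy data) returns `None`, mirroring the flagged failed cases.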



