ArXiv TLDR

Homogeneous Stellar Parameters from Heterogeneous Spectra with Deep Learning

arXiv:2604.25786

Jeff Shen, Joshua S. Speagle, Shirley Ho

astro-ph.GA, astro-ph.IM

TLDR

A single Transformer model, trained end-to-end on spectra from four heterogeneous surveys, derives stellar parameters, abundances for 20 elements, distances, and ages on one self-consistent scale for Galactic archaeology.

Key contributions

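  • Introduces a single Transformer model, trained end-to-end across all surveys, that delivers atmospheric parameters, abundances for 20 elements, distances, and ages on one scale.
  • Processes heterogeneous spectra (optical to near-infrared, R ~ 2,000 to 28,000) from APOGEE DR17, GALAH DR3, DESI DR1, and Gaia RVS DR3.
  • Reaches precisions of 18 K in Teff, 0.04 dex in log g, and 0.015 dex in [Fe/H] on APOGEE spectra (51 K, 0.09 dex, and 0.04 dex on DESI), with labels for stars observed by more than one survey agreeing within model uncertainties and no post-hoc recalibration.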
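
The abstract reports that labels for the same stars observed by different surveys agree within model uncertainties. One common way to check a claim of this kind is to normalize each label difference by the combined uncertainty; the sketch below illustrates that idea only and is not the authors' validation code. The function names and all numbers are hypothetical, chosen for illustration.

```python
import math

def normalized_offsets(labels_a, sigma_a, labels_b, sigma_b):
    """Return (a - b) / sqrt(sigma_a^2 + sigma_b^2) for each matched star."""
    return [
        (a - b) / math.sqrt(sa * sa + sb * sb)
        for a, sa, b, sb in zip(labels_a, sigma_a, labels_b, sigma_b)
    ]

def fraction_within(z_scores, n_sigma=1.0):
    """Fraction of stars whose normalized offset lies within n_sigma."""
    return sum(abs(z) <= n_sigma for z in z_scores) / len(z_scores)

# Hypothetical [Fe/H] values (dex) for four stars matched between two surveys.
# These are made-up numbers, not values from the published catalog.
feh_a = [-0.10, 0.05, -0.52, 0.20]
err_a = [0.015, 0.015, 0.020, 0.015]
feh_b = [-0.12, 0.09, -0.50, 0.31]
err_b = [0.04, 0.04, 0.04, 0.04]

z = normalized_offsets(feh_a, err_a, feh_b, err_b)
print(fraction_within(z, 1.0))
```

If the quoted uncertainties are well calibrated and roughly Gaussian, about 68% of matched stars should fall within one combined sigma; a much smaller fraction would point to residual offsets between surveys or underestimated errors.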

Why it matters

Each spectroscopic survey currently derives stellar labels with its own pipeline and modelling assumptions, which introduces systematic offsets between surveys that obscure signals in chemical space. This framework places stellar parameters, abundances, distances, and ages from all four surveys on a single, self-consistent scale, so catalogs can be combined directly for studies of the Milky Way's evolution, and it can be extended to upcoming surveys such as SDSS-V, WEAVE, and 4MOST.

Original Abstract

Large-scale spectroscopic surveys have collectively observed millions of stars across the Milky Way, but each derives stellar labels using independent pipelines with distinct modelling assumptions, introducing systematic offsets that obscure signals in chemical space and hinder large-scale Galactic archaeology. We present a unified deep-learning framework that delivers atmospheric parameters, chemical abundances for 20 elements, distances, and ages -- all on a single, self-consistent scale -- for an arbitrary number of spectroscopic surveys simultaneously. Our approach uses a Transformer model that ingests spectra of arbitrary wavelength range and resolution, trained end-to-end as a single model across all surveys, eliminating the need for post-hoc recalibration. We apply this framework to spectra from APOGEE DR17, GALAH DR3, DESI DR1, and $\textit{Gaia}$ RVS DR3, spanning resolutions from R ~ 2,000 to 28,000 and wavelengths from the optical to the near-infrared. On high-resolution APOGEE spectra the model achieves precisions of $18~$K in $\textrm{T}_{\rm eff}$, $0.04~$dex in $\textrm{log}\,\textit{g}$, $0.015~$dex in [Fe/H], and ${<}\,0.03~$dex across all abundances; on lower-resolution DESI spectra, typical precisions are $51~$K, $0.09~$dex, $0.04~$dex, and ${\sim}\,0.06~$dex, respectively. Cross-survey comparisons demonstrate that labels for the same stars observed by different surveys are consistent within model uncertainties; we further validate against external distance catalogs and open cluster metallicities and ages. The resulting homogeneous catalog enables Galactic archaeology at unprecedented scale and consistency, and the framework is readily extensible to forthcoming spectroscopic surveys such as SDSS-V, WEAVE, and 4MOST. The catalog is publicly available at https://doi.org/10.5281/zenodo.19830515.
