ArXiv TLDR

Enhancing Lyα Emitter Identification in HETDEX with a Convolutional Neural Network

arXiv:2604.12414

Shiro Mukae, Erin Mentuch Cooper, Karl Gebhardt, Dustin Davis, Lindsay R. House + 7 more

astro-ph.GA, astro-ph.IM

TLDR

A CNN framework improves the identification of low signal-to-noise Lyα emitters in the HETDEX survey, reducing the false positives that would otherwise contaminate cosmological analyses.

Key contributions

  • Developed a CNN to identify Lyα emitters (LAEs) in HETDEX, especially at low signal-to-noise.
  • Achieved 85.1% balanced accuracy in the low-S/N regime (4.8 ≤ S/N ≤ 5.5) using 2D spectral images of single emission lines.
  • Recovered 93% of low-S/N LAEs independently identified by DESI spectroscopy.
  • Enabled an S/N threshold as low as 4.8 by suppressing spurious spikes in the redshift distribution at z ~ 1.9-2.5, which benefits cosmological analyses.
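
The balanced accuracy, precision, and recall quoted for this paper are standard functions of a binary confusion matrix. The sketch below shows how they are conventionally defined. The function name and the example counts are hypothetical, chosen only for illustration; they are not the paper's code or data.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from confusion counts.

    tp: true LAEs classified as LAEs
    fp: non-LAEs (e.g. noise or artifacts) classified as LAEs
    tn: non-LAEs classified as non-LAEs
    fn: true LAEs classified as non-LAEs
    """
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("each metric needs a non-empty denominator")

    recall = tp / (tp + fn)          # fraction of true LAEs recovered
    specificity = tn / (tn + fp)     # fraction of non-LAEs rejected
    precision = tp / (tp + fp)       # purity of the predicted-LAE sample
    # Balanced accuracy averages the per-class recovery rates, so it is
    # not inflated when one class heavily outnumbers the other.
    balanced_accuracy = 0.5 * (recall + specificity)

    return {
        "balanced_accuracy": balanced_accuracy,
        "precision": precision,
        "recall": recall,
    }


# Hypothetical counts for illustration only (not taken from the paper).
example = classification_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Balanced accuracy is a natural headline metric when real emitters and noise detections occur in unequal numbers, because plain accuracy can look high for a classifier that simply favors the majority class.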

Why it matters

This paper introduces a deep learning method for separating faint Lyα emitters from artifacts and sky residuals in noisy data. By reducing false positives in the HETDEX catalog, it makes galaxy clustering measurements, and the cosmological analyses built on them, more reliable. The approach also illustrates the value of domain-specific deep learning for untargeted spectroscopic surveys.

Original Abstract

We present a deep learning framework to enhance the identification of Ly$α$ emitters (LAEs) in the Hobby-Eberly Telescope Dark Energy Experiment (HETDEX), an untargeted spectroscopic survey of LAEs at $1.9 < z < 3.5$ without imaging pre-selection. We primarily address the low signal-to-noise ratio (S/N) regime ($4.8 \leq \mathrm{S/N} \leq 5.5$), where LAE candidates suffer from substantial noise contamination. To distinguish LAE candidates from artifacts and sky residuals, we employ a convolutional neural network (CNN) trained on two-dimensional spectral images of single emission lines. The training sample is constructed from the HETDEX COSMOS catalog, with external validation from ancillary observations and our participatory science project, \textit{Dark Energy Explorers}. For small-format, low-resolution spectroscopic data, the model achieves a balanced accuracy, precision, and recall of $94.1\%$, $97.5\%$, and $97.5\%$, respectively, in the high-S/N regime ($\mathrm{S/N}>5.5$), and $85.1\%$, $78.2\%$, and $84.4\%$ in the low-S/N regime. Using HETDEX LAEs independently identified by DESI spectroscopy, the model recovers $99\%$ and $93\%$ of the high- and low-S/N LAEs, respectively. Visual attribution indicates that the CNN attends to smooth, spatially extended central emission in true positives and to irregular or noisy features in true negatives. Applied to the full HETDEX catalog, the CNN enables an S/N threshold down to 4.8 by suppressing spurious spikes across $z\sim 1.9$--$2.5$ in the redshift distribution. Our approach facilitates HETDEX cosmological analyses by mitigating false positives in galaxy clustering and highlights the value of domain-specific deep learning for refining low-S/N spectroscopic identification in untargeted surveys.
