ArXiv TLDR

Genomics

Computational genomics, gene expression analysis, and DNA sequence modeling.

q-bio.GN · 53 papers

Transcriptomic Models for Immunotherapy Response Prediction Show Limited Cross-cohort Generalisability

This paper finds that current transcriptomic models for predicting immunotherapy response show limited cross-cohort generalisability and inconsistent biomarker signals.

2604.05478 · Apr 7, 2026 · Yuheng Liang, Lucy Chuo, Ahmadreza Argha +8

Entropy, Disagreement, and the Limits of Foundation Models in Genomics

High entropy in genomic sequences causes poor performance and instability in foundation models, suggesting limitations of self-supervised training.

2604.04287 · Apr 5, 2026 · Maxime Rochkoulets, Lovro Vrček, Mile Šikić

An Imbalanced Dataset with Multiple Feature Representations for Studying Quality Control of Next-Generation Sequencing

A new imbalanced dataset with two distinct feature representations is introduced to improve quality control of next-generation sequencing data.

2604.04981 · Apr 4, 2026 · Philipp Röchner, Clarissa Krämer, Johannes U Mayer +3

Synonymous Codon Usage Bias Overrides Phylogeny to Reflect Convergent Frond Architecture in a Rapidly Radiating Fern Family Thelypteridaceae

In the fern family Thelypteridaceae, synonymous codon usage bias (CUB) can override phylogeny, reflecting convergent frond architecture driven by specific photosynthesis genes.

2604.03028 · Apr 3, 2026 · Kerui Huang, Wenyan Zhao, Huan Li +15

High-dimensional Many-to-many-to-many Mediation Analysis

Introduces a high-dimensional many-to-many-to-many (MMM) mediation analysis framework for variable selection, effect estimation, and outcome prediction.

2604.02886 · Apr 3, 2026 · Tien Dat Nguyen, Trung Khang Tran, Cong Khanh Truong +3

Re-analysis of the Human Transcription Factor Atlas Recovers TF-Specific Signatures from Pooled Single-Cell Screens with Missing Controls

This paper re-analyzes the human TF Atlas, recovering robust TF-specific signatures from pooled single-cell screens despite missing internal controls.

2604.02511 · Apr 2, 2026 · Arka Jain, Umesh Sharma

QuantumXCT: Learning Interaction-Induced State Transformation in Cell-Cell Communication via Quantum Entanglement and Generative Modeling

QuantumXCT uses quantum entanglement and generative modeling to learn cell-cell communication as state transformations, moving beyond static ligand-receptor databases.

2604.02203 · Apr 2, 2026 · Selim Romero, Shreyan Gupta, Robert S. Chapkin +1

Benchmarking Heritability Estimation Strategies Across 86 Configurations and Their Downstream Effect on Polygenic Risk Score Performance

This study benchmarks heritability estimation strategies across 86 configurations, finding significant variability in estimates but surprisingly robust polygenic risk score performance.

2604.02394 · Apr 2, 2026 · Muhammad Muneeb, David B. Ascher

annbatch unlocks terabyte-scale training of biological data in anndata

annbatch enables terabyte-scale training on biological data by providing an out-of-core mini-batch loader for anndata, drastically speeding up ML workflows.

2604.01949 · Apr 2, 2026 · Ilan Gold, Felix Fischer, Lucas Arnoldt +2

VeloTree: Inferring single-cell trajectories from RNA velocity fields with varifold distances

VeloTree infers single-cell differentiation trees from RNA velocity fields using a novel varifold distance-based dissimilarity measure.

2604.02380 · Apr 1, 2026 · Elodie Maignant, Tim Conrad, Christoph von Tycowicz

Non-ignorable fuzziness in granular counts: the case of RNA-seq data

This paper shows that fuzzy counts in RNA-seq data lead to non-ignorable reporting mechanisms and introduces a hierarchical model to address this.

2604.00763 · Apr 1, 2026 · Antonio Calcagnì, Arianna Consiglio, Przemyslaw Grzegorzewski +1

Large Language Models for Variant-Centric Functional Evidence Mining

This paper introduces AcmGENTIC, an LLM-powered pipeline and benchmark for automating the extraction and classification of functional evidence for genomic variants.

2604.00075 · Mar 31, 2026 · Ali Saadat, Jacques Fellay

Genetic algorithms for multi-omic feature selection: a comparative study in cancer survival analysis

Sweeping*, a new multi-view genetic algorithm, improves multi-omic feature selection for cancer survival prediction by optimizing accuracy and biomarker set size.

2604.00065 · Mar 31, 2026 · Luca Cattelani, Vittorio Fortino