ArXiv TLDR

Genomics

Computational genomics, gene expression analysis, and DNA sequence modeling.

q-bio.GN · 53 papers

A Resampling-Based Framework for Network Structure Learning in High-Dimensional Data

RSNet is an R package for robust, interpretable network inference in high-dimensional data, using resampling and graphlet analysis for structural insights.

2605.12706 · May 12, 2026 · Ziwei Huang, Zeyuan Song, Paola Sebastiani +1

scShapeBench: Discovering geometry from high dimensional scRNAseq data

scShapeBench introduces a benchmark and scReebTower, a new method for automated shape detection in high-dimensional scRNAseq data, outperforming baselines.

2605.12662 · May 12, 2026 · Andrew J Steindl, João Felipe Rocha, Brian Tshilengi Di Bassinga +13

Set-Aggregated Genome Embeddings for Microbiome Abundance Prediction

This paper uses Set-Aggregated Genome Embeddings (SAGE) with genomic language models to predict microbiome abundance from DNA, showing improved generalization.

2605.12286 · May 12, 2026 · Younhun Kim, Georg K. Gerber, Travis E. Gibson

LPDP: Inference-Time Reward Control for Variable-Length DNA Generation with Edit Flows

LPDP enables training-free, inference-time reward control for variable-length DNA generation using biologically plausible edit flows.

2605.11368 · May 12, 2026 · Jeongchan Kim, Yunkyung Ko, Jong Chul Ye

SCOPE: Siamese Contrastive Operon Pair Embeddings for Functional Sequence Representation and Classification

SCOPE introduces a Siamese MLP with protein language model embeddings for scalable operon pair classification, achieving competitive ROC-AUC.

2605.11022 · May 10, 2026 · Akarsh Gupta, Kenneth Rodrigues, Sagnik Chatterjee

MicroFuse: Protein-to-Genome Expert Fusion for Microbial Operon Reasoning

MicroFuse integrates protein and genome context using a Mixture-of-Experts model to accurately predict microbial operons, outperforming baselines.

2605.08815 · May 9, 2026 · Seungik Cho

A Linear-Transformer Hybrid for SNP-Based Genotype-to-Phenotype Prediction in Grapevine

LiT-G2P, a linear-Transformer hybrid, improves genotype-to-phenotype prediction in grapevines, enhancing breeding decisions and genetic gain.

2605.06762 · May 7, 2026 · Yibin Wang, Murukarthick Jayakodi, Silvas Kirubakaran +2

Feature Dimensionality Outweighs Model Complexity in Breast Cancer Subtype Classification Using TCGA-BRCA Gene Expression Data

This study shows that feature dimensionality is more critical than model complexity for breast cancer subtype classification, with logistic regression excelling.

2605.06562 · May 7, 2026 · Meena Al Hasani

A Versatile AI Agent for Rare Disease Diagnosis and Risk Gene Prioritization

Hygieia is a versatile AI agent that integrates multi-modal data for accurate rare disease diagnosis and risk gene prioritization, outperforming physicians.

2605.06226 · May 7, 2026 · Tianyu Liu, Wangjie Zheng, Rui Yang +12

OmicsLM: A Multimodal Large Language Model for Multi-Sample Omics Reasoning

OmicsLM is a multimodal LLM that connects quantitative omics data with natural language for biological reasoning, outperforming existing models.

2605.06728 · May 7, 2026 · Maciej Sypetkowski, Joanna Krawczyk, Łukasz Smoliński +4

When Does Gene Regulatory Network Inference Break? A Controlled Diagnostic Study of Causal and Correlational Methods on Single-Cell Data

This paper diagnoses why causal gene regulatory network inference methods often fail, revealing they excel in clean data but are vulnerable to specific pathologies.

2605.04930 · May 6, 2026 · Miguel Fernandez-de-Retana, Ruben Sanchez-Corcuera, Unai Zulaika +2

Statistics of a multi-factor function from its Fourier transform

A new theorem derives the statistics of a multi-factor function from its Fourier transform, revealing hidden relationships via index annihilation.

2605.02248 · May 4, 2026 · Matthew A. Herman, Stephen Doro

ORBIT: Learning Gene Program Co-Activation Structure for Cell-Type-Stratified Pathway Rewiring Analysis in Single-Cell Transcriptomics

ORBIT is a self-supervised transformer that learns asymmetric gene program dependencies from single-cell RNA-seq, revealing cell-type-specific pathway rewiring.

2605.02142 · May 4, 2026 · Yuechen Wang, Lina Jia, Qinglong Wang +1

EFGPP: Exploratory framework for genotype-phenotype prediction

EFGPP is a reproducible framework that integrates diverse genetic and clinical data to improve complex human trait prediction, demonstrated on migraine.

2605.02954 · May 2, 2026 · Muhammad Muneeb, David B. Ascher

PhenotypeToGeneDownloaderR: automated multi-source retrieval and validation of phenotype-associated genes

PhenotypeToGeneDownloaderR automates multi-source retrieval, validation, and harmonization of phenotype-associated genes for downstream analysis.

2605.01378 · May 2, 2026 · Muhammad Muneeb, David B. Ascher

Beyond Continuity: Simulation-free Reconstruction of Discrete Branching Dynamics from Single-cell Snapshots

Unbalanced Schrödinger Bridge (USB) reconstructs discrete branching cell dynamics from snapshots, integrating stochastic and birth-death events.

2605.00545 · May 1, 2026 · Junda Ying, Yuxuan Wang, Bowen Yang +2

CellxPert: Inference-Time MCMC Steering of a Multi-Omics Single-Cell Foundation Model for In-Silico Perturbation

CellxPert is a multi-omics single-cell foundation model that uses inference-time MCMC steering for biologically interpretable in-silico perturbation, with superior reported performance.

2605.00930 · Apr 30, 2026 · Andac Demir, Erik W. Anderson, Jeremy L. Jenkins +1

CRC-Screen: Certified DNA-Synthesis Hazard Screening Under Taxonomic Shift

CRC-Screen offers certified DNA-synthesis hazard screening, maintaining low miss and false-flag rates even under taxonomic shifts.

2605.00074 · Apr 30, 2026 · Najmul Hasan

Hyper Input Convex Neural Networks for Shape Constrained Learning and Optimal Transport

HyCNNs are a neural network architecture for learning convex functions that combines Maxout and ICNNs for better efficiency and performance.

2604.26942 · Apr 29, 2026 · Shayan Hundrieser, Insung Kong, Johannes Schmidt-Hieber

Robust Clustering Analysis of Genes Related to Age-related Macular Degeneration using RNA-Seq

This paper presents a robust gene clustering analysis of Age-related Macular Degeneration (AMD) RNA-Seq data, identifying novel and known hub genes.

2604.25986 · Apr 28, 2026 · Brayan Gutierrez, Rinki Ratnapriya, Arko Barman