Genomics
Computational genomics, gene expression analysis, and DNA sequence modeling.
q-bio.GN · 53 papers
A Combinatorial Optimisation Approach to Multi-factorial Gap-filling in Genome-scale Metabolic Models (GEMs)
This paper presents a metaheuristic combinatorial optimization method for multi-factorial gap-filling in GEMs, significantly improving accuracy over prior approaches.
T-cell repertoire response in individuals with post-acute sequelae of COVID-19
This study identifies distinct T-cell signatures and over 1,000 candidate TCRs associated with Post-Acute Sequelae of COVID-19 (PASC).
CMGL: Confidence-guided Multi-omics Graph Learning for Cancer Subtype Classification
CMGL is a two-stage framework that uses evidential deep learning to estimate per-sample modality reliability, improving cancer subtype classification.
Imaging Exploration of Molecular Subtypes in Tongue Squamous Cell Carcinoma
This study shows radiomic features can non-invasively distinguish molecular subtypes in tongue squamous cell carcinoma (TSCC), offering a new diagnostic tool.
The Cathaya argyrophylla Genome Reveals the Evolutionary Trade-offs of a Living Fossil
A new 22.73 Gb genome for the endangered Cathaya argyrophylla reveals its gigantism, ancient divergence, and genomic trade-offs between resource adaptation and weak immunity.
Supregraph: Enabling Information-Optimal Assembly Graph Representation of a Read Set
Supregraphs offer an information-optimal assembly graph representation, overcoming limitations of de Bruijn and overlap graphs for genome assembly.
TorchGWAS: GPU-accelerated GWAS for thousands of quantitative phenotypes
TorchGWAS leverages GPU acceleration to perform genome-wide association studies on thousands of quantitative phenotypes, drastically speeding up analysis.
Conditional Monte Carlo Tree Diffusion for Designing Cell-Type-Specific and Biologically Faithful Regulatory DNA
DNA-CRAFT uses conditional Monte Carlo tree diffusion to design highly cell-type-specific and biologically faithful regulatory DNA elements.
Quantum AI for Cancer Diagnostic Biomarker Discovery
Quantum AI identifies lung cancer biomarkers and classifies subtypes, demonstrating quantum advantage in diagnostics and multiomic data processing.
Geometric coherence of single-cell CRISPR perturbations reveals regulatory architecture and predicts cellular stress
Shesha quantifies geometric coherence in single-cell CRISPR screens, revealing regulatory architecture and predicting cellular stress responses.
Combining Bayesian and Frequentist Inference for Laboratory-Specific Performance Guarantees in Copy Number Variation Detection
A hybrid Bayesian-frequentist method provides accurate, lab-specific performance guarantees for copy number variation detection in oncology panels.
oxo-call: Documentation-grounded Skill Augmentation for Accurate Bioinformatics Command-line Generation with Large Language Models
oxo-call is a Rust-based LLM assistant that generates accurate bioinformatics command-line invocations using documentation grounding and expert skill augmentation.
Interpretable DNA Sequence Classification via Dynamic Feature Generation in Decision Trees
DEFT uses large language models to dynamically generate interpretable, high-level features for DNA sequence classification in decision trees.
EvoLen: Evolution-Guided Tokenization for DNA Language Model
EvoLen introduces an evolution-guided tokenization method for DNA language models, improving the preservation of functional sequence patterns and DNALM performance.
Probing 3D Chromatin Structure Awareness in Evo2 DNA Language Model
The Evo2 DNA language model learns local CTCF grammar but fails to capture higher-order 3D chromatin organization, suggesting that new architectures are needed.
WebCVTree4: A Newly Designed Phylogenetic and Taxonomic Study Platform for Prokaryotes Using Composition Vectors and Whole Genomes
WebCVTree4 is an upgraded web platform for prokaryotic phylogenetic and taxonomic studies using whole-genome composition vectors, supporting large-scale analysis.
ECLIPSE: A Composable Pipeline for Predicting ecDNA Formation, Evolution, and Therapeutic Vulnerabilities in Cancer
ECLIPSE is a robust computational framework for predicting ecDNA formation, evolution, and therapeutic vulnerabilities in cancer, addressing prior methodological flaws.
The Mechanistic Invariance Test: Genomic Language Models Fail to Learn Positional Regulatory Logic
Despite high benchmark performance, genomic language models fail to learn positional regulatory logic and instead rely on statistical shortcuts such as AT content.
PhageBench: Can LLMs Understand Raw Bacteriophage Genomes?
PhageBench evaluates LLMs' ability to understand raw bacteriophage genomes, showing promise but also limitations in complex tasks.
GenomeQA: Benchmarking General Large Language Models for Genome Sequence Understanding
GenomeQA is a new benchmark evaluating general LLMs on raw genome sequence understanding, revealing that they can use local signals but struggle with complex inference.