Non-ignorable fuzziness in granular counts: the case of RNA-seq data

April 1, 2026 | arXiv:2604.00763

Antonio Calcagnì, Arianna Consiglio, Przemyslaw Grzegorzewski, Corrado Mencar

stat.ME, q-bio.GN, stat.AP

TLDR

This paper shows that when ambiguous RNA-seq counts are reported as fuzzy values using graded membership, the reporting mechanism is generically non-ignorable, and it introduces a hierarchical model to address this.

Key contributions

Expresses read-to-gene alignment ambiguity in RNA-seq data through "granular counts", i.e. fuzzy-valued observations of latent discrete quantities.
Demonstrates that when fuzzy reporting mechanisms exploit graded membership, ignorability fails generically, yielding a "coarsening-not-at-random" structure.
Proposes a hierarchical model as a tractable instance of this construction and illustrates it using RNA-seq data.

Why it matters

When fuzziness in RNA-seq counts is non-ignorable, analyses that treat the reporting mechanism as ignorable can produce biased results. The paper's hierarchical model offers a tractable way to account for this ambiguity in transcriptomics studies.

Original Abstract

RNA-seq count data are often affected by read-to-gene alignment ambiguity, especially in high-dimensional transcriptomics. This type of ambiguity can be conveniently expressed through granular counts, namely fuzzy-valued observations of latent discrete quantities. We study a class of fuzzy-reporting mechanisms and show that, when reporting exploits graded membership, ignorability fails generically, leading to a coarsening-not-at-random structure. A hierarchical model is then introduced as a tractable instance of this construction and illustrated using RNA-seq data.
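As a rough illustration of what a "granular count" might look like in practice, the sketch below represents a fuzzy count as a membership function over integer values. The triangular shape, the function names, and the example numbers are all assumptions made here for illustration; the abstract does not specify how the authors parameterize their fuzzy observations.

```python
def triangular_membership(n, low, mode, high):
    """Membership degree of integer count n in a triangular fuzzy count.

    Hypothetical parameterization: `low` and `high` bound the plausible
    counts and `mode` is the most plausible value (membership 1).
    """
    if n < low or n > high:
        return 0.0
    if n == mode:
        return 1.0
    if n < mode:
        return (n - low) / (mode - low)
    return (high - n) / (high - mode)


def alpha_cut(low, mode, high, alpha):
    """Integer counts whose membership degree is at least alpha."""
    return [
        n for n in range(low, high + 1)
        if triangular_membership(n, low, mode, high) >= alpha
    ]


# Illustrative gene: the latent count is thought to lie between 10 and 14,
# with 12 the most plausible value (made-up numbers, not from the paper).
plausible_counts = alpha_cut(10, 12, 14, 0.5)
```

Here `plausible_counts` collects the counts that are compatible with the observation at membership level 0.5 or higher. The paper's point concerns how such graded observations are generated from the latent count, which this sketch does not model.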

