Non-ignorable fuzziness in granular counts: the case of RNA-seq data
Antonio Calcagnì, Arianna Consiglio, Przemyslaw Grzegorzewski, Corrado Mencar
TLDR
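This paper shows that when read-to-gene ambiguity in RNA-seq data is reported as fuzzy (granular) counts with graded membership, the reporting mechanism is generically non-ignorable. It introduces a tractable hierarchical model that accounts for this.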
Key contributions
- Represents read-to-gene alignment ambiguity in RNA-seq data as "granular counts": fuzzy-valued observations of latent discrete quantities.
- Shows that, for a class of fuzzy-reporting mechanisms, ignorability fails generically when reporting exploits graded membership, giving a "coarsening-not-at-random" structure.
- Introduces a tractable hierarchical model as an instance of this construction and illustrates it on RNA-seq data.
Why it matters
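Read-to-gene alignment ambiguity is common in high-dimensional transcriptomics. When it is recorded as fuzzy counts and the reporting mechanism is wrongly treated as ignorable, downstream inference can be biased. The paper shows that ignorability generically fails under graded membership and offers a tractable hierarchical model, illustrated on RNA-seq data, that accounts for the reporting mechanism.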
Original Abstract
RNA-seq count data are often affected by read-to-gene alignment ambiguity, especially in high-dimensional transcriptomics. This type of ambiguity can be conveniently expressed through granular counts, namely fuzzy-valued observations of latent discrete quantities. We study a class of fuzzy-reporting mechanisms and show that, when reporting exploits graded membership, ignorability fails generically, leading to a coarsening-not-at-random structure. A hierarchical model is then introduced as a tractable instance of this construction and illustrated using RNA-seq data.
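To make the idea of a "granular count" concrete, here is a minimal Python sketch of a fuzzy-valued observation of a latent integer count. It is not the authors' construction: the discrete triangular shape, the function names `triangular_fuzzy_count` and `alpha_cut`, and the example numbers are all assumptions chosen for illustration. The scenario assumed is a gene whose count is most plausibly 10 but could lie anywhere between 8 and 14 because some reads align ambiguously.

```python
def triangular_fuzzy_count(lower, mode, upper):
    """Map each integer in [lower, upper] to a membership degree in [0, 1].

    Hypothetical helper: a discrete triangular fuzzy number peaking at `mode`.
    """
    if not (lower <= mode <= upper):
        raise ValueError("require lower <= mode <= upper")
    membership = {}
    for k in range(lower, upper + 1):
        if k == mode:
            mu = 1.0  # the most plausible count has full membership
        elif k < mode:
            mu = (k - lower) / (mode - lower)  # rising edge
        else:
            mu = (upper - k) / (upper - mode)  # falling edge
        membership[k] = mu
    return membership


def alpha_cut(membership, alpha):
    """Return the sorted counts whose membership is at least `alpha`."""
    return sorted(k for k, mu in membership.items() if mu >= alpha)


# Assumed example values, not taken from the paper.
fuzzy_count = triangular_fuzzy_count(lower=8, mode=10, upper=14)
plausible_counts = alpha_cut(fuzzy_count, alpha=0.5)
```

An interval-valued (crisp) coarsening would assign membership 1 to every count in the range. The graded membership used here is the feature the abstract associates with the generic failure of ignorability.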