High-dimensional Many-to-many-to-many Mediation Analysis
Tien Dat Nguyen, Trung Khang Tran, Cong Khanh Truong, Duy-Cat Can, Binh T. Nguyen + 1 more
TLDR
Introduces a high-dimensional many-to-many-to-many (MMM) mediation analysis framework for variable selection, effect estimation, and outcome prediction.
Key contributions
- Develops MMM mediation for variable selection in high-dimensional exposures/mediators.
- Estimates indirect effect matrices and predicts multivariate outcomes.
- Provides theoretical guarantees for consistency and asymptotic normality of estimators.
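- Applies the method to ADNI data (688 SNPs, cortical thickness of 202 brain regions, eleven cognitive-behavioral and diagnostic outcomes), identifying genetic-neural-cognitive pathways.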
Why it matters
This framework addresses the challenge of analyzing complex, high-dimensional multi-layer pathways in scientific data. It offers a statistical tool with consistency and asymptotic-normality guarantees for identifying interpretable pathways, and in the ADNI application it improved out-of-sample classification and prediction performance.
Original Abstract
We study high-dimensional mediation analysis in which exposures, mediators, and outcomes are all multivariate, and both exposures and mediators may be high-dimensional. We formalize this as a many (exposures)-to-many (mediators)-to-many (outcomes) (MMM) mediation analysis problem. Methodologically, MMM mediation analysis simultaneously performs variable selection for high-dimensional exposures and mediators, estimates the indirect effect matrix (i.e., the coefficient matrices linking exposure-to-mediator and mediator-to-outcome pathways), and enables prediction of multivariate outcomes. Theoretically, we show that the estimated indirect effect matrices are consistent and element-wise asymptotically normal, and we derive error bounds for the estimators. To evaluate the efficacy of the MMM mediation framework, we first investigate its finite-sample performance, including convergence properties, the behavior of the asymptotic approximations, and robustness to noise, via simulation studies. We then apply MMM mediation analysis to data from the Alzheimer's Disease Neuroimaging Initiative to study how cortical thickness of 202 brain regions may mediate the effects of 688 genome-wide significant single nucleotide polymorphisms (SNPs) (selected from approximately 1.5 million SNPs) on eleven cognitive-behavioral and diagnostic outcomes. The MMM mediation framework identifies biologically interpretable, many-to-many-to-many genetic-neural-cognitive pathways and improves downstream out-of-sample classification and prediction performance. Taken together, our results demonstrate the potential of MMM mediation analysis and highlight the value of statistical methodology for investigating complex, high-dimensional multi-layer pathways in science. The MMM package is available at https://github.com/THELabTop/MMM-Mediation.
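To make the exposure-to-mediator and mediator-to-outcome structure concrete, here is a minimal sketch on a low-dimensional toy problem. It is not the authors' MMM package and performs no variable selection. It assumes a linear model and the classical product-of-coefficients definition of the indirect effect, and it uses plain least squares. The function names `simulate` and `estimate_indirect`, the dimensions, and the noise levels are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate(n=1000, p=6, q=4, r=3, seed=0):
    """Toy data: p exposures -> q mediators -> r outcomes (assumed linear model)."""
    rng = np.random.default_rng(seed)
    A = np.zeros((p, q))  # exposure-to-mediator coefficients (sparse by construction)
    A[0, 0], A[1, 1] = 1.0, -0.8
    B = np.zeros((q, r))  # mediator-to-outcome coefficients (sparse by construction)
    B[0, 0], B[1, 2] = 0.5, 1.2
    X = rng.standard_normal((n, p))
    M = X @ A + 0.5 * rng.standard_normal((n, q))
    Y = M @ B + 0.1 * rng.standard_normal((n, r))  # no direct X -> Y effect here
    return X, M, Y, A, B


def estimate_indirect(X, M, Y):
    """Least-squares estimates of A and B, plus the product A_hat @ B_hat."""
    # Step 1: regress mediators on exposures.
    A_hat, *_ = np.linalg.lstsq(X, M, rcond=None)
    # Step 2: regress outcomes on mediators, adjusting for exposures.
    Z = np.hstack([M, X])
    coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    B_hat = coef[: M.shape[1]]  # first q rows belong to the mediators
    # Entry (j, k) is the effect of exposure j on outcome k through all mediators.
    return A_hat, B_hat, A_hat @ B_hat


if __name__ == "__main__":
    X, M, Y, A, B = simulate()
    A_hat, B_hat, indirect = estimate_indirect(X, M, Y)
    print("max abs error in indirect effects:", np.abs(indirect - A @ B).max())
```

In the high-dimensional setting the abstract describes, where exposures and mediators may outnumber samples, plain least squares is not usable, which is why the paper pairs estimation with variable selection.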