ArXiv TLDR

Phonological Subspace Collapse Is Aetiology-Specific and Cross-Lingually Stable: Evidence from 3,374 Speakers

🐦 Tweet
2604.21706

Bernard Muller, Antonio Armando Ortiz Barrañón, LaVonne Roberts

cs.CL

TLDR

A training-free method for dysarthria severity assessment reveals aetiology-specific phonological degradation profiles stable across languages and SSL models.

Key contributions

  • Identifies aetiology-specific dysarthria degradation profiles distinguishable at the group level.
  • Demonstrates cross-lingual stability of these phonological degradation profile shapes across 12 languages.
  • Confirms the method's robustness and architecture-independence across six SSL speech backbones.

Why it matters

This research offers a robust, training-free framework for dysarthria characterization. Its findings on aetiology-specific and cross-lingual stability enable language-independent phenotyping, which could improve diagnostics and personalized interventions.

Original Abstract

We previously introduced a training-free method for dysarthria severity assessment based on d-prime separability of phonological feature subspaces in frozen self-supervised speech representations, validated on 890 speakers across 5 languages with HuBERT-base. Here, we scale the analysis to 3,374 speakers from 25 datasets spanning 12 languages and 5 aetiologies (Parkinson's disease, cerebral palsy, ALS, Down syndrome, and stroke), plus healthy controls, using 6 SSL backbones. We report three findings. First, aetiology-specific degradation profiles are distinguishable at the group level: 10 of 13 features yield large effect sizes (epsilon-squared > 0.14, Holm-corrected p < 0.001), with Parkinson's disease separable from the articulatory execution group at Cohen's d = 0.83; individual-level classification remains limited (22.6% macro F1). Second, profiles show cross-lingual profile-shape stability: cosine similarity of 5-dimensional consonant d-prime profiles exceeds 0.95 across the languages available for each aetiology. Absolute d-prime magnitudes are not cross-lingually calibrated, so the method supports language-independent phenotyping of degradation patterns but requires within-corpus calibration for absolute severity interpretation. Third, the method is architecture-independent: all 6 backbones produce monotonic severity gradients with inter-model agreement exceeding rho = 0.77. Fixed-token d-prime estimation preserves the severity correlation (rho = -0.733 at 200 tokens per class), confirming that the signal is not a token-count artefact. These results support phonological subspace analysis as a robust, training-free framework for aetiology-aware dysarthria characterisation, with evidence of cross-lingual profile-shape stability and cross-backbone robustness in the represented sample.

📬 Weekly AI Paper Digest

Get the top 10 AI/ML arXiv papers from the week — summarized, scored, and delivered to your inbox every Monday.