Fuzzy Fingerprinting Encoder Pre-trained Language Models for Emotion Recognition in Conversations: Human Assessment and Validity Study
Patrícia Pereira, Helena Moniz, Joao Paulo Carvalho
TLDR
This paper introduces Fuzzy Fingerprinting Encoder PLMs for interpretable Emotion Recognition in Conversations, reducing neutral overclassification.
Key contributions
- Combines PLMs with Fuzzy Fingerprints (FFPs) for interpretable Emotion Recognition in Conversations (ERC).
- FFPs create class-specific prototypes from PLM latent space activations, reflecting characteristic patterns.
- Prototypes are derived by ranking and fuzzifying context-dependent embeddings for each emotion.
- Inference matches new utterances to emotion prototypes using a fuzzy similarity function.
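The pipeline described in the bullets above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the linear rank-to-membership mapping, the use of a per-class mean to pool activations across training instances, and the normalized sum-of-minimums similarity are all assumptions chosen to make the idea concrete. The PLM embeddings are assumed to be available as plain NumPy arrays.

```python
import numpy as np


def fingerprint(activations, k):
    """Build a fuzzy fingerprint from one activation vector.

    Keeps the k strongest features and maps rank to membership with a
    linear function (assumed here): rank 0 -> 1.0, rank k-1 -> 1/k.
    """
    top = np.argsort(-activations, kind="stable")[:k]
    memberships = 1.0 - np.arange(k) / k
    return dict(zip(top.tolist(), memberships.tolist()))


def class_prototypes(embeddings, labels, k):
    """Build one prototype fingerprint per emotion label.

    Assumption: activations are pooled per class by a simple mean
    before ranking and fuzzifying.
    """
    labels = np.array(labels)
    protos = {}
    for label in sorted(set(labels.tolist())):
        class_mean = embeddings[labels == label].mean(axis=0)
        protos[label] = fingerprint(class_mean, k)
    return protos


def similarity(fp_a, fp_b):
    """Fuzzy similarity: aggregate the intersection of two fingerprints.

    Intersection is taken as the minimum membership on shared features,
    aggregation as a sum normalized by the total membership of fp_a
    (both choices are assumptions).
    """
    shared = fp_a.keys() & fp_b.keys()
    return sum(min(fp_a[i], fp_b[i]) for i in shared) / sum(fp_a.values())


def predict(embedding, protos, k):
    """Fingerprint one utterance embedding and return the closest emotion."""
    fp = fingerprint(embedding, k)
    return max(protos, key=lambda label: similarity(fp, protos[label]))
```

Because each prototype is just a ranked list of feature indices with memberships, the features shared between an utterance's fingerprint and the winning prototype can be inspected directly, which is where the interpretability claim comes from.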
Why it matters
Standard PLMs for ERC offer little insight into why a prediction is made, a problem that is most acute on imbalanced datasets where most utterances are labeled neutral. According to the authors' experiments, integrating FFPs reduces the misclassification of minority emotions as neutral, and human evaluation supports the adequacy of the FFP predictions. The approach performs at a state-of-the-art level while exposing how each classification was reached.
Original Abstract
In Emotion Recognition in Conversations (ERC), model decisions should align with nuanced human perception and ideally provide insights on the classification process. Standard encoder pre-trained language models (PLMs) are the state-of-the-art at these tasks but offer little insight into why a certain prediction is made. This is especially problematic in imbalanced datasets, where most utterances are labeled as neutral, making these models frequently misclassify minority emotions as the majority neutral class. To tackle this issue, we introduce a novel, interpretable approach to ERC by combining PLMs with Fuzzy Fingerprints (FFPs). FFPs provide class-specific prototypes that reflect the characteristic class activation patterns in the PLM's latent space. They are derived by ranking and fuzzifying the activations of the pooled conversational context-dependent embeddings across training instances for each emotion. At inference time, each input utterance is similarly fuzzy fingerprinted and matched to the emotion prototypes using a fuzzy similarity function based on the aggregation of the intersection of the fuzzy sets that define each FFP. Experimental results show that FFP integration reduces overclassification into the neutral class and human evaluation further supports the adequacy of FFP predictions. Our proposed method thus bridges the gap between deep neural inference and human perception, performing at state-of-the-art level while simultaneously offering valuable insights into the classification procedure.