CLSGen: A Dual-Head Fine-Tuning Framework for Joint Probabilistic Classification and Verbalized Explanation
WonJin Yoon, Kangyu Zhu, Ian Bulovic, Autumn Sehy, Yanjun Gao, et al.
TLDR
CLSGen is a dual-head LLM fine-tuning framework that enables robust probabilistic classification and verbalized explanations without catastrophic forgetting.
Key contributions
- Enables LLMs to provide robust quantitative probabilities for classification tasks.
- Preserves the LLM's explanation-generation capabilities, avoiding catastrophic forgetting.
- Introduces the CLSGen framework, comprising a novel architecture, training methodology, and data-construction strategy.
- Outperforms baselines in AUROC and F1-score, showing strong label-explanation alignment.
Why it matters
LLMs often lose their explanation abilities when fine-tuned for probabilistic classification, which hinders interpretability. CLSGen addresses this by enabling robust probability estimation while preserving the model's inherent explanation generation. This matters for deploying trustworthy, interpretable LLMs in real-world decision-making.
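The summary does not spell out CLSGen's architecture, but the dual-head idea it describes can be sketched: a shared decoder backbone feeds both a language-modeling head (for verbalized explanations) and a classification head (for calibrated binary probabilities), trained with a weighted joint loss so neither objective crowds out the other. The minimal PyTorch sketch below is an illustration under that assumption; the model sizes, pooling choice, and loss weighting `alpha` are invented for the example and are not the paper's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualHeadLM(nn.Module):
    """Illustrative dual-head model: one shared backbone, two output heads.
    All hyperparameters here are placeholders, not CLSGen's settings."""

    def __init__(self, vocab_size=100, d_model=32, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=64, batch_first=True
        )
        self.backbone = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)  # generation head
        self.cls_head = nn.Linear(d_model, 1)          # binary-probability head

    def forward(self, input_ids):
        T = input_ids.size(1)
        # Causal mask so the backbone behaves like a decoder-only LM.
        mask = nn.Transformer.generate_square_subsequent_mask(T)
        h = self.backbone(self.embed(input_ids), mask=mask)  # (B, T, d_model)
        lm_logits = self.lm_head(h)                          # next-token logits
        cls_logit = self.cls_head(h[:, -1, :]).squeeze(-1)   # last-position pooling
        return lm_logits, cls_logit

def joint_loss(lm_logits, cls_logit, target_ids, labels, alpha=0.5):
    """Weighted sum of LM cross-entropy (preserves explanation generation)
    and binary cross-entropy (probability estimation)."""
    lm = F.cross_entropy(
        lm_logits[:, :-1].reshape(-1, lm_logits.size(-1)),
        target_ids[:, 1:].reshape(-1),
    )
    cls = F.binary_cross_entropy_with_logits(cls_logit, labels)
    return alpha * cls + (1 - alpha) * lm
```

Because the classification head reads a pooled hidden state while the LM head is still trained on the explanation tokens, the LM objective acts as a regularizer against the linguistic collapse that a purely discriminative fine-tune would cause; `torch.sigmoid(cls_logit)` then yields the quantitative probability the digest highlights.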
Original Abstract
With the recent progress of Large Language Models (LLMs), there is a growing interest in applying these models to solve complex and challenging problems. Modern LLMs, capable of processing long contexts and generating verbalized explanations, offer significant potential in addressing real-world applications. However, a critical hurdle in deploying LLMs for practical decision-making is their inability to provide reliable, quantitative probabilities. While task-specific fine-tuning of LLMs using traditional discriminative objectives (similar to encoder-only models) can yield probability estimates, this often leads to catastrophic forgetting and linguistic collapse. Consequently, the model loses its ability to generate explanations, severely undermining its interpretability and usability. To address this challenge, we propose CLSGen, a novel LLM fine-tuning framework designed for binary classification tasks. The CLSGen framework encompasses a new model architecture, training methodology, and data construction strategy to enable robust probability estimation without sacrificing the model's inherent explanation-generation capabilities. Experimental results across multiple benchmark datasets demonstrate that models fine-tuned with CLSGen outperform existing baselines in classification metrics (AUROC and F1-score). Regarding explanation, the results showed strong alignment between predicted labels and generated justifications, as well as high readability.