ArXiv TLDR

MatryoshkaLoRA: Learning Accurate Hierarchical Low-Rank Representations for LLM Fine-Tuning

arXiv: 2605.07850

Ionut-Vlad Modoranu, Mher Safaryan, Dan Alistarh

cs.CL · cs.AI · cs.LG

TLDR

MatryoshkaLoRA is a Matryoshka-inspired training framework for LoRA that learns accurate hierarchical low-rank representations and supports dynamic rank selection for LLM fine-tuning with minimal accuracy loss.

Key contributions

  • Proposes MatryoshkaLoRA, a framework for learning accurate hierarchical low-rank representations.
  • Inserts a fixed diagonal matrix P between the two LoRA factors to scale their sub-ranks and keep gradient flow consistent across the rank hierarchy (see the adapter sketch after this list).
  • Enables dynamic rank selection for LLM fine-tuning with minimal accuracy degradation.
  • Introduces AURAC (Area Under the Rank-Accuracy Curve), a metric for consistently evaluating hierarchical low-rank adapters; a sketch of how it can be computed follows the adapter example below.
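
To make the diagonal-scaling idea concrete, here is a minimal PyTorch sketch of a linear layer with a MatryoshkaLoRA-style adapter. The paper specifies a fixed, carefully crafted P; the geometric decay used for its diagonal entries below is only an illustrative placeholder, and the class name, decay parameter, and sub_rank argument are our own inventions for illustration.

```python
import torch
import torch.nn as nn

class MatryoshkaLoRALinear(nn.Module):
    """Frozen linear layer plus a LoRA adapter with a fixed diagonal P
    inserted between the factors: W_eff = W + (alpha / r) * B @ diag(p) @ A.
    With p = ones(r), this reduces to a standard LoRA adapter."""

    def __init__(self, in_features, out_features, r=16, alpha=32.0, decay=0.5):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)  # frozen pretrained weight
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, r))
        # Fixed, non-trainable diagonal; geometric decay is a placeholder,
        # not the paper's actual choice of P.
        self.register_buffer("p", decay ** torch.arange(r, dtype=torch.float32))
        self.scaling = alpha / r

    def forward(self, x, sub_rank=None):
        r_eff = self.A.shape[0] if sub_rank is None else sub_rank
        # Leading sub-ranks form nested ("Matryoshka") adapters.
        a, b, p = self.A[:r_eff], self.B[:, :r_eff], self.p[:r_eff]
        delta = ((x @ a.t()) * p) @ b.t()  # x A^T diag(p) B^T
        return self.base(x) + self.scaling * delta

layer = MatryoshkaLoRALinear(64, 64)
x = torch.randn(2, 64)
print(layer(x, sub_rank=4).shape)  # torch.Size([2, 64])
```

Because sub-ranks are always taken from the leading rows and columns of the factors, every smaller rank shares its parameters with all larger ones, which is what lets a single trained adapter be served at many ranks.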
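AURAC itself can be estimated with simple trapezoidal integration over the measured rank-accuracy curve. The normalization by the rank span below is our assumption; the paper defines the exact convention.

```python
import numpy as np

def aurac(ranks, accuracies):
    """Area Under the Rank-Accuracy Curve via the trapezoidal rule,
    normalized by the rank span so the score lies in [0, 1] when
    accuracies do. The normalization choice is an assumption."""
    ranks = np.asarray(ranks, dtype=float)
    accs = np.asarray(accuracies, dtype=float)
    widths = np.diff(ranks)                 # spacing between sub-ranks
    heights = (accs[1:] + accs[:-1]) / 2.0  # trapezoid mid-heights
    return float(np.sum(widths * heights)) / (ranks[-1] - ranks[0])

# Hypothetical accuracies measured at sub-ranks 2..16 for one adapter.
print(aurac([2, 4, 8, 16], [0.71, 0.78, 0.82, 0.84]))  # ≈ 0.809
```

A higher AURAC means the adapter holds its accuracy across the whole hierarchy of sub-ranks rather than only at the full rank.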

Why it matters

This paper addresses a key limitation of LoRA, the need to fix a static rank in advance, by enabling dynamic and efficient rank selection without sacrificing accuracy. MatryoshkaLoRA improves the accuracy-efficiency trade-off of LLM fine-tuning across ranks, making large-model adaptation more flexible and accessible.

Original Abstract

With the rise in scale of deep learning models to billions of parameters, the computational cost of fine-tuning remains a significant barrier to deployment. While Low-Rank Adaptation (LoRA) has become the standard for parameter-efficient fine-tuning, the need to set a predefined, static rank $r$ requires exhaustive grid searches to balance efficiency and performance. Existing rank-adaptive solutions such as DyLoRA mitigate this by sampling ranks during training from a predefined distribution. However, they often yield sub-optimal results at higher ranks due to a lack of consistent gradient signals across the full hierarchy of ranks, making these methods data-inefficient. In this paper, we propose MatryoshkaLoRA, a general, Matryoshka-inspired training framework for LoRA that learns accurate hierarchical low-rank representations by inserting a fixed, carefully crafted diagonal matrix $P$ between the existing LoRA adapters to scale their sub-ranks accordingly. With this simple modification, our general framework recovers both LoRA and DyLoRA merely by changing $P$, and ensures all sub-ranks embed the available gradient information efficiently. MatryoshkaLoRA supports dynamic rank selection with minimal degradation in accuracy. We further propose the Area Under the Rank-Accuracy Curve (AURAC), a metric that consistently evaluates the performance of hierarchical low-rank adapters. Our results demonstrate that MatryoshkaLoRA learns more accurate hierarchical low-rank representations than prior rank-adaptive approaches and achieves superior accuracy-performance trade-offs across ranks on the evaluated datasets. Our code is available at https://github.com/IST-DASLab/MatryoshkaLoRA.
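
As a rough illustration of the dynamic rank selection described in the abstract: once A, B, and the fixed diagonal are trained, a deployment can materialize the weight update at any leading sub-rank, and setting the diagonal to all ones recovers plain LoRA. The helper below is hypothetical, not the paper's API.

```python
import torch

def merge_at_rank(A, B, p, r_sel, scaling=1.0):
    """Hypothetical helper: merged weight update at sub-rank r_sel from
    trained factors A (r, d_in), B (d_out, r) and fixed diagonal p (r,).
    With p = torch.ones(r) this is ordinary LoRA merging."""
    return scaling * (B[:, :r_sel] @ torch.diag(p[:r_sel]) @ A[:r_sel])

r, d_in, d_out = 16, 128, 128
A = torch.randn(r, d_in)
B = torch.randn(d_out, r)
p = 0.5 ** torch.arange(r, dtype=torch.float32)  # placeholder diagonal
delta_w = merge_at_rank(A, B, p, r_sel=4)  # add this to the frozen base W
print(delta_w.shape)  # torch.Size([128, 128])
```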
