Domain-Adaptive Dense Retrieval for Brazilian Legal Search
Jayr Pereira, Roberto Lotufo, Luiz Bonifacio
TLDR
This paper explores domain-adaptive dense retrieval for Brazilian legal search, finding that a mixed training approach offers the most robust performance across legal and general Portuguese queries.
Key contributions
- Explored three Qwen3-Embedding-4B training setups for Brazilian legal search: base, legal-only, and mixed.
- Legal-only model excelled in specialized legal tasks, demonstrating strong domain adaptation.
- Mixed training (legal + SQuAD-pt) improved average NDCG@10, MRR@10, and MAP@10 across six datasets (see the metric sketch after this list).
- Mixed setup showed superior robustness and out-of-domain generalization, especially on question-based search.
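The three ranking metrics cited above follow slightly different conventions across evaluation libraries. Below is a minimal, self-contained sketch of NDCG@10, MRR@10, and MAP@10 for a single query, assuming linear DCG gains and the min(R, k) normalizer for average precision; these are common choices but not the only ones.

```python
import math

def ndcg_at_k(relevances, k=10):
    """NDCG@k for one query; `relevances` holds graded relevance labels
    of the retrieved documents, in ranked order."""
    def dcg(rels):
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels))
    ideal_dcg = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal_dcg if ideal_dcg > 0 else 0.0

def mrr_at_k(relevances, k=10):
    """Reciprocal rank of the first relevant document within the top k."""
    for i, rel in enumerate(relevances[:k]):
        if rel > 0:
            return 1.0 / (i + 1)
    return 0.0

def map_at_k(relevances, num_relevant, k=10):
    """Average precision at k; `num_relevant` is the total number of
    relevant documents for the query (normalizer conventions vary)."""
    hits, precision_sum = 0, 0.0
    for i, rel in enumerate(relevances[:k]):
        if rel > 0:
            hits += 1
            precision_sum += hits / (i + 1)
    return precision_sum / min(num_relevant, k) if num_relevant else 0.0

# Example: one query's ranked list with graded relevance labels.
ranked_rels = [3, 0, 2, 0, 1, 0, 0, 0, 0, 0]
print(ndcg_at_k(ranked_rels))                      # NDCG@10
print(mrr_at_k(ranked_rels))                       # MRR@10 -> 1.0
print(map_at_k(ranked_rels, num_relevant=3))       # MAP@10
```

In practice these per-query scores are averaged over all queries of a dataset, which is what the paper's reported averages across the six datasets correspond to.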
Why it matters
This research addresses the challenge of heterogeneous Brazilian legal search by evaluating domain-adaptive dense retrievers. It offers practical guidance on balancing specialization and robustness, and releases adapted models that improve average retrieval performance across diverse legal and general Portuguese queries.
Original Abstract
Brazilian legal retrieval is heterogeneous, covering case law, legislation, and question-based search. This makes training dense retrievers a trade-off between stronger domain specialization and broader robustness across types of search. In this paper, we explore this trade-off using three training setups based on Qwen3-Embedding-4B: a base model with no fine-tuning, a version trained only on legal data, and a mixed setup that combines legal data with the supervised SQuAD-pt dataset. We evaluate these models on five legal datasets from the JUÁ leaderboard, along with the Quati dataset as an extra Portuguese retrieval benchmark to test out-of-domain generalization. The legal-only model performs best on the most specialized legal tasks. The mixed setup keeps strong performance on legal data while offering a better overall balance, improving average NDCG@10 from 0.414 to 0.447, MRR@10 from 0.586 to 0.595, and MAP@10 from 0.270 to 0.308 across all six datasets. The biggest improvement appears on Quati, where the mixed model clearly outperforms the legal-only one. Overall, the results show that legal-only and mixed training lead to different strengths: the former is better for specialization, while the latter is more robust across different types of search, especially question-based ones. Both adapted models are available on Hugging Face.
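For readers who want to try the setup, here is a minimal retrieval sketch using the base Qwen/Qwen3-Embedding-4B checkpoint through sentence-transformers (a recent version is assumed for `model.similarity`). The Hugging Face IDs of the two adapted models are not given in this summary, so the base model ID is used as a stand-in; the query and passages are illustrative, and the `prompt_name="query"` convention follows the Qwen3-Embedding model card.

```python
from sentence_transformers import SentenceTransformer

# Base checkpoint; the paper's legal-only and mixed fine-tunes would be
# loaded the same way once their Hugging Face IDs are located.
model = SentenceTransformer("Qwen/Qwen3-Embedding-4B")

# Illustrative Portuguese legal query and candidate passages
# (query: "What are the requirements for extraordinary adverse possession?").
queries = ["Quais são os requisitos da usucapião extraordinária?"]
documents = [
    "Art. 1.238 do Código Civil: aquele que, por quinze anos, sem "
    "interrupção nem oposição, possuir como seu um imóvel, "
    "adquire-lhe a propriedade.",
    "A usucapião ordinária exige justo título e boa-fé, com prazo de dez anos.",
]

# Qwen3-Embedding prepends an instruction to queries but not to documents;
# sentence-transformers applies it via the model card's prompt_name.
query_emb = model.encode(queries, prompt_name="query")
doc_emb = model.encode(documents)

# Similarity matrix: rows are queries, columns are documents.
scores = model.similarity(query_emb, doc_emb)
print(scores)
print(scores[0].argsort(descending=True))  # document ranking for the query
```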