Personalized Deep Research: A User-Centric Framework, Dataset, and Hybrid Evaluation for Knowledge Discovery

May 11, 2026, arXiv:2605.10530

Xiaopeng Li, Wenlin Zhang, Yingyi Zhang, Pengyue Jia, Yejing Wang + 4 more

cs.IR

TLDR

PDR is a user-centric framework that personalizes deep research agents by adapting retrieval and synthesis to individual user expertise and interests.

Key contributions

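Introduces the Personalized Deep Research (PDR) framework for user-adaptive knowledge discovery.
Integrates dynamic user context into the core retrieval-reasoning loop, unifying user profile modeling with iterative query development, dual-stage (private/public) retrieval, and context-aware synthesis.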
Releases the PDR Dataset, covering four realistic user tasks, for benchmarking personalized research.
Proposes a hybrid evaluation framework combining lexical metrics with LLM-based judgments.
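
As a rough illustration of the hybrid evaluation idea, the sketch below blends a lexical overlap score with an LLM judge score. The summary does not say which lexical metrics, judge prompts, or weighting the paper uses, so everything here is an assumption for illustration: token-level F1 stands in for the lexical metric, `judge_score` is a hypothetical value in [0, 1] that some LLM judge has already produced, and `lexical_weight` is an invented parameter.

```python
from collections import Counter


def token_f1(candidate: str, reference: str) -> float:
    """Token-level F1 between two texts (an assumed stand-in lexical metric)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most as often as it
    # appears in both texts.
    shared = sum((Counter(cand) & Counter(ref)).values())
    if shared == 0:
        return 0.0
    precision = shared / len(cand)
    recall = shared / len(ref)
    return 2 * precision * recall / (precision + recall)


def hybrid_score(candidate: str, reference: str, judge_score: float,
                 lexical_weight: float = 0.5) -> float:
    """Weighted blend of lexical overlap and a precomputed LLM judge score.

    Both `judge_score` and `lexical_weight` are hypothetical inputs; the
    paper's actual aggregation may differ.
    """
    if not 0.0 <= judge_score <= 1.0:
        raise ValueError("judge_score must be in [0, 1]")
    if not 0.0 <= lexical_weight <= 1.0:
        raise ValueError("lexical_weight must be in [0, 1]")
    lexical = token_f1(candidate, reference)
    return lexical_weight * lexical + (1.0 - lexical_weight) * judge_score
```

A linear blend is only one possible design. Factual accuracy and personalization alignment could just as well be reported as separate scores instead of being folded into one number.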

Why it matters

Current LLM-driven research agents follow a static, one-size-fits-all retrieval paradigm, so their reports are often redundant for experts or overly dense for novices. PDR addresses this by adapting the depth and breadth of exploration to the individual user's expertise and interests, narrowing the gap between generic information retrieval and personalized knowledge acquisition.

Original Abstract

Deep Research agents driven by LLMs have automated the scholarly discovery pipeline, from planning and query formulation to iterative web exploration. Yet they remain constrained by a static, "one-size-fits-all" retrieval paradigm. Current systems fail to adaptively adjust the depth and breadth of exploration based on the user's existing expertise or latent interests, frequently resulting in reports that are either redundant for experts or overly dense for novices. To address this, we introduce Personalized Deep Research (PDR), a framework that integrates dynamic user context into the core retrieval-reasoning loop. Rather than treating personalization as a post-hoc formatting step, PDR unifies user profile modeling with iterative query development, dual-stage (private/public) retrieval, and context-aware synthesis. This allows the system to autonomously align research sub-goals with user intent and optimize the stopping criteria for evidence collection. To facilitate benchmarking, we release the PDR Dataset, covering four realistic user tasks, and propose a hybrid evaluation framework combining lexical metrics with LLM-based judgments to assess factual accuracy and personalization alignment. Experimental results against commercial baselines demonstrate that PDR significantly improves retrieval utility and report relevance, effectively bridging the gap between generic information retrieval and personalized knowledge acquisition. The resource is available to the public at https://github.com/Applied-Machine-Learning-Lab/SIGIR2026_PDR.
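
The loop the abstract describes (profile-aware querying, private-then-public retrieval, and a stopping rule for evidence collection) might be sketched as follows. This is a toy under stated assumptions, not the authors' implementation. The `UserProfile` fields, the keyword-overlap ranking, the query-expansion step, and the `enough` and `max_rounds` parameters are all hypothetical, and in-memory string lists stand in for real private and public sources.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    # Hypothetical profile: terms the user cares about and documents the
    # user is assumed to have read already.
    interests: set[str] = field(default_factory=set)
    known_docs: set[str] = field(default_factory=set)


def overlap(doc: str, terms: set[str]) -> int:
    """Number of query terms that appear in the document (toy relevance)."""
    return len(terms & set(doc.lower().split()))


def retrieve(corpus: list[str], terms: set[str], k: int) -> list[str]:
    """Top-k documents with at least one matching term."""
    hits = [d for d in corpus if overlap(d, terms) > 0]
    return sorted(hits, key=lambda d: overlap(d, terms), reverse=True)[:k]


def personalized_research(question: str, profile: UserProfile,
                          private_docs: list[str], public_docs: list[str],
                          max_rounds: int = 3, enough: int = 3,
                          k: int = 2) -> list[str]:
    # The user's interests shape the query from the first round.
    terms = set(question.lower().split()) | profile.interests
    evidence: list[str] = []
    for _ in range(max_rounds):
        # Dual-stage retrieval: private sources first, then public ones.
        candidates = retrieve(private_docs, terms, k) + retrieve(public_docs, terms, k)
        fresh: list[str] = []
        for doc in candidates:
            # Skip duplicates and material the user already knows.
            if doc not in evidence and doc not in fresh and doc not in profile.known_docs:
                fresh.append(doc)
        if not fresh:
            break  # assumed stopping rule: nothing new was found
        evidence.extend(fresh)
        if len(evidence) >= enough:
            break  # assumed stopping rule: enough evidence collected
        # Iterative query development: expand the query with new evidence.
        for doc in fresh:
            terms |= set(doc.lower().split())
    return evidence
```

For example, with `UserProfile(interests={"evaluation"}, known_docs={"dense retrieval survey"})`, the question "dense retrieval" skips the survey the user has already read and surfaces evaluation-related material instead.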
