Is It Novel and Why? Fine-Grained Patent Novelty Prediction Based on Passage Retrieval

arXiv:2605.02392

Valentin Knappich, Anna Hätty, Simon Razniewski, Annemarie Friedrich

cs.CL, cs.AI, cs.IR

TLDR

This paper introduces FiNE-Patents, a new dataset and LLM-based approach for fine-grained patent novelty prediction via passage retrieval.

Key contributions

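  • Introduces FiNE-Patents, a dataset of 3,658 first patent claims with feature-level prior art annotations extracted from European Search Opinion (ESOP) documents.
  • Proposes shifting evaluation from claim-level binary classification to a joint retrieval and abstract reasoning task at the feature level.
  • Develops LLM-based workflows that decompose claims into features, analyze each feature against prior art, and derive a claim-level novelty prediction.
  • Shows that these workflows outperform embedding-based baselines on passage retrieval and novel feature identification, and that LLMs, unlike trained classifiers, are robust to spurious correlations in claim-level novelty classification.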
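
The three workflow stages (decompose the claim, check each feature against prior art passages, aggregate to a claim-level decision) can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the paper uses LLMs for each stage, whereas this sketch substitutes a naive semicolon split and a token-overlap score so that it runs without any model. All function names, the 0.5 threshold, and the aggregation rule (a claim counts as novel if at least one feature has no supporting passage) are assumptions made for the example.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeatureResult:
    feature: str
    passage_index: Optional[int]  # best supporting passage, or None if not disclosed
    score: float


def decompose_claim(claim: str) -> list[str]:
    """Hypothetical stand-in for LLM decomposition: split on ';' (and a following 'and')."""
    parts = re.split(r";\s*(?:and\s+)?", claim)
    return [p.strip(" .") for p in parts if p.strip(" .")]


def _tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def analyze_feature(feature: str, passages: list[str], threshold: float = 0.5) -> FeatureResult:
    """Hypothetical stand-in for LLM analysis: fraction of feature tokens found in a passage."""
    feat = _tokens(feature)
    best_idx, best = None, 0.0
    for i, passage in enumerate(passages):
        overlap = len(feat & _tokens(passage)) / len(feat) if feat else 0.0
        if overlap > best:
            best_idx, best = i, overlap
    # Below the (assumed) threshold, treat the feature as not disclosed.
    if best >= threshold:
        return FeatureResult(feature, best_idx, best)
    return FeatureResult(feature, None, best)


def predict_novelty(claim: str, passages: list[str]) -> dict:
    """Assumed aggregation: the claim is novel if any feature lacks a supporting passage."""
    results = [analyze_feature(f, passages) for f in decompose_claim(claim)]
    novel_features = [r.feature for r in results if r.passage_index is None]
    return {"novel": bool(novel_features), "novel_features": novel_features, "results": results}
```

Calling `predict_novelty(claim, passages)` with a claim string and a list of prior art passages returns the claim-level decision together with the per-feature evidence, which is the kind of explainable output the feature-level formulation aims for. In a real system, the two stand-in functions would be replaced by model calls.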

Why it matters

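Novelty assessment is a critical but complex step in patent examination: examiners must match individual features of a claim to passages in a prior art document. Prior work mostly treats the task as binary classification at the claim level, which the authors argue is susceptible to spurious correlations and too coarse for practical use. This work offers a more granular, explainable formulation, and releases a dataset and code for transparent patent analysis.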

Original Abstract

Novelty assessment is a critical yet complex task in the examination process for patent acceptance, requiring examiners to determine whether an invention is disclosed in a prior art document. The process involves intricate matching between specific features of a patent claim and passages in the prior art. While prior work has approached novelty prediction primarily as a binary classification task at the claim level, we argue that this formulation is susceptible to spurious correlations and lacks the granularity required for practical application. In this work, we introduce FiNE-Patents (Fine-grained Novelty Examination of Patents), a novel dataset comprising 3,658 first patent claims annotated with fine-grained, feature-level prior art references extracted from European Search Opinion (ESOP) documents. We propose shifting the evaluation paradigm from simple binary classification to a joint retrieval and abstract reasoning task at the feature level, requiring models to identify specific passages from a prior art document that disclose individual claim features, and to identify which features of a claim make it novel. We implement and evaluate LLM-based workflows that decompose claims into features, analyze each feature against prior art, and finally derive a claim-level novelty prediction. Our experiments demonstrate that these workflows outperform embedding-based baselines on passage retrieval and novel feature identification. Furthermore, we show that unlike trained classifiers, LLMs are robust against spurious correlations present in the claim-level novelty classification task. We release the dataset and code to foster further research into transparent and granular patent analysis.