Structure Liberates: How Constrained Sensemaking Produces More Novel Research Output


May 1, 2026 · arXiv:2605.00557

James Mooney, Zae Myung Kim, Young-Jun Lee, Dongyeop Kang

cs.CL, cs.AI

TLDR

Training LLMs on structured, target-grounded ideation trajectories yields more novel and higher-quality research output than looser supervision.

Key contributions

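Introduces SCISENSE, a sensemaking-grounded framework that models scientific ideation as a structured sequence of eight cognitive stages.
Constructs SCISENSE-Traj, a 100K-scale dataset of citation-conditioned research trajectories in two modes, Target and Infer.
Distills these trajectories into SCISENSE-LM, a family of sensemaking LLMs spanning 3B to 70B parameters.
Shows that Target-trained models achieve a 2.0% improvement in trajectory quality over Infer-trained models while producing more novel and diverse outputs, and that coding agents conditioned on Target trajectories produce research artifacts with higher executability and quality.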

Why it matters

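The results run against the assumption that looser supervision promotes greater exploration: models trained to reconstruct the ideation path to a known paper produce more novel and diverse output than models trained to propose open-ended directions. The paper also offers a practical tool for LLM-driven research workflows and a testbed for studying how planning shapes scientific discovery.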

Original Abstract

Scientific discovery is an extended process of ideation--surveying prior work, forming hypotheses, and refining reasoning--yet existing approaches treat this phase as a brief preamble despite its central role in research. We introduce SCISENSE, a sensemaking-grounded framework that operationalizes ideation as a structured sequence of eight cognitive stages (Pirolli & Card, 2005). We construct SCISENSE-Traj, a 100K-scale dataset of citation-conditioned research trajectories in two modes: Target, where an LLM reconstructs the ideation path leading to a known paper from its cited works, and Infer, where the LLM proposes novel directions from the same citations. We distill these into SCISENSE-LM, a family of sensemaking LLMs spanning 3B to 70B parameters. Contrary to the assumption that looser supervision promotes greater exploration, Target-trained models achieve a 2.0% improvement in trajectory quality over Infer-trained models while also producing more novel and diverse outputs. This advantage propagates downstream: coding agents conditioned on Target trajectories produce research artifacts with higher executability and quality than those conditioned on Infer trajectories. This suggests that targeted ideation reduces cognitive burden on downstream agents, freeing them to explore more creatively. SCISENSE offers both a practical tool for augmenting LLM-driven research workflows and a principled testbed for studying how planning shapes scientific discovery.
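
To make the Target/Infer distinction concrete, here is a minimal sketch of how one citation-conditioned trajectory record might be represented. It is an illustration only, not the SCISENSE-Traj schema: the class and field names (Trajectory, cited_works, stages, target_paper) are hypothetical, and because the abstract does not name the eight stages, each stage is stored as an unlabeled text entry.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional

# The abstract states eight cognitive stages; their names are not given there,
# so this sketch only checks the count.
NUM_STAGES = 8


class Mode(Enum):
    TARGET = "target"  # reconstruct the ideation path leading to a known paper
    INFER = "infer"    # propose a novel direction from the same citations


@dataclass
class Trajectory:
    """Hypothetical record layout, assumed for illustration only."""
    mode: Mode
    cited_works: List[str]              # assumed: titles or IDs of the cited works
    stages: List[str]                   # assumed: one text entry per stage, in order
    target_paper: Optional[str] = None  # assumed: set only in Target mode

    def __post_init__(self) -> None:
        if len(self.stages) != NUM_STAGES:
            raise ValueError(f"expected {NUM_STAGES} stages, got {len(self.stages)}")
        if self.mode is Mode.TARGET and self.target_paper is None:
            raise ValueError("Target mode needs the known paper being reconstructed")
        if self.mode is Mode.INFER and self.target_paper is not None:
            raise ValueError("Infer mode has no known target paper")


# Both modes are conditioned on the same citations; only the goal differs.
citations = ["cited work A", "cited work B"]
placeholder_stages = [f"stage {i + 1} text" for i in range(NUM_STAGES)]

target = Trajectory(Mode.TARGET, citations, placeholder_stages, target_paper="known paper")
infer = Trajectory(Mode.INFER, citations, placeholder_stages)
```

In this sketch the two modes share the same conditioning input and differ only in whether a known paper anchors the trajectory, which mirrors the contrast the abstract draws between reconstructing a path and proposing a new one.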
