Natural Language Processing
Research on language models, text understanding, generation, and computational linguistics.
cs.CL · 805 papers

Trajectory as the Teacher: Few-Step Discrete Flow Matching via Energy-Navigated Distillation
TS-DFM improves discrete flow matching by guiding trajectory generation with an energy compass, achieving 128x faster text generation.
CoCoReviewBench: A Completeness- and Correctness-Oriented Benchmark for AI Reviewers
CoCoReviewBench is a new benchmark for AI reviewers, focusing on completeness and correctness by curating 3,900 papers with expert annotations.
Beyond "I cannot fulfill this request": Alleviating Rigid Rejection in LLMs via Label Enhancement
LANCE introduces a label enhancement method using variational inference to enable LLMs to provide safe yet flexible and natural responses, avoiding rigid rejections.
KL for a KL: On-Policy Distillation with Control Variate Baseline
vOPD stabilizes on-policy distillation for LLMs by applying an RL-style control variate baseline, improving reasoning performance with little added cost.
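Control variate baselines are a standard variance-reduction tool in RL, not specific to this paper. A minimal NumPy sketch (generic, not the vOPD method itself) showing how subtracting a constant baseline from the payoff leaves a score-function gradient estimate unbiased while shrinking its variance:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, n = 1.0, 100_000

# Score-function (REINFORCE) gradient of E_{x~N(mu,1)}[f(x)] w.r.t. mu,
# estimated per sample as f(x) * d/dmu log p(x) = f(x) * (x - mu).
x = rng.normal(mu, 1.0, size=n)
f = x**2                     # toy objective f(x) = x^2; true gradient is 2*mu
score = x - mu               # d/dmu log N(x; mu, 1)

grad_plain = f * score               # vanilla estimator
b = f.mean()                         # simple constant baseline (control variate)
grad_cv = (f - b) * score            # baseline-subtracted estimator

# Both estimators are unbiased for the true gradient 2*mu = 2.0
# (E[b * score] = 0), but the baseline version has much lower variance.
print(grad_plain.mean(), grad_cv.mean())
print(grad_plain.var(), grad_cv.var())
```

The same principle underlies learned value-function baselines in policy-gradient methods: any term with zero expected product with the score can be subtracted for free.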
MatryoshkaLoRA: Learning Accurate Hierarchical Low-Rank Representations for LLM Fine-Tuning
MatryoshkaLoRA introduces a novel framework for LLM fine-tuning, enabling accurate hierarchical low-rank representations and dynamic rank selection.
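"Matryoshka" here suggests nested representations, where a small-rank adapter is a prefix of a larger one. A hedged NumPy sketch (illustrating the general nesting idea via truncated SVD, not the MatryoshkaLoRA training method) of how rank-r factors slice out of the same rank-R factorization, with reconstruction error shrinking as the rank prefix grows:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(64, 64))   # weight update to approximate at low rank

# SVD yields nested low-rank factors: the best rank-r approximation
# uses the first r singular triplets, a prefix of the rank-R solution.
U, s, Vt = np.linalg.svd(W, full_matrices=False)

def approx(r):
    # Slice the SAME factors to rank r (Matryoshka-style nesting):
    # no retraining needed to move between ranks.
    return (U[:, :r] * s[:r]) @ Vt[:r, :]

errs = [np.linalg.norm(W - approx(r)) for r in (4, 8, 16, 32)]
# Error decreases monotonically as more of the shared factors are used.
print([round(e, 2) for e in errs])
```

This is what makes dynamic rank selection attractive at deployment time: one stored factorization serves many compute budgets.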
Measuring and Mitigating the Distributional Gap Between Real and Simulated User Behaviors
This paper introduces a method to measure the distributional gap between real and simulated user behaviors, evaluating 24 LLM-based simulators.
SCENE: Recognizing Social Norms and Sanctioning in Group Chats
SCENE is a new benchmark for evaluating LLMs' ability to recognize and adapt to implicit social norms and sanctions in group chats.
TRACE: Tourism Recommendation with Accountable Citation Evidence
TRACE introduces a new dataset and benchmark for conversational tourism recommender systems, focusing on verifiable evidence and rejection recovery.
InterLV-Search: Benchmarking Interleaved Multimodal Agentic Search
InterLV-Search is a new benchmark for interleaved language-vision agentic search, revealing current multimodal agents struggle with complex visual evidence integration.
TCMIIES: A Browser-Based LLM-Powered Intelligent Information Extraction System for Academic Literature
TCMIIES is a browser-based, zero-installation system leveraging commercial LLMs for privacy-preserving, schema-guided information extraction from academic literature.
DiffRetriever: Parallel Representative Tokens for Retrieval with Diffusion Language Models
DiffRetriever uses diffusion language models to generate multiple representative tokens in parallel, significantly improving retrieval performance over sequential autoregressive methods.
Topic Is Not Agenda: A Citation-Community Audit of Text Embeddings
Text embeddings fail to capture fine-grained research agendas, leading to 80% off-agenda retrievals in scientific RAG.
Bridging Textual Profiles and Latent User Embeddings for Personalization
BLUE unifies interpretable textual user profiles with discriminative latent embeddings using reinforcement learning for personalized recommendations.
From Surface Learning to Deep Understanding: A Grounded AI Tutoring System for Moodle
A Moodle plugin combines RAG and LLMs for Socratic tutoring and educator content generation, grounding responses in course material to curb hallucinations.
EMO: Pretraining Mixture of Experts for Emergent Modularity
EMO is a new Mixture-of-Experts model that achieves emergent modularity, allowing efficient selective expert use for memory-constrained LLM deployment.
Verifier-Backed Hard Problem Generation for Mathematical Reasoning
VHG is a novel verifier-enhanced framework for generating valid and challenging mathematical problems for LLMs, outperforming existing methods.
When No Benchmark Exists: Validating Comparative LLM Safety Scoring Without Ground-Truth Labels
This paper introduces a method for validating comparative LLM safety scoring without ground-truth labels, using an instrumental-validity chain.
Beyond Negative Rollouts: Positive-Only Policy Optimization with Implicit Negative Gradients
POPO is a novel RLVR framework for LLMs that learns exclusively from positive rollouts, achieving strong performance by implicitly deriving negative gradients.
StraTA: Incentivizing Agentic Reinforcement Learning with Strategic Trajectory Abstraction
StraTA introduces strategic trajectory abstraction to agentic RL, improving LLM performance in long-horizon tasks by enhancing exploration and credit assignment.
Recursive Agent Optimization
Recursive Agent Optimization (RAO) trains agents to recursively delegate sub-tasks, enabling them to scale and generalize more effectively.