ArXiv TLDR

Philip Torr

6 papers · Latest:

Computer Vision

ActCam: Zero-Shot Joint Camera and 3D Motion Control for Video Generation

ActCam enables zero-shot joint 3D motion and camera control for video generation, improving fidelity and camera adherence with staged guidance.

2605.06667
Natural Language Processing

StraTA: Incentivizing Agentic Reinforcement Learning with Strategic Trajectory Abstraction

StraTA introduces strategic trajectory abstraction to agentic RL, improving LLM performance in long-horizon tasks by enhancing exploration and credit assignment.

2605.06642
Artificial Intelligence

Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond

This paper introduces a "levels x laws" taxonomy for agentic world models, synthesizing over 400 works and outlining a roadmap for future development.

2604.22748
Machine Learning

LongCoT: Benchmarking Long-Horizon Chain-of-Thought Reasoning

LongCoT is a new benchmark with 2,500 expert-designed problems to measure long-horizon chain-of-thought reasoning in frontier language models.

2604.14140
Computer Vision

ActionParty: Multi-Subject Action Binding in Generative Video Games

ActionParty is a new video world model that enables multi-subject action control in generative video games by disentangling subject states.

2604.02330
Computer Vision

Res2Net: A New Multi-scale Backbone Architecture

Res2Net introduces a novel CNN building block that enhances multi-scale feature representation within a single residual block, improving performance across various vision tasks.

1904.01169
