Ge Li

3 papers · Latest: April 24, 2026

RealBench: A Repo-Level Code Generation Benchmark Aligned with Real-World Software Development Practices

RealBench is a new benchmark for repo-level code generation, using structured designs (UML) to better align LLM evaluation with real-world software development.

2604.22659 · Apr 24, 2026

Software Engineering

Dependency-Guided Repository-Level C-to-Rust Translation with Reinforcement Alignment

DepTrans is a new framework that automates C-to-Rust code migration using reinforcement learning and dependency-guided refinement, achieving high accuracy.

2604.02852 · Apr 3, 2026

Natural Language Processing

Evaluating the Formal Reasoning Capabilities of Large Language Models through Chomsky Hierarchy

ChomskyBench evaluates LLM formal reasoning across the Chomsky Hierarchy, revealing performance stratification and severe efficiency barriers for complex tasks.

2604.02709 · Apr 3, 2026
