Ge Li
3 papers · Latest:
Software Engineering
RealBench: A Repo-Level Code Generation Benchmark Aligned with Real-World Software Development Practices
RealBench is a new benchmark for repo-level code generation, using structured designs (UML) to better align LLM evaluation with real-world software development.
2604.22659
Software Engineering
Dependency-Guided Repository-Level C-to-Rust Translation with Reinforcement Alignment
DepTrans is a new framework that automates C-to-Rust code migration using reinforcement learning and dependency-guided refinement, achieving high accuracy.
2604.02852
Natural Language Processing
Evaluating the Formal Reasoning Capabilities of Large Language Models through Chomsky Hierarchy
ChomskyBench evaluates LLM formal reasoning across the Chomsky Hierarchy, revealing performance stratification and severe efficiency barriers for complex tasks.
2604.02709