Hailiang Huang
2 papers · Latest:
Natural Language Processing
FinSafetyBench: Evaluating LLM Safety in Real-World Financial Scenarios
FinSafetyBench is a new bilingual red-teaming benchmark evaluating LLM safety and compliance in real-world financial scenarios, revealing vulnerabilities.
2605.00706
Software Engineering
Cascaded Code Editing: Large-Small Model Collaboration for Effective and Efficient Code Editing
This paper proposes Cascaded Code Editing, in which a large model generates an edit sketch and a small model applies it efficiently.
2604.19201