Fazle Rabbi

3 papers · Latest: May 4, 2026

HEJ-Robust: A Robustness Benchmark for LLM-Based Automated Program Repair

The HEJ-Robust benchmark reveals that LLM-based program repair models lack robustness to minor syntactic variations, with performance drops of over 50%.

2605.02215 · May 4, 2026

Software Engineering

Beyond Translation Accuracy: Addressing False Failures in LLM-Based Code Translation

Many reported LLM code translation failures are false negatives caused by evaluation setup, not logical errors, demanding better evaluation standards.

2605.02195 · May 4, 2026

Software Engineering

Social Bias in LLM-Generated Code: Benchmark and Mitigation

LLM-generated code has severe social bias. A new Fairness Monitor Agent reduces bias by 65% and improves functional correctness without modifying pipelines.

2605.00382 · May 1, 2026
