Fazle Rabbi
3 papers · Latest:
Software Engineering
HEJ-Robust: A Robustness Benchmark for LLM-Based Automated Program Repair
HEJ-Robust benchmark reveals LLM-based program repair models lack robustness to minor syntactic variations, with performance drops over 50%.
2605.02215
Software Engineering
Beyond Translation Accuracy: Addressing False Failures in LLM-Based Code Translation
Many reported LLM code translation failures are false negatives caused by evaluation setup, not logical errors, demanding better evaluation standards.
2605.02195
Software Engineering
Social Bias in LLM-Generated Code: Benchmark and Mitigation
LLM-generated code has severe social bias. A new Fairness Monitor Agent reduces bias by 65% and improves functional correctness without modifying pipelines.
2605.00382