Joey Chan

2 papers · Latest: May 12, 2026

MedHopQA: A Disease-Centered Multi-Hop Reasoning Benchmark and Evaluation Framework for LLM-Based Biomedical Question Answering

MedHopQA is a new disease-centered multi-hop reasoning benchmark for evaluating LLMs in biomedical QA, designed to resist saturation and contamination.

2605.12361May 12, 2026

Natural Language Processing

Overview of the MedHopQA track at BioCreative IX: track description, participation and evaluation of systems for multi-hop medical question answering

The MedHopQA track benchmarked LLMs on multi-hop medical QA with a new 1,000-pair dataset, highlighting RAG's importance for strong performance.

2605.12313May 12, 2026

📬 Weekly AI Paper Digest

Get the top 10 AI/ML arXiv papers from the week — summarized, scored, and delivered to your inbox every Monday.