Joey Chan
2 papers · Latest:
Natural Language Processing
MedHopQA: A Disease-Centered Multi-Hop Reasoning Benchmark and Evaluation Framework for LLM-Based Biomedical Question Answering
MedHopQA is a new disease-centered multi-hop reasoning benchmark for evaluating LLMs in biomedical QA, designed to resist saturation and contamination.
2605.12361
Natural Language Processing
Overview of the MedHopQA track at BioCreative IX: track description, participation and evaluation of systems for multi-hop medical question answering
The MedHopQA track benchmarked LLMs on multi-hop medical QA with a new 1,000-pair dataset, highlighting RAG's importance for strong performance.
2605.12313