Joseph E. Gonzalez
4 papers · Latest:
Learning, Fast and Slow: Towards LLMs That Adapt Continually
Fast-Slow Training enables LLMs to adapt continually, with improved efficiency and less forgetting, by combining fast context updates with slow parameter updates.
Efficient Memory Management for Large Language Model Serving with PagedAttention
PagedAttention introduces a virtual memory-inspired method to efficiently manage key-value cache memory in large language model serving, significantly boosting throughput and reducing memory waste.
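The virtual-memory analogy can be made concrete with a tiny sketch: the KV cache is split into fixed-size blocks, and a per-sequence block table maps logical block indices to physical blocks, much like a page table. All names below are illustrative, not the actual vLLM API.

```python
# Illustrative sketch of the PagedAttention block-table idea (not vLLM's API).
BLOCK_SIZE = 4  # tokens per KV block (hypothetical value)

class BlockAllocator:
    """Hands out physical KV blocks from a shared free pool."""
    def __init__(self, num_blocks):
        self.free = list(range(num_blocks))

    def alloc(self):
        return self.free.pop()

class SequenceKVCache:
    """Per-sequence view: a block table mapping logical blocks to physical ones."""
    def __init__(self, allocator):
        self.allocator = allocator
        self.block_table = []  # logical block index -> physical block id
        self.num_tokens = 0

    def append_token(self):
        # Allocate a new physical block only when the last one is full,
        # so per-sequence waste is bounded by at most one partial block.
        if self.num_tokens % BLOCK_SIZE == 0:
            self.block_table.append(self.allocator.alloc())
        self.num_tokens += 1

    def physical_slot(self, token_idx):
        # Translate a logical token position into (physical block, offset),
        # analogous to virtual-to-physical address translation.
        block = self.block_table[token_idx // BLOCK_SIZE]
        return block, token_idx % BLOCK_SIZE

allocator = BlockAllocator(num_blocks=8)
seq = SequenceKVCache(allocator)
for _ in range(6):
    seq.append_token()
print(len(seq.block_table))  # 6 tokens fit in 2 blocks of size 4
```

Because blocks need not be contiguous, sequences grow on demand and freed blocks return to the shared pool, which is the source of the throughput and memory-waste gains the summary describes.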
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
This paper demonstrates that strong large language models like GPT-4 can effectively serve as judges to evaluate other LLM-based chat assistants, closely matching human preferences on open-ended tasks.
Gorilla: Large Language Model Connected with Massive APIs
Gorilla is a finetuned LLaMA-based model that outperforms GPT-4 in generating accurate API calls by integrating document retrieval to reduce hallucinations and adapt to evolving API documentation.