Joseph E. Gonzalez
4 papers · Latest:
Learning, Fast and Slow: Towards LLMs That Adapt Continually
Fast-Slow Training enables LLMs to adapt continually, with improved efficiency and less forgetting, by combining fast context updates with slow parameter updates.
Efficient Memory Management for Large Language Model Serving with PagedAttention
PagedAttention introduces a virtual memory-inspired method to efficiently manage key-value cache memory in large language model serving, significantly boosting throughput and reducing memory waste.
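The virtual-memory analogy can be made concrete with a tiny sketch: the KV cache is split into fixed-size blocks, and a per-sequence block table maps logical block indices to physical blocks, much like a page table. All names below are illustrative, not the actual vLLM API.

```python
# Illustrative sketch of the PagedAttention block-table idea (not vLLM's API).
BLOCK_SIZE = 4  # tokens per KV block (hypothetical value)

class BlockAllocator:
    """Hands out physical KV blocks from a shared free pool."""
    def __init__(self, num_blocks):
        self.free = list(range(num_blocks))

    def alloc(self):
        return self.free.pop()

class SequenceKVCache:
    """Per-sequence view: a block table mapping logical blocks to physical ones."""
    def __init__(self, allocator):
        self.allocator = allocator
        self.block_table = []  # logical block index -> physical block id
        self.num_tokens = 0

    def append_token(self):
        # Allocate a new physical block only when the last one is full,
        # so per-sequence waste is bounded by at most one partial block.
        if self.num_tokens % BLOCK_SIZE == 0:
            self.block_table.append(self.allocator.alloc())
        self.num_tokens += 1

    def physical_slot(self, token_idx):
        # Translate a logical token position into (physical block, offset),
        # analogous to virtual-to-physical address translation.
        block = self.block_table[token_idx // BLOCK_SIZE]
        return block, token_idx % BLOCK_SIZE

allocator = BlockAllocator(num_blocks=8)
seq = SequenceKVCache(allocator)
for _ in range(6):
    seq.append_token()
print(len(seq.block_table))  # 6 tokens fit in 2 blocks of size 4
```

Because blocks need not be contiguous, sequences grow on demand and freed blocks return to the shared pool, which is the source of the throughput and memory-waste gains the summary describes.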
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
This paper demonstrates that strong large language models like GPT-4 can effectively serve as judges to evaluate other LLM-based chat assistants, closely matching human preferences on open-ended tasks.
Gorilla: Large Language Model Connected with Massive APIs
Gorilla is a finetuned LLaMA-based model that outperforms GPT-4 in generating accurate API calls by integrating document retrieval to reduce hallucinations and adapt to evolving API documentation.