Jacob Hilton
3 papers · Latest:
Machine Learning
Estimating the expected output of wide random MLPs more efficiently than sampling
This paper introduces a novel method to estimate the expected output of wide random MLPs without sampling, using cumulants and Hermite expansions.
2605.05179
Natural Language Processing
Training language models to follow instructions with human feedback
This paper presents InstructGPT, a method to align language models with user intent by fine-tuning GPT-3 using human feedback, resulting in more truthful, helpful, and less toxic outputs.
2203.02155
Natural Language Processing
WebGPT: Browser-assisted question-answering with human feedback
WebGPT fine-tunes GPT-3 to answer complex questions by browsing the web and using human feedback to improve factual accuracy and answer quality.
2112.09332