Yang Song
3 papers · Latest:
Aligning Language Models for Lyric-to-Melody Generation with Rule-Based Musical Constraints
This paper introduces a novel alignment framework using rule-based musical constraints and DPO/KTO to improve LLM-generated melodies, reducing musical implausibility.
2604.18489
Natural Language Processing
Gemini: A Family of Highly Capable Multimodal Models
Gemini is a new family of multimodal AI models excelling in image, audio, video, and text understanding, achieving state-of-the-art results across numerous benchmarks including human-expert level on MMLU.
2312.11805
Natural Language Processing
GPT-4 Technical Report
GPT-4 is a large-scale multimodal Transformer model achieving human-level performance on professional and academic benchmarks through advanced training and alignment techniques.
2303.08774