ArXiv TLDR

Research Topics

Browse papers by the topics people actually search for, such as LLMs, diffusion, RAG, and agents. These topics cut across the arXiv taxonomy.

Large Language Models (LLMs)

Research on large language models — pretraining, scaling, evaluation, alignment, and inference.

Transformers

Papers on transformer architectures, attention mechanisms, and their variants.

Diffusion Models

Generative diffusion models for images, video, audio, and 3D — DDPM, score-based, latent diffusion, and beyond.

Retrieval-Augmented Generation (RAG)

RAG systems — combining retrieval with generative models for grounded, up-to-date answers.

Mixture of Experts (MoE)

Sparse and dense mixture-of-experts architectures — routing, capacity, and efficient scaling.

RLHF & Preference Learning

Reinforcement learning from human feedback, DPO, and other preference-based alignment methods.

Fine-Tuning & LoRA

Efficient adaptation of pretrained models — LoRA, adapters, prefix tuning, and supervised fine-tuning.

Vision Transformers (ViT)

Transformer architectures applied to vision — ViT, DeiT, Swin, and downstream visual recognition.

AI Agents

Autonomous LLM-powered agents — planning, tool use, multi-step reasoning, and benchmarks.

Reasoning & Chain-of-Thought

Reasoning in language models — chain-of-thought, tree-of-thought, self-consistency, and step-level supervision.

Multimodal Models

Vision-language and multimodal foundation models — CLIP, LLaVA, VLMs, and cross-modal grounding.

World Models

Learned world models for planning, control, and embodied agents — Dreamer, JEPA, and successors.

Scaling Laws

Empirical scaling laws for model size, compute, and data — Chinchilla, emergent capabilities, and predictability.

In-Context Learning

Few-shot and in-context learning in LLMs — prompting, demonstration selection, and mechanistic explanations.

AI Alignment & Safety

Alignment, safety, interpretability, and red-teaming for frontier AI systems.

Quantization & Model Compression

Post-training and quantization-aware model compression, including INT8/INT4 quantization, GPTQ, AWQ, and pruning.

Knowledge Distillation

Distilling large teacher models into smaller, faster students for deployment.

Reinforcement Learning

Reinforcement learning — policy gradients, model-based RL, offline RL, and continuous control.

Graph Neural Networks (GNNs)

GNNs and message passing, including graph attention, GCNs, and applications in chemistry, biology, and recommendation.

Robotic Manipulation & Embodied AI

Learning-based robotic manipulation, dexterity, and embodied agents in physical and simulated worlds.

Code Generation & AI for Code

LLM-based code generation, code completion, debugging, and program synthesis.

Speech & Audio Models

Speech recognition, text-to-speech, music generation, and audio foundation models.

3D Generation & NeRF

3D scene reconstruction and synthesis — NeRF, Gaussian Splatting, and text-to-3D.

Mechanistic Interpretability

Reverse-engineering neural networks — circuits, features, sparse autoencoders, and probing.

LLM Evaluation & Benchmarks

Benchmarks, LLM-as-judge, evaluation frameworks, and methodology for measuring model capability.

Long Context Modeling

Extending context windows — efficient attention, position encodings, retrieval, and long-document tasks.

Tool Use & Function Calling

LLMs calling external tools, APIs, and functions — ReAct, function calling, and tool-augmented reasoning.

Synthetic Data Generation

Generating synthetic training data — distillation from LLMs, data augmentation, and self-improvement loops.

State Space Models (Mamba & SSMs)

Selective state space models and linear-time sequence architectures — Mamba, S4, and SSM variants that rival transformers on long sequences.

Test-Time Compute & Inference Scaling

Scaling reasoning at inference time — test-time compute, search, and deliberate o1-style inference for harder problems.

Federated Learning

Privacy-preserving distributed training across decentralized data — federated optimization, aggregation, and on-device learning.

Continual & Lifelong Learning

Continual, lifelong, and incremental learning methods that acquire new tasks while resisting catastrophic forgetting.

Text Embeddings & Dense Retrieval

Learned text and sentence embeddings for semantic search and dense retrieval — contrastive representation learning and embedding models.

Watermarking & AI Content Provenance

Watermarking and provenance for AI-generated text and images — embedding, detection, robustness, and attribution of model outputs.

AI for Science

Machine learning for scientific discovery — protein structure, molecular and materials modeling, and drug discovery.

Time Series & Forecasting

Time-series modeling and forecasting — deep and foundation models for temporal data, plus anomaly detection.

Video Generation

Generative models for video — text-to-video diffusion, world simulators, and temporally consistent synthesis.

Speculative Decoding

Accelerating LLM inference with speculative and parallel decoding — draft models, verification, and serving speedups.

Model Merging

Combining trained models without retraining — weight averaging, task arithmetic, and merging fine-tuned checkpoints.

Prompt Engineering

Prompting methods for LLMs — prompt design, optimization, chain-of-thought prompting, and automatic prompt search.

Jailbreaks & Red-Teaming

Adversarial attacks on LLMs — jailbreaks, prompt injection, red-teaming, and safety robustness evaluation.

Self-Supervised Learning

Learning representations without labels — pretext tasks, masked modeling, and self-supervised pretraining.

Neural Radiance Fields (NeRF)

Neural scene representations for novel-view synthesis — NeRF and 3D reconstruction.

LoRA & Parameter-Efficient Tuning

Parameter-efficient fine-tuning of large models with LoRA, other low-rank adaptation methods, and adapters.

Vision-Language Models (VLMs)

Models that jointly understand images and text — CLIP, captioning, visual question answering, and multimodal LLMs.

Object Detection

Detecting and localizing objects in images and video — DETR, YOLO, and modern detection architectures.

Gaussian Splatting

3D Gaussian Splatting for real-time radiance-field rendering, 3D reconstruction, and novel-view synthesis.

Image Segmentation

Pixel-level scene understanding — semantic, instance, and panoptic segmentation.

Recommender Systems

Recommendation and personalization — collaborative filtering, sequential and LLM-based recommenders, and ranking.

Autonomous Driving

Self-driving perception, planning, and control — autonomous vehicles, end-to-end driving, and trajectory prediction.

Anomaly Detection

Detecting outliers and out-of-distribution (OOD) inputs, covering anomaly, OOD, and novelty detection.

Contrastive Learning

Self-supervised representation learning by contrasting positive and negative pairs — SimCLR, MoCo, and InfoNCE.

Medical Imaging

Deep learning for medical images — segmentation, diagnosis, and analysis of CT, MRI, X-ray, and pathology data.

Protein Structure & Design

Computational protein science — structure prediction, protein folding, and generative protein design.