ArXiv TLDR

LLM-Rosetta: A Hub-and-Spoke Intermediate Representation for Cross-Provider LLM API Translation

2604.09360

Peng Ding

cs.SE cs.AI

TLDR

LLM-Rosetta introduces a hub-and-spoke IR to translate between diverse LLM APIs, enabling portability and multi-provider architectures with minimal overhead.

Key contributions

  • Introduces a hub-and-spoke Intermediate Representation (IR) whose 9-type content model captures the shared semantic core of LLM APIs.
  • Enables bidirectional conversion for requests and responses, including stateful chunk-level streaming.
  • Ships converters for four API standards (OpenAI Chat Completions, OpenAI Responses, Anthropic Messages, and Google GenAI), covering most commercial providers.
  • Achieves lossless round-trip fidelity and sub-100-microsecond conversion overhead, competitive with existing single-pass solutions such as LiteLLM.
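The hub-and-spoke idea above can be sketched in a few lines. This is a minimal illustration, not LLM-Rosetta's actual API: the `IRMessage` class, converter function names, and single-text-part content model are all hypothetical stand-ins for the paper's 9-type content model.

```python
from dataclasses import dataclass

# Hypothetical minimal IR: the real framework uses a 9-type content model;
# a single text field stands in here for illustration.
@dataclass
class IRMessage:
    role: str
    text: str

# Each API format needs only a to-IR and a from-IR "spoke" converter.
def openai_to_ir(msg: dict) -> IRMessage:
    return IRMessage(role=msg["role"], text=msg["content"])

def ir_to_openai(ir: IRMessage) -> dict:
    return {"role": ir.role, "content": ir.text}

def anthropic_to_ir(msg: dict) -> IRMessage:
    # Anthropic Messages carries text inside a list of content blocks.
    return IRMessage(role=msg["role"], text=msg["content"][0]["text"])

def ir_to_anthropic(ir: IRMessage) -> dict:
    return {"role": ir.role, "content": [{"type": "text", "text": ir.text}]}

# Hub-and-spoke translation: source format -> IR -> target format.
def translate(msg: dict, to_ir, from_ir) -> dict:
    return from_ir(to_ir(msg))

openai_msg = {"role": "user", "content": "Hello"}
anthropic_msg = translate(openai_msg, openai_to_ir, ir_to_anthropic)
round_trip = translate(anthropic_msg, anthropic_to_ir, ir_to_openai)
assert round_trip == openai_msg  # lossless round trip in this toy case
```

Because every converter targets the IR rather than another provider, adding a fifth API format means writing one new converter pair, not four new bilateral adapters.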

Why it matters

The fragmentation of LLM APIs creates significant vendor lock-in and integration challenges. LLM-Rosetta offers an open-source universal translation layer that lets developers build portable applications, switch providers easily, and pursue multi-provider strategies without maintaining O(N^2) bilateral adapters.
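The O(N^2)-versus-O(N) claim is simple counting, sketched below; the function names are illustrative, not from the paper.

```python
# Bilateral adapters: every ordered provider pair needs its own adapter.
def bilateral_adapters(n: int) -> int:
    return n * (n - 1)

# Hub-and-spoke: one to-IR and one from-IR converter per provider.
def hub_spoke_converters(n: int) -> int:
    return 2 * n

# With the paper's four API standards: 12 bilateral adapters vs 8 converters.
print(bilateral_adapters(4), hub_spoke_converters(4))   # 12 8
# At 10 providers the gap widens to 90 vs 20.
print(bilateral_adapters(10), hub_spoke_converters(10))  # 90 20
```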

Original Abstract

The rapid proliferation of Large Language Model (LLM) providers--each exposing proprietary API formats--has created a fragmented ecosystem where applications become tightly coupled to individual vendors. Switching or bridging providers requires $O(N^2)$ bilateral adapters, impeding portability and multi-provider architectures. We observe that despite substantial syntactic divergence, the major LLM APIs share a common semantic core: the practical challenge is the combinatorial surface of syntactic variations, not deep semantic incompatibility. Based on this finding, we present LLM-Rosetta, an open-source translation framework built on a hub-and-spoke Intermediate Representation (IR) that captures the shared semantic core--messages, content parts, tool calls, reasoning traces, and generation controls--in a 9-type content model and 10-type stream event schema. A modular Ops-composition converter architecture enables each API standard to be added independently. LLM-Rosetta supports bidirectional conversion (provider-to-IR-to-provider) for both request and response payloads, including chunk-level streaming with stateful context management. We implement converters for four API standards (OpenAI Chat Completions, OpenAI Responses, Anthropic Messages, and Google GenAI), covering the vast majority of commercial providers. Empirical evaluation demonstrates lossless round-trip fidelity, correct streaming behavior, and sub-100 microsecond conversion overhead--competitive with LiteLLM's single-pass approach while providing bidirectionality and provider neutrality. LLM-Rosetta passes the Open Responses compliance suite and is deployed in production at Argonne National Laboratory. Code is available at https://github.com/Oaklight/llm-rosetta.
