ArXiv TLDR

KG-First, LLM-Fallback: A Hybrid Microservice for Grounded Skill Search and Explanation

arXiv:2605.01582

Ngoc Luyen Le, Marie-Hélène Abel, Bertrand Laforge

cs.IR · cs.AI

TLDR

SkillGraph-Service unifies complex competency frameworks into a KG, using a KG-first, LLM-fallback approach for efficient skill search and explanation.
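The KG-first, LLM-fallback split can be pictured as a simple dispatch rule: answer from the graph whenever it has a confident, grounded hit, and consult the LLM only otherwise. A minimal sketch of that control flow, assuming hypothetical `kg_search` and `llm_answer` callables and an invented confidence threshold (none of these names come from the paper):

```python
# A minimal sketch of KG-first, LLM-fallback dispatch; `kg_search`,
# `llm_answer`, and `min_score` are invented stand-ins, not the paper's API.

def answer(query: str, kg_search, llm_answer, min_score: float = 0.5) -> dict:
    """Serve from the Knowledge Graph when it has a confident hit;
    consult the LLM only when the KG comes up empty or uncertain."""
    hits = kg_search(query)  # grounded, provenance-preserving lookup
    if hits and hits[0]["score"] >= min_score:
        # KG path: results stay auditable and carry framework citations.
        return {"source": "kg", "results": hits}
    # Fallback path: sub-symbolic flexibility for hard or mismatched queries.
    return {"source": "llm", "results": llm_answer(query)}
```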

Key contributions

  • Unifies diverse competency frameworks (ESCO, O*NET) into a provenance-preserving Knowledge Graph.
  • Implements a hybrid retrieval engine (SQLite FTS5 + HNSW vector search) for efficient, low-latency skill search; a fusion sketch follows this list.
  • Uses LLMs for constrained ranking and audience-aware explanations, balancing fluency and faithfulness.
  • Achieves high retrieval effectiveness (nDCG@5 > 0.94) with sub-200 ms latency.
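A rough illustration of how the two retrieval legs could be combined: SQLite FTS5 supplies the lexical ranking and hnswlib supplies the HNSW vector ranking. Reciprocal rank fusion is an assumption here, since the abstract says the signals are "fused" without naming the rule, and the labels and embeddings below are toy placeholders:

```python
import sqlite3

import hnswlib
import numpy as np

# Lexical leg: an FTS5 virtual table over skill labels.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE skills USING fts5(label)")
labels = ["data analysis", "machine learning", "curriculum design"]
db.executemany("INSERT INTO skills(label) VALUES (?)", [(l,) for l in labels])

def fts_rank(query: str, k: int = 10) -> list[int]:
    """Return 0-based label indices ordered by FTS5's BM25 rank."""
    rows = db.execute(
        "SELECT rowid FROM skills WHERE skills MATCH ? ORDER BY rank LIMIT ?",
        (query, k),
    ).fetchall()
    return [r[0] - 1 for r in rows]  # rowids start at 1

# Semantic leg: an HNSW index over placeholder embeddings.
dim = 64
rng = np.random.default_rng(0)
vecs = rng.random((len(labels), dim), dtype=np.float32)  # stand-in embeddings
index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=len(labels))
index.add_items(vecs, np.arange(len(labels)))

def hnsw_rank(query_vec: np.ndarray, k: int = 10) -> list[int]:
    ids, _ = index.knn_query(query_vec, k=min(k, len(labels)))
    return [int(i) for i in ids[0]]

def rrf_fuse(rankings: list[list[int]], c: int = 60) -> list[int]:
    """Reciprocal rank fusion: doc score = sum over legs of 1 / (c + rank)."""
    scores: dict[int, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (c + rank)
    return sorted(scores, key=scores.get, reverse=True)

fused = rrf_fuse([fts_rank("machine learning"), hnsw_rank(vecs[1])])
print([labels[i] for i in fused])  # fused ranking over both legs
```

One appeal of a rank-based rule like RRF is that it needs no score calibration between BM25-style and cosine scores, which keeps the fusion step cheap enough to fit a sub-200 ms latency budget.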

Why it matters

This paper introduces a practical and scalable microservice, SkillGraph-Service, that simplifies the integration of complex skill data into digital learning platforms. By combining Knowledge Graphs with LLMs, it offers an auditable solution for educators to access and utilize authoritative competency frameworks efficiently.

Original Abstract

Authoritative competency frameworks such as ESCO, ROME, and O*NET are essential for aligning education with labor market needs, yet their technical complexity and structural heterogeneity hinder practical adoption by educators. This paper introduces SkillGraph-Service, an interoperable microservice designed to bridge this gap by unifying these resources into a provenance-preserving Knowledge Graph (KG). Adopting a KG-first, LLM-fallback architecture, the system combines symbolic rigor with sub-symbolic flexibility. It implements a lightweight hybrid retrieval engine (fusing SQLite FTS5 and HNSW vector search) to handle the vocabulary mismatch in educator queries, and utilizes Large Language Models (LLMs) strictly for constrained ranking and audience-aware explanation. Empirical evaluation on a multilingual dataset reveals that the proposed hybrid strategy achieves superior retrieval effectiveness (nDCG@5 > 0.94) with sub-200 ms latency, suggesting that computationally expensive cross-encoder re-ranking may be unnecessary for this domain. Furthermore, an analysis of generated explanations highlights a trade-off between fluency and faithfulness: while JSON-constrained LLMs ensure high citation precision, deterministic templates remain the most reliable method for maximizing evidence coverage. The resulting architecture offers a practical, scalable, and auditable solution for integrating complex skill data into digital learning ecosystems.
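The fluency-faithfulness trade-off in the abstract contrasts deterministic templates (full evidence coverage by construction) with JSON-constrained LLM output (high citation precision, since every emitted citation can be checked against the retrieved evidence). A toy illustration with an invented evidence schema; none of these field names come from the paper:

```python
import json

# Invented evidence rows standing in for KG retrieval output.
evidence = [
    {"id": "esco:S1.2.1", "label": "analyse data", "source": "ESCO"},
    {"id": "onet:2.A.1.e", "label": "data analysis", "source": "O*NET"},
]

def template_explanation(skill: str, evidence: list[dict]) -> str:
    """Deterministic template: cites every evidence item by construction,
    which is why templates maximize evidence coverage."""
    cites = "; ".join(f"{e['label']} [{e['id']}, {e['source']}]" for e in evidence)
    return f"'{skill}' is grounded in these framework entries: {cites}."

def citation_precision(llm_json: str, evidence: list[dict]) -> float:
    """For a JSON-constrained LLM reply, score each emitted citation
    against the retrieved evidence: precision = valid / emitted."""
    valid = {e["id"] for e in evidence}
    cited = json.loads(llm_json).get("citations", [])
    return sum(c in valid for c in cited) / max(len(cited), 1)

print(template_explanation("data analysis", evidence))
print(citation_precision('{"citations": ["esco:S1.2.1"]}', evidence))  # 1.0
```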
