ECLASS-Augmented Semantic Product Search for Electronic Components
Nico Baumgart, Markus Lange-Hegermann, Jan Henze
TLDR
This paper introduces ECLASS-augmented, LLM-assisted dense retrieval for semantic product search. With re-ranking, it reaches a Hit_Rate@5 of 94.3% on expert queries, compared with 31.4% for BM25, and also exceeds foundation model web-search baselines.
Key contributions
- Evaluates LLM-assisted dense retrieval for semantic product search on electronic components.
- Integrates ECLASS hierarchical semantics into embedding-based retrieval for improved search.
- Dense retrieval with re-ranking achieves 94.3% Hit_Rate@5 on expert queries, compared with 31.4% for BM25.
- ECLASS augmentation yields consistent gains across configurations, bridging user intent and sparse product descriptions.
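The Hit_Rate@5 figures above can be read with a small sketch of the metric. It assumes the common definition (the share of queries for which at least one relevant product appears in the top k results); the summary does not spell out the paper's exact definition, and the function and variable names here are illustrative.

```python
from typing import Dict, List, Set


def hit_rate_at_k(
    ranked: Dict[str, List[str]],
    relevant: Dict[str, Set[str]],
    k: int = 5,
) -> float:
    """Fraction of queries with at least one relevant product in the top k.

    ranked:   query id -> product ids, best match first
    relevant: query id -> set of product ids judged relevant
    """
    if not ranked:
        return 0.0
    hits = sum(
        1
        for query_id, results in ranked.items()
        if any(pid in relevant.get(query_id, set()) for pid in results[:k])
    )
    return hits / len(ranked)


# Hypothetical toy data: two queries, one of which has a relevant hit.
toy_ranked = {"q1": ["a", "b", "c"], "q2": ["x", "y", "z"]}
toy_relevant = {"q1": {"c"}, "q2": {"w"}}
print(hit_rate_at_k(toy_ranked, toy_relevant, k=3))
```

Under this definition, a score of 94.3% means that for roughly 94 of every 100 queries a suitable component shows up among the first five results.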
Why it matters
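This work addresses the vocabulary mismatch between natural-language queries and attribute-centric product descriptions in industrial product search, a bottleneck for factory automation and LLM-based agent workflows. On expert queries, the approach raises Hit_Rate@5 from 31.4% (BM25) to 94.3%, making it easier to identify suitable electronic components in highly structured catalogs. The results suggest that standardized hierarchical metadata such as ECLASS is a practical way to improve semantic access to industrial data.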
Original Abstract
Efficient semantic access to industrial product data is a key enabler for factory automation and emerging LLM-based agent workflows, where both human engineers and autonomous agents must identify suitable components from highly structured catalogs. However, the vocabulary mismatch between natural-language queries and attribute-centric product descriptions limits the effectiveness of traditional retrieval approaches, e.g., BM25. In this work, we present a systematic evaluation of LLM-assisted dense retrieval for semantic product search on industrial electronic components, and investigate the integration of hierarchical semantics from the ECLASS standard into embedding-based retrieval. Our results show that dense retrieval combined with re-ranking substantially outperforms classical lexical methods and foundation model web-search baselines. In particular, the proposed approach achieves a Hit_Rate@5 of 94.3 %, compared to 31.4 % for BM25 on expert queries, while also exceeding foundation model baselines in both effectiveness and efficiency. Furthermore, augmenting product representations with ECLASS semantics yields consistent performance gains across configurations, demonstrating that standardized hierarchical metadata provides a crucial semantic bridge between user intent and sparse product descriptions.
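The idea of augmenting product representations with hierarchical class semantics can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not say how ECLASS information is combined with the product text, the class labels are hypothetical rather than real ECLASS entries, and a toy bag-of-words vector stands in for the dense embedding model and re-ranker evaluated in the paper.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def augment_with_class_path(description: str, class_path: List[str]) -> str:
    """Prepend the hierarchical class labels to the attribute-centric text."""
    return " > ".join(class_path) + " | " + description


def embed(text: str) -> Counter:
    """Toy stand-in for a dense embedding: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, docs: Dict[str, str], k: int = 5) -> List[str]:
    """Return the ids of the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda pid: cosine(q, embed(docs[pid])), reverse=True)
    return ranked[:k]


# Hypothetical catalog entries: sparse, attribute-style descriptions.
raw = {
    "P1": "24 V DC, 2 A, DIN rail",
    "P2": "M12, 4-pin, IP67",
}
# Hypothetical class paths (not actual ECLASS labels).
paths = {
    "P1": ["Power supply", "Switched-mode power supply"],
    "P2": ["Connector", "Circular connector"],
}
augmented = {pid: augment_with_class_path(raw[pid], paths[pid]) for pid in raw}

query = "circular connector"
print(cosine(embed(query), embed(raw["P2"])))  # no shared vocabulary with the raw text
print(search(query, augmented, k=1))           # class labels supply the missing terms
```

In this toy setup the raw description of the connector shares no terms with the query, so it cannot be matched; once the class path is attached, the query terms appear in the product representation. A real dense encoder generalizes beyond exact token overlap, but the sketch shows why hierarchical metadata can help when descriptions are sparse.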