Hierarchical Long-Term Semantic Memory for LinkedIn's Hiring Agent
Zhentao Xu, Shangjing Zhang, Emir Poyraz, Yvonne Li, Ye Jin, et al.
TLDR
Introduces HLTM, a hierarchical long-term semantic memory framework for LLM agents, improving performance and scalability in production.
Key contributions
- Introduces Hierarchical Long-Term Semantic Memory (HLTM) for LLM agents.
- Organizes data into a schema-aligned memory tree for multi-granularity semantic knowledge.
- Enables scalable ingestion, privacy-aware storage, and low-latency retrieval.
- Improves answer correctness and retrieval F1 by over 10% on LinkedIn's Hiring Assistant.
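To make the "schema-aligned memory tree" idea concrete, here is a minimal sketch of how multi-granularity semantic memory might be organized and queried. The class and field names are illustrative assumptions, not the paper's actual implementation; a production system like HLTM would use embedding-based indexing rather than keyword matching to achieve low-latency retrieval.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryNode:
    """One node in a hypothetical schema-aligned memory tree.

    Each level stores knowledge at a different granularity,
    e.g. domain -> topic -> fact, for a hiring-assistant agent.
    """
    key: str
    summary: str = ""
    children: dict = field(default_factory=dict)

    def ingest(self, path, summary):
        # Insert a summarized fact under a schema path,
        # e.g. ["candidates", "preferences"], creating levels as needed.
        node = self
        for key in path:
            node = node.children.setdefault(key, MemoryNode(key))
        node.summary = summary

    def retrieve(self, query_terms):
        # Naive keyword match over every level of the tree.
        # A real system would use vector search over node summaries.
        hits, stack = [], [self]
        while stack:
            node = stack.pop()
            if any(t in node.summary.lower() for t in query_terms):
                hits.append((node.key, node.summary))
            stack.extend(node.children.values())
        return hits

root = MemoryNode("root")
root.ingest(["candidates", "preferences"],
            "Recruiter prefers remote Java candidates")
root.ingest(["candidates", "pipeline"],
            "Three candidates in final interview stage")
print(root.retrieve(["remote"]))
```

Because every stored fact lives at a named schema path, the same structure also supports the provenance and privacy properties the paper highlights: a subtree can be audited or deleted as a unit.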
Why it matters
This paper tackles key challenges in building industrial-grade long-term memory for LLM agents, including scalability, privacy, and latency. HLTM offers a robust solution, showing significant performance gains and successful deployment in LinkedIn's Hiring Assistant. It provides a practical framework for enhancing personalized agent interactions.
Original Abstract
Large Language Model (LLM) agents are increasingly used in real-world products, where personalized and context-aware user interactions are essential. A central enabler of such capabilities is the agent's long-term semantic memory system, which extracts implicit and explicit signals from noisy longitudinal behavioral data, stores them in a structured form, and supports low-latency retrieval. Building industrial-grade long-term memory for LLM agents raises five challenges: scalability, low-latency retrieval, privacy constraints, cross-domain generalizability, and observability. We introduce the Hierarchical Long-Term Semantic Memory (HLTM) framework, which organizes textual data into a schema-aligned memory tree that captures semantic knowledge at multiple levels of granularity, enabling scalable ingestion, privacy-aware storage, low-latency retrieval, and transparent provenance; HLTM further incorporates an adaptation mechanism to generalize across diverse use cases. Extensive evaluations on LinkedIn's Hiring Assistant show that HLTM improves answer correctness and retrieval F1 significantly by more than 10%, while significantly advancing the Pareto frontier between query and indexing latency. HLTM has been deployed in LinkedIn's Hiring Assistant to power core personalization features in production hiring workflows.