ArXiv TLDR

Yi Zhang

12 papers · Latest:

Information Retrieval

ReCoVR: Closing the Loop in Interactive Composed Video Retrieval

ReCoVR introduces a dual-pathway architecture for interactive composed video retrieval, using reflexive perception to refine search with user feedback and retrieval history.

2605.09836
Information Retrieval

PRISM: Refracting the Entangled User Behavior Space for E-Commerce Search

PRISM disentangles user preference and item relevance in e-commerce search by explicitly modeling their interaction, improving robustness and semantic consistency.

2605.07296
Econometrics

Penalized Likelihood for Dyadic Network Formation Models with Degree Heterogeneity

This paper introduces a penalized likelihood method to robustly estimate dyadic network formation models, addressing existence and bias issues from degree heterogeneity.

2605.00771
Information Retrieval

ProMax: Exploring the Potential of LLM-derived Profiles with Distribution Shaping for Recommender Systems

ProMax uses LLM-derived profiles and distribution shaping to significantly improve recommender systems by guiding models to learn unseen item preferences.

2604.26231
Computer Vision

Vision SmolMamba: Spike-Guided Token Pruning for Energy-Efficient Spiking State-Space Vision Models

Vision SmolMamba introduces spike-guided token pruning in a state-space model for energy-efficient spiking vision, achieving a superior accuracy-efficiency trade-off.

2604.25570
Machine Learning

FreeScale: Distributed Training for Sequence Recommendation Models with Minimal Scaling Cost

FreeScale optimizes distributed training for sequence recommendation models, reducing computational bubbles by up to 90.3% on 256 H100 GPUs.

2604.24073
Information Retrieval

Disagreement as Signals: Dual-view Calibration for Sequential Recommendation Denoising

DC4SR denoises sequential recommendations by calibrating semantic priors from LLMs with model learning dynamics to handle evolving user interests.

2604.24048
Machine Learning

Supplement Generation Training for Enhancing Agentic Task Performance

SGT trains small LLMs to generate supplemental text, boosting large LLM performance on agentic tasks without costly retraining.

2604.20727
Information Retrieval

SID-Coord: Coordinating Semantic IDs for ID-based Ranking in Short-Video Search

SID-Coord enhances short-video search ranking by integrating trainable semantic IDs to balance memorization and generalization, improving long-tail item performance.

2604.10471
Computer Vision

Arbitration Failure, Not Perceptual Blindness: How Vision-Language Models Resolve Visual-Linguistic Conflicts

VLMs often fail to act on what they see: they encode the visual information but struggle with arbitration rather than perception, a failure that targeted interventions can improve.

2604.09364
Information Retrieval

DIAURec: Dual-Intent Space Representation Optimization for Recommendation

DIAURec optimizes user and item representations for recommendations by unifying intent and language modeling with a comprehensive optimization strategy.

2604.09087
Economic Theory

Coarse Screening

This paper shows that sellers investigating buyers before pricing need only three signal outcomes per buyer, even with rich type spaces, due to limited liability.

2604.04405
