HapticLDM: A Diffusion Model for Text-to-Vibrotactile Generation
Jiahao Xiong, Fei Wang, Anran Xu, Pinzhi Huang, Tao Wen + 2 more
TLDR
HapticLDM is the first text-to-vibrotactile diffusion model that generates realistic and semantically aligned haptic feedback from text.
Key contributions
- Introduces HapticLDM, the first text-to-vibration generative model built upon Latent Diffusion Models.
- Employs a novel text-processing strategy to curate high-quality data for dynamic haptic modeling.
- Integrates a global denoising mechanism to ensure coherent and stable temporal envelope variations.
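- Reports enhanced realism and semantic alignment in A/B testing against the state-of-the-art baseline and in a 30-participant user study, with qualitative feedback indicating a simpler haptic design workflow.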
Why it matters
Recent autoregressive approaches such as HapticGen have limited capacity to capture global dependencies, owing to their sequential modeling and to data constraints. HapticLDM addresses this with latent diffusion and a global denoising mechanism, and the authors report more realistic and semantically aligned haptic feedback in their evaluations. The approach could simplify haptic design and enrich user experiences in games, film, and the metaverse.
Original Abstract
Text-to-vibration generation converts natural language into haptic feedback, enabling vibration-effect designers to obtain scenario-fitted vibrations more efficiently. It shows great potential in application fields such as the metaverse, games, and film for enriching the user experience in interactive scenarios. The core challenge in this field is how to generate accurate, consistent, and complete vibrations according to textual semantics. Very recent autoregressive (AR) approaches (e.g., HapticGen) exhibit limited capacity in fully capturing global dependencies, owing to the inherent sequential nature of their modeling and prevailing data constraints. In this paper, we propose HapticLDM, the first text-to-vibration generative model built upon Latent Diffusion Models (LDMs). Firstly, with respect to the data, we introduce a text-processing strategy that emphasizes dynamic characteristics to curate high-quality data pairs for fine-grained dynamic modeling. Secondly, HapticLDM incorporates a global denoising mechanism that regulates coherent and stable variations in the temporal envelope. Furthermore, we conduct extensive evaluations, including A/B testing against the state-of-the-art baseline and a user study involving 30 participants. The results demonstrate that our model enhances realism and semantic alignment. Qualitative feedback further indicates that HapticLDM simplifies the haptic design workflow while generating diverse, subtle, and physically precise vibrations.
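Illustration: the "temporal envelope" mentioned in the abstract is the slowly varying amplitude contour of a vibration signal. The Python sketch below is only a minimal illustration of that concept, not the authors' method or code. It synthesizes a decaying sine burst as a stand-in for a generated vibration and extracts a windowed RMS envelope. The function names, sampling rate, carrier frequency, decay rate, and window length are all assumptions chosen for the example.

```python
import math


def synth_vibration(duration_s=0.5, sample_rate=8000, carrier_hz=200.0, decay=6.0):
    """Return a decaying sine burst (hypothetical stand-in for a vibration waveform)."""
    n = int(duration_s * sample_rate)
    return [
        math.exp(-decay * i / sample_rate)
        * math.sin(2.0 * math.pi * carrier_hz * i / sample_rate)
        for i in range(n)
    ]


def rms_envelope(signal, window):
    """Compute the RMS amplitude over consecutive, non-overlapping windows."""
    envelope = []
    for start in range(0, len(signal) - window + 1, window):
        frame = signal[start:start + window]
        envelope.append(math.sqrt(sum(x * x for x in frame) / window))
    return envelope


# 200 samples at 8 kHz is 25 ms, i.e. exactly five cycles of the 200 Hz carrier,
# so the carrier oscillation averages out and only the amplitude contour remains.
signal = synth_vibration()
envelope = rms_envelope(signal, window=200)

print(len(envelope), "envelope frames")
print("first frames:", [round(v, 3) for v in envelope[:4]])
```

Because the window spans a whole number of carrier cycles, each envelope frame here is a fixed fraction of the previous one, which gives the smooth, monotonically decaying contour one would expect from a fading vibration. An abrupt jump between neighboring frames would be the kind of envelope incoherence the abstract says the model is designed to avoid.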