ArXiv TLDR

Does the TalkMoves Codebook Generalize to One-on-One Tutoring and Multimodal Interaction?

arXiv: 2604.13380

Corina Luca Focsan, Marie Cynthia Abijuru Kamikazi, Tamisha Thompson, Jennifer St. John, Kirk Vanacore + 3 more

cs.HC

TLDR

This study investigates whether the TalkMoves codebook, designed for whole-classroom discourse, applies reliably to one-on-one tutoring across chat, audio, and multimodal data.

Key contributions

  • TalkMoves achieved higher inter-rater reliability (κ = 0.74) than a hybrid AI-human codebook (κ = 0.64); a short Cohen's kappa sketch follows this list.
  • The AI-human codebook demonstrated broader empirical coverage and higher perceived usability across modalities.
  • Both codebooks undercaptured tutoring-relevant moves and struggled with nonverbal/multimodal actions.
  • Findings motivate developing modality-aware, tutoring-grounded codebooks for diverse platforms.
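
For context on the κ values above: Cohen's kappa measures agreement between two annotators beyond what chance alone would produce. A minimal sketch in Python using scikit-learn, with hypothetical talk-move labels rather than data from the paper:

    # Minimal sketch: chance-corrected agreement between two annotators.
    # The talk-move labels are invented for illustration, not from the paper.
    from sklearn.metrics import cohen_kappa_score

    # One label per utterance from each annotator (toy example).
    annotator_a = ["press", "revoice", "none", "press", "add_on", "none"]
    annotator_b = ["press", "revoice", "none", "add_on", "add_on", "none"]

    # kappa = (p_o - p_e) / (1 - p_e): observed agreement p_o, corrected by
    # the agreement p_e expected from each rater's label frequencies alone.
    kappa = cohen_kappa_score(annotator_a, annotator_b)
    print(f"Cohen's kappa = {kappa:.2f}")  # ~0.78 for this toy example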

Why it matters

This work matters for building tools that analyze and support one-on-one tutoring. It reveals the limitations of classroom-centric codebooks when they are applied across diverse tutoring modalities, guiding future codebook development.

Original Abstract

Accountable Talk theory has been widely adopted to analyze classroom discourse and is increasingly used to annotate tutoring interactions. In particular, the TalkMoves codebook, grounded in Accountable Talk theory, is commonly used to label tutoring data and train models of effective instructional support. However, Accountable Talk was originally developed to characterize collaborative, whole-classroom oral discourse, not to identify talk moves in one-on-one tutoring environments using multimodal data (e.g., video, audio, chat). As tutoring platforms expand in scale and modality, questions remain about whether Accountable Talk-based codebooks generalize reliably beyond their original classroom context and data representation. This study examines whether the human-developed TalkMoves codebook generalizes in reliability, utility, and interpretability when applied to one-on-one tutoring across audio, chat, and multimodal data. We compare TalkMoves with a hybrid AI-human developed codebook using a workflow established in prior research. Two expert annotators with over 20 years of teaching experience applied both codebooks to six tutoring sessions spanning three modalities: chat-based, audio-only, and multimodal interactions. Results show that while TalkMoves achieved higher overall inter-rater reliability than the AI-human codebook (κ = 0.74 vs. 0.64), the AI-human codebook demonstrated broader empirical coverage and higher perceived usability across modalities. Both codebooks undercaptured tutoring-relevant moves and introduced ambiguity when identifying actions expressed through nonverbal and multimodal artifacts. Together, these findings highlight the uneven generalizability of TalkMoves to tutoring contexts and motivate the development of modality-aware, tutoring-grounded codebooks.
