Exploring Moderation Consistency and Relaxed Think-Aloud in AI-Moderated Usability Studies

Abstract

This thesis explores how a large language model (GPT-4o) can be trained and evaluated as a professional moderator in usability testing sessions conducted with the relaxed think-aloud protocol. While large language models (LLMs) are often employed in assistive roles such as customer service or education, this study investigates their potential to adopt the non-directive, neutral tone suited to research moderation. Informed by principles of usability study facilitation, GPT-4o was trained through prompt-based customization to emulate human moderation strategies, including open-ended questioning, tone neutrality, and sensitivity to user pacing. Nine usability sessions were conducted on an e-commerce website, and the AI–user dialogues were transcribed and coded using a reflexive thematic framework. The analysis focused on AI prompt style, timing, and tone, with user responses categorized into three verbalization levels (real-time observations, interpretations, and elaborations). Findings reveal that GPT-4o can sustain Level 1–3 verbalizations through non-leading moderation techniques, though occasional lapses into directive or overly affirming tones were observed. This study contributes to emerging research on AI-moderated usability testing and demonstrates how prompt-engineered LLMs can approach human-like moderation, offering insights into both the design and the evaluation of conversational agents in research contexts.

Keywords

protocol analysis, relaxed think-aloud protocol, ChatGPT advanced voice mode, AI-moderated usability testing, training conversational LLMs
