People in meetings usually take turns speaking without talking over one another. Our aim is to clarify the kinds of behavior that contribute to such smooth turn-taking and to develop a model for predicting the next speaker and the start time of that speaker's next utterance in meetings. We empirically demonstrated that gaze behavior and respiration are related to who speaks next and when the next utterance starts, and that they can be used to predict both. Such a prediction model would lay the foundation for natural conversational systems in which conversational agents speak with natural timing, and for teleconference systems that avoid utterance collisions by indicating which participant will speak next and adding appropriate time delays.
