ESR13: Parametric dialogue synthesis: from separate speakers to conversational interaction

PhD Fellow: Adaeze Adigwe


The goal of ESR13’s project is to produce a natural-sounding synthetic voice with appropriately rendered turn-taking, filled pauses, and backchannels in a conversational, expressive speaking style. ESR13 will collect acoustic data from human-human dialogues, extract cues relevant to turn-taking, and train an adaptive model to move from standard news-reading-style TTS to conversational dialogue TTS. The ultimate objective is to develop a working platform for synthesizing dialogues (using parametric synthesis technology) that can serve as a testing ground for evaluating hypotheses and models of interaction within COBRA.

Expected results:

  • Fully operational, high-quality synthesis platform for scripted dialogues;
  • Understanding of signal parameters in terms of dialogue rather than individual participants.

Based in Huis ter Heide, the Netherlands

Full-time three-year contract, starting September 2020

PhD enrolment at: University of Helsinki, Finland

Main supervisors’ institutions: ReadSpeaker, Huis ter Heide, the Netherlands, and the University of Helsinki, Finland

Main supervisors: Dr Esther Klabbers and Prof Juraj Šimko

Planned secondments:


  • University of Helsinki: to analyse prosodic cues and learn the continuous wavelet transform methodology for analysis and representation of prosody (5.5 months);
  • IISAS, Bratislava: training on turn-taking behaviour (5 months).

Co-supervisor’s institution:

  • IISAS, Bratislava, Slovakia
