In this project, the HCI area of La Salle R&D has developed the first software that generates customizable synthetic voice messages by combining text-to-speech (TTS) synthesis and voice transformation (VT). The technology produces synthetic speech from any input text and offers a simple user interface for intuitively adjusting voice personalization parameters. User-defined voice presets can be created and saved for future use.
The technology is multi-platform software built on a client-server architecture. Speech synthesis is performed by a TTS system installed on a server that receives requests from the client application, while voice customization is performed by the voice transformation module running in the client application. Through a graphical user interface (GUI), a non-expert user can easily define, modify, and save the settings of each voice preset.
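The division of work described above (synthesis on the server, transformation and presets on the client) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `VoicePreset`, `tts_server_synthesize`, `transform_voice`, and `speak` are hypothetical, the "server" is a local function that returns a plain tone instead of real speech, and the transformation is a crude resampling plus gain rather than the harmonics-plus-noise approach cited under Publication.

```python
import json
import math
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class VoicePreset:
    """Hypothetical preset; field names are illustrative, not the project's."""
    name: str
    pitch_factor: float = 1.0  # must be > 0; values > 1 raise pitch and shorten the signal
    gain: float = 1.0          # linear amplitude scaling

    def to_json(self) -> str:
        """Serialize the preset so it can be saved for later use."""
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, text: str) -> "VoicePreset":
        """Restore a preset previously produced by to_json()."""
        return cls(**json.loads(text))


def tts_server_synthesize(text: str, sample_rate: int = 8000) -> List[float]:
    """Stand-in for the server-side TTS (assumption): a 220 Hz tone,
    50 ms per input character, instead of real synthesized speech."""
    n = (sample_rate // 20) * len(text)
    return [math.sin(2 * math.pi * 220 * i / sample_rate) for i in range(n)]


def transform_voice(samples: List[float], preset: VoicePreset) -> List[float]:
    """Client-side stand-in for voice transformation (assumption):
    linear-interpolation resampling followed by a gain."""
    if not samples:
        return []
    last = len(samples) - 1
    out_len = max(1, int(len(samples) / preset.pitch_factor))
    out = []
    for j in range(out_len):
        pos = j * preset.pitch_factor       # read position in the input signal
        i = int(pos)
        frac = pos - i
        a = samples[min(i, last)]
        b = samples[min(i + 1, last)]
        out.append(preset.gain * ((1 - frac) * a + frac * b))
    return out


def speak(text: str, preset: VoicePreset) -> List[float]:
    """Client flow: request synthesis from the 'server', then customize locally."""
    raw = tts_server_synthesize(text)
    return transform_voice(raw, preset)


if __name__ == "__main__":
    preset = VoicePreset("robot", pitch_factor=2.0, gain=0.5)
    restored = VoicePreset.from_json(preset.to_json())  # save and reload a preset
    print(restored.name, len(speak("hello", restored)))
```

In a real deployment the call to `tts_server_synthesize` would be a network request to the TTS server, while preset handling and transformation would stay in the client, which is the split the description outlines.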
People
Technician: Àngel Calzada
Technician: Marc Freixes
Technician: David Cacenabes
Technician: Miquel Noè
Project manager: Rosa Maria Alsina
Project manager: Francesc Enrich
Project leader: Joan Claudi Socoró
Project coordinator: Francesc Alías
Publication
Calzada, A.; Socoró, J.C., "Voice Quality Modification Using a Harmonics Plus Noise Model". Cognitive Computation, vol. 5, no. 4, pp. 473-482, 2013.