PhD defence: Energy-based modeling and simulation for sound synthesis: application to a quasi-1D vocal apparatus - Part 1

information

Type
Thesis/HDR defence
performance location
Centre Georges Pompidou, Salle Triangle (Paris)
date
October 3, 2025

This work addresses the energy-consistent modeling and simulation of the vocal apparatus. Voice production arises from the nonlinear interaction of multiple physical domains, including fluid dynamics, tissue mechanics, and acoustics. To capture this complexity while preserving energetic consistency, the port-Hamiltonian formalism is employed throughout the thesis. This framework allows the modular construction of individual components and their interconnection within a unified power-balanced system.
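To make the notion of a power-balanced system concrete, here is a minimal sketch on a toy mass-spring-damper written in port-Hamiltonian form, dx/dt = (J - R) grad H(x) + G u with output y = G^T grad H(x). It is not the vocal apparatus model of the thesis: the system, the parameter values and the function names (`hamiltonian`, `midpoint_step`, `simulate`) are assumptions chosen for illustration. The time stepping uses the implicit midpoint rule, which for a quadratic Hamiltonian reproduces the power balance (stored power = supplied power - dissipated power) at every step, up to rounding.

```python
def hamiltonian(q, p, m, k):
    """Stored energy: spring potential plus kinetic energy."""
    return 0.5 * k * q * q + 0.5 * p * p / m


def midpoint_step(q, p, u, dt, m, k, c):
    """One implicit-midpoint step of the toy system
        dq/dt = p/m,   dp/dt = -k*q - c*p/m + u,
    solved in closed form because the system is linear.
    Returns the new state and the midpoint dissipated/supplied powers."""
    denom = 1.0 + dt * c / (2.0 * m) + dt * dt * k / (4.0 * m)
    dp = dt * (-k * q - (0.5 * k * dt + c) * p / m + u) / denom
    dq = dt * (p + 0.5 * dp) / m
    v_mid = (p + 0.5 * dp) / m      # midpoint velocity, i.e. the port output y
    dissipated = c * v_mid * v_mid  # power lost in the damper (>= 0)
    supplied = v_mid * u            # power entering through the port (y * u)
    return q + dq, p + dp, dissipated, supplied


def simulate(n_steps=2000, dt=1e-3, m=1.0, k=100.0, c=0.5):
    """Run the toy system and return the largest power-balance residual
    |(H[n+1] - H[n]) / dt + dissipated - supplied| seen over the run."""
    q, p = 0.1, 0.0
    worst = 0.0
    for n in range(n_steps):
        u = 1.0 if n < n_steps // 2 else 0.0  # force switched off halfway
        h0 = hamiltonian(q, p, m, k)
        q, p, p_diss, p_in = midpoint_step(q, p, u, dt, m, k, c)
        h1 = hamiltonian(q, p, m, k)
        worst = max(worst, abs((h1 - h0) / dt + p_diss - p_in))
    return worst


if __name__ == "__main__":
    print("largest power-balance residual:", simulate())
```

Because the three power terms are available explicitly at each step, they can be logged and inspected separately, which is the kind of energetic read-out the abstract refers to.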
Particular attention is given to the construction and selection of a suitable quasi-one-dimensional fluid model to represent airflow through the entire vocal apparatus, from the subglottal region to the lips. Numerical simulations are conducted under various configurations, including isolated vocal tract and isolated self-oscillating larynx setups, and the results are compared with reference data from the literature. The explicit power signals provided by the port-Hamiltonian formulation offer valuable insight into the energetic behavior of the voice production mechanism. Additionally, a linearized model of the vocal tract is developed as the basis for a real-time implementation. As a final demonstration, the full coupled model (larynx and vocal tract) is used to simulate the articulation of a diphthong under self-oscillating conditions.
A second part of the work focuses on the development of efficient numerical methods for the stable simulation of port-Hamiltonian systems. In particular, a contribution is made to the scalar auxiliary variable (SAV) methods through the introduction of a novel stabilization term. The proposed approach mitigates the drift of the auxiliary variable, thereby improving the long-term stability of simulations. While direct application to the vocal apparatus model remains a goal for future work, the method is demonstrated on a nonlinear string model, showcasing its effectiveness. A real-time implementation and a prototype control interface are also presented.
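As a rough illustration of the scalar auxiliary variable idea, the sketch below applies a basic SAV scheme to a toy oscillator with a quartic potential. It is neither the nonlinear string model nor the stabilized scheme of the thesis: the stabilization term is not reproduced here, and the test system, the constant `shift`, the parameter values and all function names are assumptions made for illustration. The auxiliary variable psi stands in for sqrt(2 (V(q) + shift)), so the nonlinear energy becomes quadratic in psi, each step reduces to a single division, and the modified energy is conserved up to rounding. The second value returned by `run_sav` measures how far psi has drifted from the quantity it is meant to track, which is the effect the abstract says the new term mitigates.

```python
import math


def potential(q, alpha):
    """Nonlinear (quartic) potential energy V(q), assumed for illustration."""
    return 0.25 * alpha * q ** 4


def potential_grad(q, alpha):
    """Derivative V'(q) of the quartic potential."""
    return alpha * q ** 3


def sav_step(q, p, psi, dt, m, k, alpha, shift):
    """One step of a basic SAV scheme for
        m*q'' = -k*q - psi*g(q),   psi' = g(q)*q',
    with g = V'(q) / sqrt(2*(V(q) + shift)) evaluated at the current q.
    The update for dq is explicit: no nonlinear solve is needed."""
    g = potential_grad(q, alpha) / math.sqrt(2.0 * (potential(q, alpha) + shift))
    dq = (2.0 * p - dt * (k * q + g * psi)) / (2.0 * m / dt + 0.5 * dt * (k + g * g))
    dp = -dt * (k * (q + 0.5 * dq) + g * (psi + 0.5 * g * dq))
    dpsi = g * dq
    return q + dq, p + dp, psi + dpsi


def sav_energy(q, p, psi, m, k, shift):
    """Modified energy, quadratic in (q, p, psi)."""
    return 0.5 * k * q * q + 0.5 * p * p / m + 0.5 * psi * psi - shift


def run_sav(n_steps=5000, dt=1e-3, m=1.0, k=1.0, alpha=50.0, shift=1.0):
    """Return (largest relative error on the modified energy,
    final gap between psi and sqrt(2*(V(q) + shift)))."""
    q, p = 1.0, 0.0
    psi = math.sqrt(2.0 * (potential(q, alpha) + shift))
    e0 = sav_energy(q, p, psi, m, k, shift)
    max_err = 0.0
    for _ in range(n_steps):
        q, p, psi = sav_step(q, p, psi, dt, m, k, alpha, shift)
        max_err = max(max_err, abs(sav_energy(q, p, psi, m, k, shift) - e0))
    drift = abs(psi - math.sqrt(2.0 * (potential(q, alpha) + shift)))
    return max_err / e0, drift


if __name__ == "__main__":
    energy_error, drift = run_sav()
    print("relative energy error:", energy_error)
    print("auxiliary variable drift:", drift)
```

The conserved quantity is the modified energy, not the true one: once psi drifts away from sqrt(2 (V(q) + shift)), the two no longer coincide, which is why controlling that drift matters for long simulations.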

Thomas Risse's thesis defence

Thomas Risse, a doctoral student at the SMAER Doctoral School, conducted his research, entitled "Energy-based modeling and simulation for sound synthesis: application to a quasi-1D vocal apparatus", within the S3AM team at STMS (IRCAM, Sorbonne University, CNRS, Ministry of Culture), under the supervision of Thomas Hélie and the co-supervision of Fabrice Silva.

The jury is composed of:

  • Brad STORY, Reviewer
  • Paul KOTYCZKA, Reviewer
  • Michele DUCCESCHI, Examiner
  • Nathalie HENRICH BERNARDONI, Examiner
  • Yann LE GORREC, Examiner
  • David ROZE, Examiner


