David Genova is a doctoral student in the Sound Analysis and Synthesis team, enrolled in the EDITE doctoral school (ED130) at Sorbonne University. A music enthusiast, he focuses his research on the integration of neural networks into audio synthesizers, as well as on interpretability in artificial intelligence. He is defending his thesis, entitled “Extracting lightweight neural networks from large models for embedded audio applications”.
The jury is composed of:
Abstract:
Advances in artificial intelligence have led to numerous applications in creative contexts, particularly in the field of music. However, the computational complexity of neural networks prevents their use in embedded architectures, such as those typically employed in synthesizers. This limitation represents a major obstacle to the development of musical instruments that fully exploit the creative potential offered by such models. This thesis focuses on the design of lightweight and efficient neural networks through the pruning of overparameterized models. Our work is based on the hypothesis that the two main effects of overparameterization are, on the one hand, a high level of redundancy within intermediate representations and, on the other hand, the over-specialization of certain units. This observation led to the development of a learning-based pruning strategy that enables the extraction of subnetworks tailored to specific tasks and data. Applied to generative audio models, this strategy yields subnetworks that preserve high generation quality while remaining compatible with the computational resources of several embedded architectures. This work led to the creation of JUNK, a synthesizer that leverages the advantages of neural audio synthesis, designed for musical use in both composition and performance.
October 20, 2025