Separating Explicit and Implicit Controls for Expressive Real-Time Neural Synthesis

Information

Type
Thesis/HDR defense

Location
Ircam, Salle Igor-Stravinsky (Paris)

Date
October 31, 2025

Nils Demerlé's thesis defense

Nils Demerlé, a PhD candidate in the EDITE doctoral school (ED 130), conducted his doctoral research, entitled “Separating Explicit and Implicit Controls for Expressive Real-Time Neural Synthesis”, within the Analysis–Synthesis team of the STMS Laboratory (IRCAM, CNRS, Sorbonne Université, Ministry of Culture), under the supervision of Philippe Esling and the co-supervision of Guillaume Doras.

Jury composition:

  • Joshua REISS – Professor, Queen Mary University of London – Reviewer
  • Nao TOKUI – Artist and Researcher, Neutone – Reviewer
  • Anna HUANG – Assistant Professor, MIT – Examiner
  • Atau TANAKA – Professor, Goldsmiths University – Examiner
  • Tatsuya HARADA – Professor, University of Tokyo – Examiner
  • Alexandre DEFOSSEZ – Researcher, Kyutai – Examiner

Abstract:
Recent advances in machine learning have profoundly transformed our relationship with sound and musical creation. Deep generative models are emerging as powerful tools that can support and extend creative practices, yet their adoption by artists remains limited by the question of control. Current approaches either rely on explicit parameters (notes, instruments, textual descriptions) or on abstract representation spaces that enable the exploration of subjective concepts such as timbre and style, but are harder to integrate into musical workflows.

This thesis aims to reconcile these two paradigms of explicit and implicit control to design expressive audio synthesis tools that can be seamlessly integrated into music production environments. We begin with a systematic study of neural audio codecs, the building blocks of most modern generative models, identifying design choices that influence both audio quality and controllability. We then explore methods to jointly learn explicit and implicit control spaces, first in a supervised setting, and later through AFTER, a framework designed for the unsupervised case. AFTER enables realistic and continuous timbre transfer across a wide range of instruments while preserving control over pitch and rhythm.

Finally, we adapt these models for real-time use through lightweight, streamable diffusion architectures and develop an intuitive interface integrated into digital audio workstations. The thesis concludes with several artistic collaborations, demonstrating the creative potential and practical impact of these generative approaches.

