information

Type
Scientific and/or technical conference
performance location
Ircam, Salle Igor-Stravinsky (Paris)
duration
01 h 12 min
date
March 6, 2013

This seminar presents research undertaken by the Analysis/Synthesis team in the European project 3DTVS (3D TV Content Search). The project deals with multimodal search and indexing in 3D TV content, and IRCAM contributes algorithms that describe the multichannel audio scene. This rather ambitious objective is made tractable by focusing only on the detection of specific audio events.

Two complementary techniques are investigated in the project. The first approach is based on audio event detection using classification methods. The audio events considered are speech and music. We introduce a multichannel extension of the present classification system, “ircamclass”, and propose several information fusion strategies for the extended system. These are evaluated on a dataset of 4 films, and we show that they give better results than the baseline classification system applied to a mono down-mix of all channels.
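
As an illustration of what an information fusion strategy can look like, here is a minimal sketch of late fusion: each channel is classified separately and the per-channel class probabilities are then combined into one decision per frame. The abstract does not say which strategies were evaluated, so the function name `fuse_posteriors`, the three rules shown (mean, max, product) and the array layout are assumptions made for this example, not a description of ircamclass.

```python
import numpy as np

def fuse_posteriors(posteriors, strategy="mean"):
    """Combine per-channel class probabilities into one decision per frame.

    posteriors: array of shape (channels, frames, classes).
    Returns an array of shape (frames,) with the winning class index.
    All names and rules here are hypothetical, for illustration only.
    """
    p = np.asarray(posteriors, dtype=float)
    if strategy == "mean":
        # Average the probabilities over channels.
        fused = p.mean(axis=0)
    elif strategy == "max":
        # Keep, for each class, the most confident channel.
        fused = p.max(axis=0)
    elif strategy == "product":
        # Product rule, computed as a sum of logs for numerical stability.
        fused = np.log(p + 1e-12).sum(axis=0)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return fused.argmax(axis=-1)

# Two channels, two frames, two classes (0 = speech, 1 = music).
example = [
    [[0.9, 0.1], [0.4, 0.6]],  # channel 1
    [[0.6, 0.4], [0.1, 0.9]],  # channel 2
]
decisions = fuse_posteriors(example, "mean")
```

In this toy input both channels agree, so every rule labels the first frame as speech and the second as music. The rules only differ when channels disagree, which is the case a mono down-mix cannot resolve.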

The second approach is based on extensions of nonnegative matrix factorization (NMF) algorithms to multichannel audio, resulting in nonnegative tensor factorization (NTF) and nonnegative tensor deconvolution (NTD). The NTD algorithm will be used in the project to detect, localize, and eventually separate the sources of selected audio events.
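
To make the step from NMF to NTF concrete, the following sketch factorizes a nonnegative tensor of shape (channels, frequencies, frames), such as a stack of magnitude spectrograms, into three nonnegative factors. The abstract does not specify the model or cost function used in the project. This example assumes a PARAFAC-style model with a Euclidean cost and multiplicative updates, and the names `ntf`, `reconstruct`, `Q`, `W` and `H` are hypothetical. NTD, which adds a convolution along one axis, is not shown.

```python
import numpy as np

def reconstruct(Q, W, H):
    """Rebuild the tensor: V[c, f, t] = sum_k Q[c, k] * W[f, k] * H[t, k]."""
    return np.einsum("ck,fk,tk->cft", Q, W, H)

def ntf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Nonnegative tensor factorization with multiplicative updates.

    V: nonnegative array of shape (channels, freqs, frames).
    Returns Q (channel gains), W (spectral templates), H (activations).
    Assumes a Euclidean cost; this is an illustrative choice.
    """
    rng = np.random.default_rng(seed)
    C, F, T = V.shape
    Q = rng.random((C, rank)) + eps
    W = rng.random((F, rank)) + eps
    H = rng.random((T, rank)) + eps
    for _ in range(n_iter):
        # Each update multiplies a factor by a ratio of nonnegative terms,
        # so the factors stay nonnegative throughout.
        Q *= np.einsum("cft,fk,tk->ck", V, W, H) / (Q @ ((W.T @ W) * (H.T @ H)) + eps)
        W *= np.einsum("cft,ck,tk->fk", V, Q, H) / (W @ ((Q.T @ Q) * (H.T @ H)) + eps)
        H *= np.einsum("cft,ck,fk->tk", V, Q, W) / (H @ ((Q.T @ Q) * (W.T @ W)) + eps)
    return Q, W, H

# Synthetic test tensor built from two known components.
rng = np.random.default_rng(1)
V = reconstruct(rng.random((2, 2)), rng.random((6, 2)), rng.random((8, 2)))
Q, W, H = ntf(V, rank=2)
```

In this model each component k has one spectral template `W[:, k]`, one activation pattern `H[:, k]`, and one gain per channel `Q[:, k]`. The per-channel gains are what relate a component to a position in the multichannel scene.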

The presentation will describe the research objectives of the project, the results obtained so far, and an outlook on the results expected by the end of the project.

