Voice activity detection, more commonly known as VAD, is a speech processing technique used to detect the presence or absence of human speech in an audio signal.
VAD is mainly used in speech coding and speech recognition, but it can also be used to disable some processes during the non-voice parts of an audio session, which would reduce the CPU load of our algorithms.
In addition, voice activity can be estimated jointly with other tasks, such as noise reduction or other classification tasks. Examples of multi-tasking approaches can be found in.
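As a rough illustration of the idea, here is a minimal sketch of a frame-energy detector used to skip processing on inactive frames. It is not our method: the function names, the 160-sample frame length (10 ms at 16 kHz) and the -40 dB threshold are all assumptions chosen for the example. A plain energy threshold flags any sufficiently loud sound, not speech specifically.

```python
import math


def frame_energies_db(samples, frame_len):
    """Return the mean power, in dB, of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Small offset avoids log10(0) on digital silence.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def energy_vad(samples, frame_len=160, threshold_db=-40.0):
    """Flag a frame as active when its power exceeds threshold_db.

    frame_len and threshold_db are illustrative assumptions.
    """
    return [e > threshold_db for e in frame_energies_db(samples, frame_len)]


def gated_process(samples, process, frame_len=160, threshold_db=-40.0):
    """Call `process` only on active frames; inactive frames yield None."""
    flags = energy_vad(samples, frame_len, threshold_db)
    results = []
    for i, active in enumerate(flags):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        results.append(process(frame) if active else None)
    return results


# Synthetic input: 20 ms of silence followed by 20 ms of a 440 Hz tone
# standing in for an active segment (16 kHz sampling rate assumed).
silence = [0.0] * 320
tone = [0.5 * math.sin(2.0 * math.pi * 440.0 * n / 16000.0) for n in range(320)]
signal = silence + tone

flags = energy_vad(signal)
outputs = gated_process(signal, process=lambda frame: max(frame))
```

In this sketch the costly `process` callback runs only on the frames flagged as active, which is the CPU-saving use described above.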
Subject
The first part of the internship will focus on benchmarking several state-of-the-art techniques (classical signal processing, deep learning) and on adapting one of these techniques to the needs of A-Volute. For our application, latency and computation cost matter more than accuracy, so knowledge of hardware or embedded software would be a plus.
In the second part, if the student has a particular interest in machine learning, it will be possible to work on a multi-task approach building on internal work that currently focuses on multi-task processing of music.
Who are we looking for?
You are preparing an engineering degree or a master’s degree, or even a PhD (3-month visit), and preferably have knowledge of the development and implementation of advanced algorithms for digital audio signal processing. In addition, advanced knowledge in the following fields would be highly appreciated:
Experience in the following areas would be a plus: