User:IHEARu

[File:Logo of iHEARu-Project.png — Logo of the iHEARu project]

The iHEARu project (Intelligent systems' Holistic Evolving Analysis of Real-life Universal speaker characteristics) aims to push the limits of intelligent systems for computational paralinguistics. It pursues holistic analysis of multiple speaker attributes at once, evolving and self-learning systems, and deeper analysis of acoustic parameters, all on realistic data at a large scale. The ultimate goal is to progress from individual analysis tasks towards universal speaker-characteristics analysis, which can easily learn about, and be adapted to, new, previously unexplored characteristics. The project has received funding from the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 338164 (ERC Starting Grant iHEARu).

[Figure: Scheme of a speech recognition system]

In the iHEARu project, ground-breaking methodology, including novel techniques for multi-task and semi-supervised learning, will for the first time deliver intelligent, holistic and evolving analysis, under real-life conditions, of universal speaker characteristics that have so far been considered only in isolation. Today's sparseness of annotated realistic speech data will be overcome by large-scale speech and meta-data mining from public sources such as social media, by crowd-sourcing for labelling and quality control, and by shared semi-automatic annotation. All stages, from pre-processing and feature extraction to statistical modelling, will evolve through "life-long learning" from new data, utilising feedback, deep, and evolutionary learning methods. Human-in-the-loop system validation and novel perception studies will analyse the self-organising systems and the relation of automatic signal processing to human interpretation across a previously unseen variety of speaker-classification tasks.
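The semi-supervised learning mentioned above can take many forms; one common flavour is self-training, where a model trained on the few available labels repeatedly adds its own most confident predictions on unlabelled data to the training set. The sketch below illustrates only this general idea with a toy one-dimensional nearest-centroid classifier; the data, the confidence margin, and the model are hypothetical and not part of the iHEARu systems.

```python
# Self-training sketch: a toy illustration of semi-supervised learning.
# The nearest-centroid "model" and all values are illustrative only.

def centroids(labelled):
    """Mean feature value per class label."""
    sums, counts = {}, {}
    for x, y in labelled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(model, x):
    """Return (label, margin): nearest centroid and distance margin to the runner-up."""
    dists = sorted((abs(x - c), y) for y, c in model.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin

def self_train(labelled, unlabelled, threshold=1.0, rounds=5):
    """Repeatedly move confidently predicted samples into the training set."""
    labelled, pool = list(labelled), list(unlabelled)
    for _ in range(rounds):
        model = centroids(labelled)
        keep = [(x, predict(model, x)) for x in pool]
        keep = [(x, y) for x, (y, m) in keep if m >= threshold]
        if not keep:
            break  # no confident predictions left
        labelled.extend(keep)
        kept = {x for x, _ in keep}
        pool = [x for x in pool if x not in kept]
    return centroids(labelled)

# Two labelled seeds, four unlabelled samples:
model = self_train([(0.0, "low"), (10.0, "high")], [1.0, 2.0, 8.5, 9.0])
```

After self-training, the centroids have shifted to reflect the pseudo-labelled samples, so nearby new inputs are classified consistently with the absorbed data.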


iHEARu-PLAY is being developed and run as a collaboration between the Chair for Complex and Intelligent Systems at the University of Passau and the Institute for Human-Machine Communication at the Technische Universität München. It is a crowdsourcing-based scientific game in which players listen to collections of short audio recordings, watch collections of video recordings or images, and answer multiple questions about the content of the samples. Players can also perform prompted recording tasks on different topics. Depending on their own choices and those of other players, players are awarded game points, can reach different experience levels, and are ranked on the leaderboard accordingly. The annotated data are then used to train classifiers with machine-learning algorithms, so that certain characteristics of new data files can be recognised automatically, without human intervention. This allows better results in the Voice Analysis Application (VoiLA) and enables new functions to be provided in the future.
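Turning many players' answers into a single training label requires some form of quality control. A common baseline, sketched below, is majority voting with a minimum-agreement threshold; the function name, the threshold, and the example labels are hypothetical and do not reflect iHEARu-PLAY internals.

```python
from collections import Counter

# Majority-vote aggregation sketch: derive one label per sample from
# several crowdsourced answers, discarding samples with low agreement.
# Threshold and data are illustrative assumptions.

def aggregate(votes, min_agreement=0.5):
    """Return the majority label if its share exceeds min_agreement, else None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) > min_agreement else None

# Three players labelled the same audio clip; two of three agree:
aggregate(["happy", "happy", "neutral"])  # "happy"

# A 1-1 split gives no majority, so the sample would be re-queued or dropped:
aggregate(["happy", "neutral"])  # None
```

Only samples with a clear majority would then feed the classifier training step described above; the rest can be shown to further players until agreement is reached.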

External links