The Laboratory for the Recognition and Organization of Speech and Audio (LabROSA) conducts research into automatic means of extracting useful information from sound. Our vision is of an intelligent 'machine listener', able to interpret live or recorded sound of any type in terms of the descriptions and abstractions that would make sense to a human listener. Our research areas include:
- speech, to extract the words, prosody, speaker characteristics, etc.
- music, including transcription, classification, and similarity estimation
- environmental sound, such as everyday acoustic ambiences, or even sound from atypical environments such as underwater
- sound mixtures, composed of any or all of the above, where the challenge is extracting whatever information is available when observations are partial or obscured.
Applications of automatic high-level sound analysis to be developed include:
- indexing, summarization, and searching within large audio archives, such as recorded broadcasts, film catalogs, recordings from personal devices, etc.
- intelligent interaction technologies that have an 'awareness' of their acoustic environment, and can react appropriately
- automatic monitoring devices, e.g. for rapid response to emergencies in public complexes
- intelligent handling of audio and music content, including content-based retrieval, annotation, and recommendation.