ROMÁN ECHEVERRI, CARLOS GUSTAVO (Fundación Universitaria San Martín, Bogotá)
HERRERA, PERFECTO (Universitat Pompeu Fabra/Escola Superior de Música de Catalunya)
Automatic description of music for analyzing music productions: a study case in detecting Mellotron sounds in recordings
Download the entire presentation as a video with Carlos’ audio commentary here.
[abstract]In the last few years, digital music collections have become available via global networks in constantly increasing amounts, prompted by recent developments in audio technology and the appearance of innovative online distribution platforms. Music Information Retrieval (MIR), a growing and active interdisciplinary field of research, aims precisely at the problem of describing, organizing, categorizing, browsing and taking advantage of this large bulk of data in different contexts (analysis, exploration, recommendation, creation). Analyzing music recordings is now possible beyond the limitations of classical features (e.g., sonogram features) and collection sizes (i.e., a human-manageable bunch of files). Therefore, an audio recording can be characterized automatically (with a non-negligible amount of errors that may require human supervision) in terms of its music-theoretical features (pitch, scales, chords, rhythm), its similarity to other recordings, its genre, production techniques or musical instruments. In this context, the detection of musical instruments in a specific piece of music is highly relevant to the analysis of music recordings, as instruments define the timbral qualities of any piece of music. Perceptually, instruments determine specific textures, atmospheres, contrasts and distinctiveness in a piece of music. Additionally, instruments give information on the genre and on the historical and geographical origin of the music. In order to detect a musical instrument in a recording, the acoustic features that make the sound of that instrument identifiable or remarkable must be found. To accomplish this, audio descriptors capturing different timbre dimensions are extracted, quantified and coded from raw digital audio signals.
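To illustrate what "extracting timbre descriptors from raw audio" can look like, here is a minimal numpy-only sketch computing two widely used descriptors, spectral centroid and spectral flatness, from a single windowed frame. The function name and the choice of exactly these two descriptors are illustrative assumptions, not the feature set used in the study, which involves many more timbre dimensions computed over whole excerpts.

```python
import numpy as np

def timbre_descriptors(frame, sr=44100):
    """Compute two common timbre descriptors from one audio frame.

    Illustrative only: real MIR systems extract many more descriptors
    (MFCCs, spectral flux, attack time, ...) over overlapping frames.
    """
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    power = spectrum ** 2 + 1e-12          # floor avoids log(0) below
    # Spectral centroid: perceived "brightness" (power-weighted mean frequency)
    centroid = np.sum(freqs * power) / np.sum(power)
    # Spectral flatness: noisiness (geometric over arithmetic mean of power)
    flatness = np.exp(np.mean(np.log(power))) / np.mean(power)
    return centroid, flatness

# A pure 440 Hz tone concentrates power near 440 Hz and is very non-flat,
# while white noise spreads power evenly and has flatness closer to 1.
t = np.arange(2048) / 44100
tone_c, tone_f = timbre_descriptors(np.sin(2 * np.pi * 440 * t))
rng = np.random.default_rng(0)
noise_c, noise_f = timbre_descriptors(rng.standard_normal(2048))
```

In a full system such frame-level values would be aggregated (e.g., mean and variance per excerpt) before being fed to a classifier.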
In this paper we focus the detection of musical instruments on the Mellotron, one of the first sample-playback instruments in history, which has been widely employed in several forms of popular music since the sixties. The Mellotron has been used in music genres as diverse as progressive rock, psychedelia, alternative, art rock, electronica and ambient, and continues to be used to this day. Mellotron sounds present interesting technical and perceptual qualities that make the instrument ideal for the study of timbre descriptors in the context of automatic classification in polyphonic audio. For instance, the electro-mechanical tape mechanism imprints a unified sound on the Mellotron, regardless of the instrument being sampled. Once audio descriptors have been extracted from audio excerpts containing Mellotron sounds, it is possible to train automatic classifiers to detect other fragments containing such sounds (more than 350 excerpts from recordings featuring strings, flute and choir Mellotron sounds were employed in our experiments, achieving recognition rates ranging approximately from 68% to 91% for different models and collections). These experiments show that it is possible to automatically identify this instrument in a polyphonic setting (with some margin of error), and point to audio descriptors that are probably related to the physical mechanism that enables the Mellotron to generate its distinctive sound.
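The classification step described above can be sketched in a few lines: given descriptor vectors for labeled excerpts, train a model and measure its recognition rate on held-out data. The sketch below uses synthetic two-dimensional "descriptors" and a simple nearest-centroid classifier; both the data and the classifier are stand-ins assumed for illustration, not the models or feature dimensions actually evaluated in the study.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in data: each excerpt is a small vector of timbre
# descriptors; "mellotron" excerpts cluster away from "other" excerpts.
mellotron = rng.normal(loc=[0.8, 0.2], scale=0.15, size=(100, 2))
other = rng.normal(loc=[0.3, 0.6], scale=0.15, size=(100, 2))

X = np.vstack([mellotron, other])
y = np.array([1] * 100 + [0] * 100)   # 1 = mellotron, 0 = other

# Random train/test split
idx = rng.permutation(len(X))
train, test = idx[:150], idx[150:]

# Nearest-centroid classifier: predict the class whose mean descriptor
# vector (computed on the training set) is closest to the test vector.
centroids = np.array([X[train][y[train] == c].mean(axis=0) for c in (0, 1)])
dists = np.linalg.norm(X[test][:, None, :] - centroids[None, :, :], axis=2)
pred = dists.argmin(axis=1)

accuracy = (pred == y[test]).mean()   # recognition rate on held-out excerpts
```

With real polyphonic recordings the classes overlap far more than in this toy example, which is consistent with recognition rates below 100% such as the 68%-91% range reported above.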