Segmentation of electronic dance music. / Scarfe, Tim; Koolen, Wouter; Kalnishkan, Yuri.

In: International Journal of Engineering Intelligent Systems for Electrical Engineering and Communications, Vol. 22, No. 3/4, 2014.

Research output: Contribution to journal › Article

Published

Harvard

Scarfe, T, Koolen, W & Kalnishkan, Y 2014, 'Segmentation of electronic dance music', International Journal of Engineering Intelligent Systems for Electrical Engineering and Communications, vol. 22, no. 3/4.

APA

Scarfe, T., Koolen, W., & Kalnishkan, Y. (2014). Segmentation of electronic dance music. International Journal of Engineering Intelligent Systems for Electrical Engineering and Communications, 22(3/4).

Vancouver

Scarfe T, Koolen W, Kalnishkan Y. Segmentation of electronic dance music. International Journal of Engineering Intelligent Systems for Electrical Engineering and Communications. 2014;22(3/4).

Author

Scarfe, Tim ; Koolen, Wouter ; Kalnishkan, Yuri. / Segmentation of electronic dance music. In: International Journal of Engineering Intelligent Systems for Electrical Engineering and Communications. 2014 ; Vol. 22, No. 3/4.

BibTeX

@article{69c92cdab97d4dfaaf7d3495ab35a32a,
title = "Segmentation of electronic dance music",
abstract = "We consider the problem of annotating song changes in DJ-mixed dance music recordings (podcasts, radio shows, live events). It is an extremely laborious process to perform this task manually. We present an algorithm to reconstruct segment boundaries as close as possible to what a human domain expert would create in respect of the same task given a fixed number of boundaries. The algorithm is optimized for the scenario when the number of tracks is known a priori, although it is also capable of estimating the number of tracks, and is evaluated in both circumstances. As the number of segments is known in advance we do not have to rely on local points-of-change heuristics prevalent in common segmentation algorithms. The goal of DJ-mixing is to render track boundaries effectively invisible to human perception. Segmentation is performed on a self-similarity matrix which is derived from normalized cosines of various cost matrices which have themselves been derived from a time-series of Fourier-based spectral features. The cost matrices introduced in this paper capture notions of general self-similarity and also specific notions such as symmetry, contiguity and evolution in respect of time. The segmentation configuration is parametrized and an evolutionary algorithm is executed on a small test set to find optimal parameters for the task of segmentation. Our work is quantitatively assessed on a large corpus (640 hours) of radio show recordings which have been hand-labelled by a domain expert. The method presented could be used on other segmentation tasks and other domains.",
author = "Tim Scarfe and Wouter Koolen and Yuri Kalnishkan",
year = "2014",
language = "English",
volume = "22",
journal = "International Journal of Engineering Intelligent Systems for Electrical Engineering and Communications",
issn = "1472-8915",
publisher = "CRL Publishing",
number = "3/4",

}

RIS

TY - JOUR

T1 - Segmentation of electronic dance music

AU - Scarfe, Tim

AU - Koolen, Wouter

AU - Kalnishkan, Yuri

PY - 2014

Y1 - 2014

AB - We consider the problem of annotating song changes in DJ-mixed dance music recordings (podcasts, radio shows, live events). It is an extremely laborious process to perform this task manually. We present an algorithm to reconstruct segment boundaries as close as possible to what a human domain expert would create in respect of the same task given a fixed number of boundaries. The algorithm is optimized for the scenario when the number of tracks is known a priori, although it is also capable of estimating the number of tracks, and is evaluated in both circumstances. As the number of segments is known in advance we do not have to rely on local points-of-change heuristics prevalent in common segmentation algorithms. The goal of DJ-mixing is to render track boundaries effectively invisible to human perception. Segmentation is performed on a self-similarity matrix which is derived from normalized cosines of various cost matrices which have themselves been derived from a time-series of Fourier-based spectral features. The cost matrices introduced in this paper capture notions of general self-similarity and also specific notions such as symmetry, contiguity and evolution in respect of time. The segmentation configuration is parametrized and an evolutionary algorithm is executed on a small test set to find optimal parameters for the task of segmentation. Our work is quantitatively assessed on a large corpus (640 hours) of radio show recordings which have been hand-labelled by a domain expert. The method presented could be used on other segmentation tasks and other domains.

M3 - Article

VL - 22

JO - International Journal of Engineering Intelligent Systems for Electrical Engineering and Communications

JF - International Journal of Engineering Intelligent Systems for Electrical Engineering and Communications

SN - 1472-8915

IS - 3/4

ER -