Testing a spectral-based feature set for audio genre classification

Automatic musical genre classification is an important information retrieval task since it can be applied for practical purposes such as the organization of data collections in the digital music industry. However, this task remains an open question because the current state of the art shows far from...

Bibliographic Details
Main Author: Hartmann, Martín Ariel
Other Authors: Humanistinen tiedekunta, Faculty of Humanities, Musiikin laitos, Department of Music, University of Jyväskylä, Jyväskylän yliopisto
Format: Master's thesis
Language: eng
Published: 2011
Subjects: music information retrieval, music genre classification, polyphonic timbre, feature ranking
Online Access: https://jyx.jyu.fi/handle/123456789/36531
_version_ 1826225760822624256
author Hartmann, Martín Ariel
author2 Humanistinen tiedekunta Faculty of Humanities Musiikin laitos Department of Music University of Jyväskylä Jyväskylän yliopisto
author_facet Hartmann, Martín Ariel Humanistinen tiedekunta Faculty of Humanities Musiikin laitos Department of Music University of Jyväskylä Jyväskylän yliopisto
author_sort Hartmann, Martín Ariel
datasource_str_mv jyx
description Automatic musical genre classification is an important information retrieval task since it can be applied for practical purposes such as the organization of data collections in the digital music industry. However, this task remains an open question because the current state of the art shows far from satisfactory outcomes in terms of classification performance. Moreover, the most common algorithms that are used for this task are not designed for modelling music perception. This study suggests a framework for testing different musical features for use in music genre classification and evaluates the performance of this task based on two musical descriptors. The focus of this study is on automatic classification of music into genres based on audio content. The performance of two sets of timbral descriptors, namely the sub-band fluxes and the mel-frequency cepstral coefficients, is compared. The choice of these particular descriptors is based on their ease or difficulty of interpretation from a perceptual point of view. Classification performance is determined by using a variety of music datasets, learning algorithms, feature selection approaches and combinatorial feature subsets yielded from these descriptors. The results were estimated upon overall classification accuracies, generalization capability, and relevance of these musical descriptors based on feature ranking. According to the results, the sub-band fluxes, perceptually motivated descriptors of polyphonic timbre, performed better than the widely used mel-frequency cepstral coefficients. The former timbral descriptors showed better classification accuracies and lower tendency to overfit than the latter. In a nutshell, this study gives support to using perceptually interpretable timbre descriptors for musical genre classification tasks and suggests the utilization of the sub-band flux set for further content-based tasks in the field of music information retrieval.
first_indexed 2024-09-11T08:49:59Z
format Pro gradu
free_online_boolean 1
fullrecord [{"key": "dc.contributor.author", "value": "Hartmann, Marti\u0301n Ariel", "language": null, "element": "contributor", "qualifier": "author", "schema": "dc"}, {"key": "dc.date.accessioned", "value": "2011-08-03T07:33:32Z", "language": "", "element": "date", "qualifier": "accessioned", "schema": "dc"}, {"key": "dc.date.available", "value": "2011-08-03T07:33:32Z", "language": "", "element": "date", "qualifier": "available", "schema": "dc"}, {"key": "dc.date.issued", "value": "2011", "language": null, "element": "date", "qualifier": "issued", "schema": "dc"}, {"key": "dc.identifier.other", "value": "oai:jykdok.linneanet.fi:1181512", "language": null, "element": "identifier", "qualifier": "other", "schema": "dc"}, {"key": "dc.identifier.uri", "value": "https://jyx.jyu.fi/handle/123456789/36531", "language": "", "element": "identifier", "qualifier": "uri", "schema": "dc"}, {"key": "dc.description.abstract", "value": "Automatic musical genre classification is an important information retrieval task since it can be applied for practical purposes such as the organization of data collections in the digital music industry. However, this task remains an open question because the current state of the art shows far from satisfactory outcomes in terms of classification performance. Moreover, the most common algorithms that are used for this task are not designed for modelling music perception. This study suggests a framework for testing different musical features for use in music genre classification and evaluates the performance of this task based on two musical descriptors.\r\n\r\nThe focus of this study is on automatic classification of music into genres based on audio content. The performance of two sets of timbral descriptors, namely the sub-band fluxes and the mel-frequency cepstral coefficients, is compared. The choice of these particular descriptors is based on their ease or difficulty of interpretation from a perceptual point of view. 
Classification performance is determined by using a variety of music datasets, learning algorithms, feature selection approaches and combinatorial feature subsets yielded from these descriptors. The results were estimated upon overall classification accuracies, generalization capability, and relevance of these musical descriptors based on feature ranking.\r\n\r\nAccording to the results, the sub-band fluxes, perceptually motivated descriptors of polyphonic timbre, performed better than the widely used mel-frequency cepstral coefficients. The former timbral descriptors showed better classification accuracies and lower tendency to overfit than the latter. \r\n\r\nIn a nutshell, this study gives support to using perceptually interpretable timbre desciptors for musical genre classification tasks and suggests the utilization of the sub-band flux set for further content-based tasks in the field of music information retrieval.", "language": "", "element": "description", "qualifier": "abstract", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Submitted using Plone Publishing form by Hannele Saari (hansaari) on 2011-08-03 07:33:30.576529. Form: Admin-lomake Pro gradu -t\u00f6ille (https://kirjasto.jyu.fi/julkaisut/julkaisulomakkeet/admin-lomake-pro-gradu-toille). JyX data:", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Submitted by jyx lomake-julkaisija (jyx-julkaisija@noreply.fi) on 2011-08-03T07:33:32Z\r\nNo. of bitstreams: 2\r\nURN:NBN:fi:jyu-2011080311207.pdf: 1504932 bytes, checksum: 2a77f528da49079c3f6323357654267e (MD5)\r\nlicense.html: 107 bytes, checksum: a7d86e598caa500b1b433bbb9dc8ef1c (MD5)", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Made available in DSpace on 2011-08-03T07:33:32Z (GMT). No. 
of bitstreams: 2\r\nURN:NBN:fi:jyu-2011080311207.pdf: 1504932 bytes, checksum: 2a77f528da49079c3f6323357654267e (MD5)\r\nlicense.html: 107 bytes, checksum: a7d86e598caa500b1b433bbb9dc8ef1c (MD5)\r\n Previous issue date: 2011", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.format.extent", "value": "79 s", "language": null, "element": "format", "qualifier": "extent", "schema": "dc"}, {"key": "dc.format.mimetype", "value": "application/pdf", "language": null, "element": "format", "qualifier": "mimetype", "schema": "dc"}, {"key": "dc.language.iso", "value": "eng", "language": null, "element": "language", "qualifier": "iso", "schema": "dc"}, {"key": "dc.rights", "value": "In Copyright", "language": "en", "element": "rights", "qualifier": null, "schema": "dc"}, {"key": "dc.subject.other", "value": "music information retrieval", "language": null, "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "music genre classification", "language": null, "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "polyphonic timbre", "language": null, "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "feature ranking", "language": null, "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.title", "value": "Testing a spectral-based feature set for audio genre classification", "language": null, "element": "title", "qualifier": null, "schema": "dc"}, {"key": "dc.type", "value": "master thesis", "language": null, "element": "type", "qualifier": null, "schema": "dc"}, {"key": "dc.identifier.urn", "value": "URN:NBN:fi:jyu-2011080311207", "language": null, "element": "identifier", "qualifier": "urn", "schema": "dc"}, {"key": "dc.type.dcmitype", "value": "Text", "language": "en", "element": "type", "qualifier": "dcmitype", "schema": "dc"}, {"key": "dc.type.ontasot", "value": "Pro gradu 
-tutkielma", "language": "fi", "element": "type", "qualifier": "ontasot", "schema": "dc"}, {"key": "dc.type.ontasot", "value": "Master\u2019s thesis", "language": "en", "element": "type", "qualifier": "ontasot", "schema": "dc"}, {"key": "dc.contributor.faculty", "value": "Humanistinen tiedekunta", "language": "fi", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.faculty", "value": "Faculty of Humanities", "language": "en", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.department", "value": "Musiikin laitos", "language": "fi", "element": "contributor", "qualifier": "department", "schema": "dc"}, {"key": "dc.contributor.department", "value": "Department of Music", "language": "en", "element": "contributor", "qualifier": "department", "schema": "dc"}, {"key": "dc.contributor.organization", "value": "University of Jyv\u00e4skyl\u00e4", "language": "en", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.contributor.organization", "value": "Jyv\u00e4skyl\u00e4n yliopisto", "language": "fi", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.subject.discipline", "value": "Music, Mind and Technology (maisteriohjelma)", "language": "fi", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "dc.subject.discipline", "value": "Master's Degree Programme in Music, Mind and Technology", "language": "en", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "dc.subject.method", "value": "mallintaminen", "language": "", "element": "subject", "qualifier": "method", "schema": "dc"}, {"key": "dc.date.updated", "value": "2011-08-03T07:33:32Z", "language": "", "element": "date", "qualifier": "updated", "schema": "dc"}, {"key": "dc.type.coar", "value": "http://purl.org/coar/resource_type/c_bdcc", "language": null, "element": "type", "qualifier": "coar", "schema": "dc"}, {"key": 
"dc.rights.accesslevel", "value": "openAccess", "language": "fi", "element": "rights", "qualifier": "accesslevel", "schema": "dc"}, {"key": "dc.type.publication", "value": "masterThesis", "language": null, "element": "type", "qualifier": "publication", "schema": "dc"}, {"key": "dc.subject.oppiainekoodi", "value": "3054", "language": null, "element": "subject", "qualifier": "oppiainekoodi", "schema": "dc"}, {"key": "dc.subject.yso", "value": "musiikki", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "genret", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "s\u00e4hk\u00f6iset palvelut", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "luokitus", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.format.content", "value": "fulltext", "language": null, "element": "format", "qualifier": "content", "schema": "dc"}, {"key": "dc.rights.url", "value": "https://rightsstatements.org/page/InC/1.0/", "language": null, "element": "rights", "qualifier": "url", "schema": "dc"}, {"key": "dc.type.okm", "value": "G2", "language": null, "element": "type", "qualifier": "okm", "schema": "dc"}]
id jyx.123456789_36531
language eng
last_indexed 2025-02-18T10:56:23Z
main_date 2011-01-01T00:00:00Z
main_date_str 2011
online_boolean 1
online_urls_str_mv {"url":"https:\/\/jyx.jyu.fi\/bitstreams\/d12d32ff-784c-442c-9fdc-a31d71fa5fce\/download","text":"URN:NBN:fi:jyu-2011080311207.pdf","source":"jyx","mediaType":"application\/pdf"}
publishDate 2011
record_format qdc
source_str_mv jyx
spellingShingle Hartmann, Martín Ariel Testing a spectral-based feature set for audio genre classification music information retrieval music genre classification polyphonic timbre feature ranking Music, Mind and Technology (maisteriohjelma) Master's Degree Programme in Music, Mind and Technology mallintaminen 3054 musiikki genret sähköiset palvelut luokitus
title Testing a spectral-based feature set for audio genre classification
title_full Testing a spectral-based feature set for audio genre classification
title_fullStr Testing a spectral-based feature set for audio genre classification
title_full_unstemmed Testing a spectral-based feature set for audio genre classification
title_short Testing a spectral-based feature set for audio genre classification
title_sort testing a spectral based feature set for audio genre classification
title_txtP Testing a spectral-based feature set for audio genre classification
topic music information retrieval music genre classification polyphonic timbre feature ranking Music, Mind and Technology (maisteriohjelma) Master's Degree Programme in Music, Mind and Technology mallintaminen 3054 musiikki genret sähköiset palvelut luokitus
topic_facet 3054 Master's Degree Programme in Music, Mind and Technology Music, Mind and Technology (maisteriohjelma) feature ranking genret luokitus mallintaminen music genre classification music information retrieval musiikki polyphonic timbre sähköiset palvelut
url https://jyx.jyu.fi/handle/123456789/36531 http://www.urn.fi/URN:NBN:fi:jyu-2011080311207
work_keys_str_mv AT hartmannmartinariel testingaspectralbasedfeaturesetforaudiogenreclassification