Puhujariippuvainen puhekomentojentunnistus neuroverkoilla

This study sought a speech command recognition model that could be trained on a small number of recordings to recognize a few predefined commands spoken by a specific person. Three neural network models trained on speaker-dependent data were compared on, among other things, recognition accuracy and ...

Full details

Bibliographic details
Main author: Nummelin, Panu
Other authors: Informaatioteknologian tiedekunta, Faculty of Information Technology, Informaatioteknologia, Information Technology, Jyväskylän yliopisto, University of Jyväskylä
Format: Master's thesis (pro gradu)
Language: fin
Published: 2021
Subjects:
Links: https://jyx.jyu.fi/handle/123456789/76696
_version_ 1828193063197999104
author Nummelin, Panu
author2 Informaatioteknologian tiedekunta Faculty of Information Technology Informaatioteknologia Information Technology Jyväskylän yliopisto University of Jyväskylä
datasource_str_mv jyx
description This study sought a speech command recognition model that could be trained on a small number of recordings to recognize a few predefined commands spoken by a specific person. Three neural network models trained on speaker-dependent data were compared on recognition accuracy and inference speed, among other metrics. Even the best-performing model in the study was deemed too unreliable for practical use. Including non-command audio data in training and applying data augmentation were proposed as possible ways to improve the model.
first_indexed 2021-06-18T20:01:29Z
format Pro gradu
free_online_boolean 1
fullrecord [{"key": "dc.contributor.advisor", "value": "P\u00f6l\u00f6nen, Ilkka", "language": "", "element": "contributor", "qualifier": "advisor", "schema": "dc"}, {"key": "dc.contributor.advisor", "value": "Hakanen, Jussi", "language": "", "element": "contributor", "qualifier": "advisor", "schema": "dc"}, {"key": "dc.contributor.author", "value": "Nummelin, Panu", "language": "", "element": "contributor", "qualifier": "author", "schema": "dc"}, {"key": "dc.date.accessioned", "value": "2021-06-18T07:42:14Z", "language": null, "element": "date", "qualifier": "accessioned", "schema": "dc"}, {"key": "dc.date.available", "value": "2021-06-18T07:42:14Z", "language": null, "element": "date", "qualifier": "available", "schema": "dc"}, {"key": "dc.date.issued", "value": "2021", "language": "", "element": "date", "qualifier": "issued", "schema": "dc"}, {"key": "dc.identifier.uri", "value": "https://jyx.jyu.fi/handle/123456789/76696", "language": null, "element": "identifier", "qualifier": "uri", "schema": "dc"}, {"key": "dc.description.abstract", "value": "T\u00e4ss\u00e4 tutkimuksessa etsittiin puhekomennontunnistusmallia, joka voitaisiin kouluttaa pienell\u00e4 m\u00e4\u00e4r\u00e4ll\u00e4 \u00e4\u00e4nitteit\u00e4 tunnistamaan muutamia ennalta m\u00e4\u00e4r\u00e4ttyj\u00e4 tietyn henkil\u00f6n komentoja. Kolmea puhujariippuvaisella datalla koulutettua neuroverkkomallia vertailtiin muun muassa tunnistustarkkuuden ja tunnistusnopeuden suhteen. Tutkimuksessa parhaitenkin suoriutunut malli todettiin liian ep\u00e4luotettavaksi k\u00e4yt\u00e4nn\u00f6n k\u00e4ytt\u00f6\u00f6n. 
Mahdollisiksi tavoiksi parantaa mallia esitettiin muuksi kuin komennoiksi luokiteltavan \u00e4\u00e4nidatan ottaminen osaksi koulutusta ja data-augmentaatio.", "language": "fi", "element": "description", "qualifier": "abstract", "schema": "dc"}, {"key": "dc.description.abstract", "value": "In this study a speech command recognition model that could be trained with small amount of data to recognize a few predefined commands spoken by a specific person was sought. Three neural network models trained with speaker-dependent data were compared by their recognition accuracy and inference speed among other metrics. Even the best performing model of the study was deemed to be unsuitable for practical application. Integrating non-command speech data into the training process and data-augmentation were brought up as possible ways to improve the model's performance.", "language": "en", "element": "description", "qualifier": "abstract", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Submitted by Paivi Vuorio (paelvuor@jyu.fi) on 2021-06-18T07:42:14Z\nNo. of bitstreams: 0", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Made available in DSpace on 2021-06-18T07:42:14Z (GMT). No. 
of bitstreams: 0\n Previous issue date: 2021", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.format.extent", "value": "54", "language": "", "element": "format", "qualifier": "extent", "schema": "dc"}, {"key": "dc.format.mimetype", "value": "application/pdf", "language": null, "element": "format", "qualifier": "mimetype", "schema": "dc"}, {"key": "dc.language.iso", "value": "fin", "language": null, "element": "language", "qualifier": "iso", "schema": "dc"}, {"key": "dc.rights", "value": "In Copyright", "language": "en", "element": "rights", "qualifier": null, "schema": "dc"}, {"key": "dc.title", "value": "Puhujariippuvainen puhekomentojentunnistus neuroverkoilla", "language": "", "element": "title", "qualifier": null, "schema": "dc"}, {"key": "dc.type", "value": "master thesis", "language": null, "element": "type", "qualifier": null, "schema": "dc"}, {"key": "dc.identifier.urn", "value": "URN:NBN:fi:jyu-202106183892", "language": "", "element": "identifier", "qualifier": "urn", "schema": "dc"}, {"key": "dc.type.ontasot", "value": "Pro gradu -tutkielma", "language": "fi", "element": "type", "qualifier": "ontasot", "schema": "dc"}, {"key": "dc.type.ontasot", "value": "Master\u2019s thesis", "language": "en", "element": "type", "qualifier": "ontasot", "schema": "dc"}, {"key": "dc.contributor.faculty", "value": "Informaatioteknologian tiedekunta", "language": "fi", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.faculty", "value": "Faculty of Information Technology", "language": "en", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.department", "value": "Informaatioteknologia", "language": "fi", "element": "contributor", "qualifier": "department", "schema": "dc"}, {"key": "dc.contributor.department", "value": "Information Technology", "language": "en", "element": "contributor", "qualifier": "department", "schema": "dc"}, {"key": 
"dc.contributor.organization", "value": "Jyv\u00e4skyl\u00e4n yliopisto", "language": "fi", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.contributor.organization", "value": "University of Jyv\u00e4skyl\u00e4", "language": "en", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.subject.discipline", "value": "Tietotekniikka", "language": "fi", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "dc.subject.discipline", "value": "Mathematical Information Technology", "language": "en", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "yvv.contractresearch.funding", "value": "0", "language": "", "element": "contractresearch", "qualifier": "funding", "schema": "yvv"}, {"key": "dc.type.coar", "value": "http://purl.org/coar/resource_type/c_bdcc", "language": null, "element": "type", "qualifier": "coar", "schema": "dc"}, {"key": "dc.rights.accesslevel", "value": "openAccess", "language": null, "element": "rights", "qualifier": "accesslevel", "schema": "dc"}, {"key": "dc.type.publication", "value": "masterThesis", "language": null, "element": "type", "qualifier": "publication", "schema": "dc"}, {"key": "dc.subject.oppiainekoodi", "value": "602", "language": "", "element": "subject", "qualifier": "oppiainekoodi", "schema": "dc"}, {"key": "dc.subject.yso", "value": "neuroverkot", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "puheentunnistus", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "matemaattiset mallit", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.format.content", "value": "fulltext", "language": null, "element": "format", "qualifier": "content", "schema": "dc"}, {"key": "dc.rights.url", "value": "https://rightsstatements.org/page/InC/1.0/", "language": null, "element": 
"rights", "qualifier": "url", "schema": "dc"}, {"key": "dc.type.okm", "value": "G2", "language": null, "element": "type", "qualifier": "okm", "schema": "dc"}]
id jyx.123456789_76696
language fin
last_indexed 2025-03-31T20:02:22Z
main_date 2021-01-01T00:00:00Z
main_date_str 2021
online_boolean 1
online_urls_str_mv {"url":"https:\/\/jyx.jyu.fi\/bitstreams\/4f484235-d805-4bc7-a67c-4ebdcc0287b6\/download","text":"URN:NBN:fi:jyu-202106183892.pdf","source":"jyx","mediaType":"application\/pdf"}
publishDate 2021
record_format qdc
source_str_mv jyx
spellingShingle Nummelin, Panu Puhujariippuvainen puhekomentojentunnistus neuroverkoilla Tietotekniikka Mathematical Information Technology 602 neuroverkot puheentunnistus matemaattiset mallit
title Puhujariippuvainen puhekomentojentunnistus neuroverkoilla
topic Tietotekniikka Mathematical Information Technology 602 neuroverkot puheentunnistus matemaattiset mallit
topic_facet 602 Mathematical Information Technology Tietotekniikka matemaattiset mallit neuroverkot puheentunnistus
url https://jyx.jyu.fi/handle/123456789/76696 http://www.urn.fi/URN:NBN:fi:jyu-202106183892
work_keys_str_mv AT nummelinpanu puhujariippuvainenpuhekomentojentunnistusneuroverkoilla