Semi-supervised deep learning for the classification of eldercare workers’ sentiments

There is a broad body of research on text classification, but only part of it is based on semi-supervised deep neural networks, especially when the training data is not in English or another heavily studied language. In this master's thesis we review semi-supervised deep learning...

Bibliographic Details
Main Author: Toivanen, Ida
Other Authors: Informaatioteknologian tiedekunta, Faculty of Information Technology, Informaatioteknologia, Information Technology, Jyväskylän yliopisto, University of Jyväskylä
Format: Master's thesis
Language: English
Published: 2022
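Subjects: natural language processing, text classification, semi-supervised learning, machine learning, deep learning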
Online Access: https://jyx.jyu.fi/handle/123456789/81526
_version_ 1826225741953499136
author Toivanen, Ida
author2 Informaatioteknologian tiedekunta Faculty of Information Technology Informaatioteknologia Information Technology Jyväskylän yliopisto University of Jyväskylä
author_facet Toivanen, Ida Informaatioteknologian tiedekunta Faculty of Information Technology Informaatioteknologia Information Technology Jyväskylän yliopisto University of Jyväskylä
author_sort Toivanen, Ida
datasource_str_mv jyx
description Text classification has been researched extensively, but only a small share of that work uses deep neural networks trained with semi-supervised learning, especially for languages other than English and other widely studied languages. This thesis reviews the literature on semi-supervised deep learning methods for text classification and then implements three semi-supervised text classification methods. The methods are trained and tested on a small Finnish-language dataset. The results suggest that regularization should be considered when training with semi-supervised methods, particularly on small datasets, which easily lead to overfitting. More research on regularization and on Finnish deep learning models is needed to form a more comprehensive view of the applicability and reliability of semi-supervised methods for text classification in natural language processing.
first_indexed 2024-09-11T08:50:45Z
format Pro gradu
free_online_boolean 1
fullrecord [{"key": "dc.contributor.advisor", "value": "\u00c4yr\u00e4m\u00f6, Sami", "language": "", "element": "contributor", "qualifier": "advisor", "schema": "dc"}, {"key": "dc.contributor.advisor", "value": "Jauhiainen, Susanne", "language": "", "element": "contributor", "qualifier": "advisor", "schema": "dc"}, {"key": "dc.contributor.author", "value": "Toivanen, Ida", "language": "", "element": "contributor", "qualifier": "author", "schema": "dc"}, {"key": "dc.date.accessioned", "value": "2022-06-07T10:17:16Z", "language": null, "element": "date", "qualifier": "accessioned", "schema": "dc"}, {"key": "dc.date.available", "value": "2022-06-07T10:17:16Z", "language": null, "element": "date", "qualifier": "available", "schema": "dc"}, {"key": "dc.date.issued", "value": "2022", "language": "", "element": "date", "qualifier": "issued", "schema": "dc"}, {"key": "dc.identifier.uri", "value": "https://jyx.jyu.fi/handle/123456789/81526", "language": null, "element": "identifier", "qualifier": "uri", "schema": "dc"}, {"key": "dc.description.abstract", "value": "Tekstin luokitteluun on olemassa laaja tutkimuksen kirjo, mutta vain osa siit\u00e4 on puoliohjattujen syvien neuroverkkojen pohjalta tehty\u00e4 \u2013 etenkin, kun opetusaineisto on ollut englannin kielell\u00e4, tai muulla huomattavan paljon tutkitulla kielell\u00e4. T\u00e4ss\u00e4 pro gradussa k\u00e4ymme l\u00e4pi puoliohjattujen syv\u00e4oppimismenetelmien kirjallisuutta tekstin luokittelussa, ja luomme k\u00e4yt\u00e4nn\u00f6n toteutuksen kolmelle puoliohjatulle tekstin luokittelumenetelm\u00e4lle. N\u00e4m\u00e4 menetelm\u00e4t opetetaan ja testataan pienenpuoleisella, suomenkielisell\u00e4 aineistolla. Tulosten perusteella voitaisiin sanoa, ett\u00e4 puoliohjattujen menetelmien yhteydess\u00e4 on kannattavaa k\u00e4ytt\u00e4\u00e4 regularisointimenetelmi\u00e4 ylisovittumisen ehk\u00e4isemiseksi, varsinkin kun opetusaineisto on pieni. 
Jotta voitaisiin saada kokonaisvaltaisempi kuva eri puoliohjattujen menetelmien kannattavuudesta ja luotettavuudesta luonnollisen kielen luokitteluteht\u00e4v\u00e4ss\u00e4, olisi suomenkielisist\u00e4 syv\u00e4oppimismalleista ja regularisoinnista hyv\u00e4 tehd\u00e4 lis\u00e4\u00e4 tutkimusta.", "language": "fi", "element": "description", "qualifier": "abstract", "schema": "dc"}, {"key": "dc.description.abstract", "value": "There exists extensive research for text classification, but only a handful of it is put into practice by deep neural networks that use semi-supervised learning \u2013 especially when semi-supervised deep neural networks are not trained in English, or other majorly studied languages. In this thesis we go through previous literature regarding semi-supervised deep learning methods for text classification, and then build a hands-on solution for three semi-supervised text classification methods. These methods are trained and tested on a small dataset, that is in Finnish. The results suggest that regularization methods should be taken into consideration when using semi-supervised methods for training \u2013 particularly when using smaller datasets that easily leads to overfitting. More research on regularization and Finnish deep learning models should be conducted to have a more comprehensive view on the applicability and reliability of text classification in natural language processing.", "language": "en", "element": "description", "qualifier": "abstract", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Submitted by Miia Hakanen (mihakane@jyu.fi) on 2022-06-07T10:17:16Z\nNo. of bitstreams: 0", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Made available in DSpace on 2022-06-07T10:17:16Z (GMT). No. 
of bitstreams: 0\n Previous issue date: 2022", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.format.extent", "value": "63", "language": "", "element": "format", "qualifier": "extent", "schema": "dc"}, {"key": "dc.format.mimetype", "value": "application/pdf", "language": null, "element": "format", "qualifier": "mimetype", "schema": "dc"}, {"key": "dc.language.iso", "value": "eng", "language": null, "element": "language", "qualifier": "iso", "schema": "dc"}, {"key": "dc.rights", "value": "In Copyright", "language": "en", "element": "rights", "qualifier": null, "schema": "dc"}, {"key": "dc.subject.other", "value": "natural language processing", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "text classification", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "semi-supervised learning", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.title", "value": "Semi-supervised deep learning for the classification of eldercare workers\u2019 sentiments", "language": "", "element": "title", "qualifier": null, "schema": "dc"}, {"key": "dc.type", "value": "master thesis", "language": null, "element": "type", "qualifier": null, "schema": "dc"}, {"key": "dc.identifier.urn", "value": "URN:NBN:fi:jyu-202206073142", "language": "", "element": "identifier", "qualifier": "urn", "schema": "dc"}, {"key": "dc.type.ontasot", "value": "Pro gradu -tutkielma", "language": "fi", "element": "type", "qualifier": "ontasot", "schema": "dc"}, {"key": "dc.type.ontasot", "value": "Master\u2019s thesis", "language": "en", "element": "type", "qualifier": "ontasot", "schema": "dc"}, {"key": "dc.contributor.faculty", "value": "Informaatioteknologian tiedekunta", "language": "fi", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.faculty", 
"value": "Faculty of Information Technology", "language": "en", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.department", "value": "Informaatioteknologia", "language": "fi", "element": "contributor", "qualifier": "department", "schema": "dc"}, {"key": "dc.contributor.department", "value": "Information Technology", "language": "en", "element": "contributor", "qualifier": "department", "schema": "dc"}, {"key": "dc.contributor.organization", "value": "Jyv\u00e4skyl\u00e4n yliopisto", "language": "fi", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.contributor.organization", "value": "University of Jyv\u00e4skyl\u00e4", "language": "en", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.subject.discipline", "value": "Tietotekniikka", "language": "fi", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "dc.subject.discipline", "value": "Mathematical Information Technology", "language": "en", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "yvv.contractresearch.funding", "value": "0", "language": "", "element": "contractresearch", "qualifier": "funding", "schema": "yvv"}, {"key": "dc.type.coar", "value": "http://purl.org/coar/resource_type/c_bdcc", "language": null, "element": "type", "qualifier": "coar", "schema": "dc"}, {"key": "dc.rights.accesslevel", "value": "openAccess", "language": null, "element": "rights", "qualifier": "accesslevel", "schema": "dc"}, {"key": "dc.type.publication", "value": "masterThesis", "language": null, "element": "type", "qualifier": "publication", "schema": "dc"}, {"key": "dc.subject.oppiainekoodi", "value": "602", "language": "", "element": "subject", "qualifier": "oppiainekoodi", "schema": "dc"}, {"key": "dc.subject.yso", "value": "koneoppiminen", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": 
"syv\u00e4oppiminen", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "machine learning", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "deep learning", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.format.content", "value": "fulltext", "language": null, "element": "format", "qualifier": "content", "schema": "dc"}, {"key": "dc.rights.url", "value": "https://rightsstatements.org/page/InC/1.0/", "language": null, "element": "rights", "qualifier": "url", "schema": "dc"}, {"key": "dc.type.okm", "value": "G2", "language": null, "element": "type", "qualifier": "okm", "schema": "dc"}]
id jyx.123456789_81526
language eng
last_indexed 2025-02-18T10:56:32Z
main_date 2022-01-01T00:00:00Z
main_date_str 2022
online_boolean 1
online_urls_str_mv {"url":"https:\/\/jyx.jyu.fi\/bitstreams\/1f4c430b-7d5e-40bc-8f69-ce40ccaa01cd\/download","text":"URN:NBN:fi:jyu-202206073142.pdf","source":"jyx","mediaType":"application\/pdf"}
publishDate 2022
record_format qdc
source_str_mv jyx
spellingShingle Toivanen, Ida Semi-supervised deep learning for the classification of eldercare workers’ sentiments natural language processing text classification semi-supervised learning Tietotekniikka Mathematical Information Technology 602 koneoppiminen syväoppiminen machine learning deep learning
title Semi-supervised deep learning for the classification of eldercare workers’ sentiments
title_full Semi-supervised deep learning for the classification of eldercare workers’ sentiments
title_fullStr Semi-supervised deep learning for the classification of eldercare workers’ sentiments
title_full_unstemmed Semi-supervised deep learning for the classification of eldercare workers’ sentiments
title_short Semi-supervised deep learning for the classification of eldercare workers’ sentiments
title_sort semi supervised deep learning for the classification of eldercare workers sentiments
title_txtP Semi-supervised deep learning for the classification of eldercare workers’ sentiments
topic natural language processing text classification semi-supervised learning Tietotekniikka Mathematical Information Technology 602 koneoppiminen syväoppiminen machine learning deep learning
topic_facet 602 Mathematical Information Technology Tietotekniikka deep learning koneoppiminen machine learning natural language processing semi-supervised learning syväoppiminen text classification
url https://jyx.jyu.fi/handle/123456789/81526 http://www.urn.fi/URN:NBN:fi:jyu-202206073142
work_keys_str_mv AT toivanenida semisuperviseddeeplearningfortheclassificationofeldercareworkerssentiments