Voiko vähästä oppia : koneoppimisen haasteet pienellä aineistolla

This bachelor's thesis deals with machine learning on small datasets. In machine learning, a machine independently improves its performance on a specific task as more experience or data accumulates. Machine learning problems can be divided into classification and regression problems. Usually, ...

Bibliographic Details
Main Author: Kauppinen, Jussi
Other Authors: Informaatioteknologian tiedekunta, Faculty of Information Technology, Informaatioteknologia, Information Technology, Jyväskylän yliopisto, University of Jyväskylä
Format: Bachelor's thesis
Language: fin
Published: 2019
Subjects: koneoppiminen, luokittelu, pieni aineisto, pieni data, regularisointi
Online Access: https://jyx.jyu.fi/handle/123456789/64021
_version_ 1826225812662124544
author Kauppinen, Jussi
author2 Informaatioteknologian tiedekunta Faculty of Information Technology Informaatioteknologia Information Technology Jyväskylän yliopisto University of Jyväskylä
author_facet Kauppinen, Jussi Informaatioteknologian tiedekunta Faculty of Information Technology Informaatioteknologia Information Technology Jyväskylän yliopisto University of Jyväskylä
author_sort Kauppinen, Jussi
datasource_str_mv jyx
description This bachelor's thesis deals with machine learning on small datasets. In machine learning, a machine independently improves its performance on a specific task as more experience or data accumulates. Machine learning problems can be divided into classification and regression problems. Machine learning tasks usually require a large dataset to train an accurate model, but obtaining a sufficiently comprehensive dataset is often a problem. The aim of this thesis is to review the problems that arise when training a machine learning model on a small dataset and to present solutions to them. The thesis was carried out as a literature review. The publications examined deal with these problems and the solutions developed for them. The thesis found that training a model that generalizes well is more challenging with a small dataset, and that overfitting is difficult to avoid. To improve generalization, the thesis presents the SMOTE technique, which generates synthetic additional data, and regularization is used to keep overfitting in check.
first_indexed 2019-09-20T09:13:09Z
format Kandityö
free_online_boolean 1
fullrecord [{"key": "dc.contributor.author", "value": "Kauppinen, Jussi", "language": "", "element": "contributor", "qualifier": "author", "schema": "dc"}, {"key": "dc.date.accessioned", "value": "2019-05-17T06:54:29Z", "language": null, "element": "date", "qualifier": "accessioned", "schema": "dc"}, {"key": "dc.date.available", "value": "2019-05-17T06:54:29Z", "language": null, "element": "date", "qualifier": "available", "schema": "dc"}, {"key": "dc.date.issued", "value": "2019", "language": "", "element": "date", "qualifier": "issued", "schema": "dc"}, {"key": "dc.identifier.uri", "value": "https://jyx.jyu.fi/handle/123456789/64021", "language": null, "element": "identifier", "qualifier": "uri", "schema": "dc"}, {"key": "dc.description.abstract", "value": "T\u00e4m\u00e4 kandidaatintutkielma k\u00e4sittelee koneoppimista pienell\u00e4 aineistolla. Koneoppimisessa kone parantaa suorituskyky\u00e4\u00e4n jonkin tietyn teht\u00e4v\u00e4n ratkaisemiseksi itsen\u00e4isesti sit\u00e4 mukaa kun lis\u00e4\u00e4 kokemusta tai dataa kertyy. Koneoppimisongelmat voidaan jakaa luokittelu- ja regressio-ongelmiin. Yleens\u00e4 koneoppimisteht\u00e4v\u00e4t vaativat ison aineiston tarkan koneoppimismallin opettamiseksi, mutta usein kattavan aineiston hankkiminen muodostuu ongelmaksi. T\u00e4m\u00e4n tutkielman tavoitteena on k\u00e4yd\u00e4 l\u00e4pi mink\u00e4laisia ongelmia koneoppimismallin opetuksessa ilmenee kun k\u00e4ytett\u00e4viss\u00e4 on pieni aineisto ja esitell\u00e4 ratkaisuja n\u00e4ihin ongelmiin. Tutkielma tehtiin kirjallisuuskatsauksena. Tutkitut julkaisut k\u00e4sitteliv\u00e4t edell\u00e4 mainittuja ongelmia, sek\u00e4 niihin kehiteltyj\u00e4 ratkaisuja. Tutkielmassa selvisi, ett\u00e4 pienell\u00e4 aineistolla on haastavampaa opettaa hyvin yleistyv\u00e4\u00e4 koneoppimismallia, ja ylisovittumisen v\u00e4ltt\u00e4minen on vaikeaa. 
Yleistymisen parantamiseksi esitell\u00e4\u00e4n keinotekoista lis\u00e4dataa generoiva SMOTE-tekniikka, ja ylisovittumista yritet\u00e4\u00e4n saada kuriin regularisoinnin avulla", "language": "fi", "element": "description", "qualifier": "abstract", "schema": "dc"}, {"key": "dc.description.abstract", "value": "This bachelor\u2019s thesis deals with machine learning with little data. In machine\nlearning, the machine improves its performance to solve a specific task independently as\nmore experience or data accumulates. Machine learning problems can be divided into classification and regression problems. Usually, machine learning tasks require large data to train an accurate machine learning model, but often obtaining large enough data is problematic. The aim of this thesis is to review the problems encountered in training a machine learning model when there is only little data available and solutions to these problems. The thesis was made as a literature review. The publications examined deal with the above-mentioned problems, as well as the solutions developed for them. In the thesis it became clear that it is more challenging to teach a machine learning model that generalizes well with little material, and it is difficult to avoid overfitting. In order to generalize better, we examine SMOTE technology to generate synthetic data and to prevent overfitting we talk about regularization.", "language": "en", "element": "description", "qualifier": "abstract", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Submitted by Paivi Vuorio (paelvuor@jyu.fi) on 2019-05-17T06:54:29Z\nNo. of bitstreams: 0", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Made available in DSpace on 2019-05-17T06:54:29Z (GMT). No. 
of bitstreams: 0\n Previous issue date: 2019", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.format.extent", "value": "20", "language": "", "element": "format", "qualifier": "extent", "schema": "dc"}, {"key": "dc.language.iso", "value": "fin", "language": null, "element": "language", "qualifier": "iso", "schema": "dc"}, {"key": "dc.rights", "value": "In Copyright", "language": "en", "element": "rights", "qualifier": null, "schema": "dc"}, {"key": "dc.subject.other", "value": "luokittelu", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "pieni data", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "pieni aineisto", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "regularisointi", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.title", "value": "Voiko v\u00e4h\u00e4st\u00e4 oppia : koneoppimisen haasteet pienell\u00e4 aineistolla", "language": "", "element": "title", "qualifier": null, "schema": "dc"}, {"key": "dc.type", "value": "bachelor thesis", "language": null, "element": "type", "qualifier": null, "schema": "dc"}, {"key": "dc.identifier.urn", "value": "URN:NBN:fi:jyu-201905172650", "language": "", "element": "identifier", "qualifier": "urn", "schema": "dc"}, {"key": "dc.type.ontasot", "value": "Bachelor's thesis", "language": "en", "element": "type", "qualifier": "ontasot", "schema": "dc"}, {"key": "dc.type.ontasot", "value": "Kandidaatinty\u00f6", "language": "fi", "element": "type", "qualifier": "ontasot", "schema": "dc"}, {"key": "dc.contributor.faculty", "value": "Informaatioteknologian tiedekunta", "language": "fi", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.faculty", "value": "Faculty of Information Technology", 
"language": "en", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.department", "value": "Informaatioteknologia", "language": "fi", "element": "contributor", "qualifier": "department", "schema": "dc"}, {"key": "dc.contributor.department", "value": "Information Technology", "language": "en", "element": "contributor", "qualifier": "department", "schema": "dc"}, {"key": "dc.contributor.organization", "value": "Jyv\u00e4skyl\u00e4n yliopisto", "language": "fi", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.contributor.organization", "value": "University of Jyv\u00e4skyl\u00e4", "language": "en", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.subject.discipline", "value": "Tietotekniikka", "language": "fi", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "dc.subject.discipline", "value": "Mathematical Information Technology", "language": "en", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "yvv.contractresearch.funding", "value": "0", "language": "", "element": "contractresearch", "qualifier": "funding", "schema": "yvv"}, {"key": "dc.type.coar", "value": "http://purl.org/coar/resource_type/c_7a1f", "language": null, "element": "type", "qualifier": "coar", "schema": "dc"}, {"key": "dc.rights.accesslevel", "value": "openAccess", "language": null, "element": "rights", "qualifier": "accesslevel", "schema": "dc"}, {"key": "dc.type.publication", "value": "bachelorThesis", "language": null, "element": "type", "qualifier": "publication", "schema": "dc"}, {"key": "dc.subject.oppiainekoodi", "value": "602", "language": "", "element": "subject", "qualifier": "oppiainekoodi", "schema": "dc"}, {"key": "dc.subject.yso", "value": "koneoppiminen", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.rights.url", "value": "https://rightsstatements.org/page/InC/1.0/", "language": null, 
"element": "rights", "qualifier": "url", "schema": "dc"}]
id jyx.123456789_64021
language fin
last_indexed 2025-02-18T10:54:33Z
main_date 2019-01-01T00:00:00Z
main_date_str 2019
online_boolean 1
online_urls_str_mv {"url":"https:\/\/jyx.jyu.fi\/bitstreams\/fdd4ea22-a590-4b7a-8594-a491c198fee4\/download","text":"URN:NBN:fi:jyu-201905172650.pdf","source":"jyx","mediaType":"application\/pdf"}
publishDate 2019
record_format qdc
source_str_mv jyx
spellingShingle Kauppinen, Jussi Voiko vähästä oppia : koneoppimisen haasteet pienellä aineistolla luokittelu pieni data pieni aineisto regularisointi Tietotekniikka Mathematical Information Technology 602 koneoppiminen
title Voiko vähästä oppia : koneoppimisen haasteet pienellä aineistolla
title_full Voiko vähästä oppia : koneoppimisen haasteet pienellä aineistolla
title_fullStr Voiko vähästä oppia : koneoppimisen haasteet pienellä aineistolla
title_full_unstemmed Voiko vähästä oppia : koneoppimisen haasteet pienellä aineistolla
title_short Voiko vähästä oppia
title_sort voiko vähästä oppia koneoppimisen haasteet pienellä aineistolla
title_sub koneoppimisen haasteet pienellä aineistolla
title_txtP Voiko vähästä oppia : koneoppimisen haasteet pienellä aineistolla
topic luokittelu pieni data pieni aineisto regularisointi Tietotekniikka Mathematical Information Technology 602 koneoppiminen
topic_facet 602 Mathematical Information Technology Tietotekniikka koneoppiminen luokittelu pieni aineisto pieni data regularisointi
url https://jyx.jyu.fi/handle/123456789/64021 http://www.urn.fi/URN:NBN:fi:jyu-201905172650
work_keys_str_mv AT kauppinenjussi voikovähästäoppiakoneoppimisenhaasteetpienelläaineistolla