MuZero ja mallipohjainen vahvistusoppiminen (MuZero and model-based reinforcement learning)

This thesis examines what model-based reinforcement learning is and how it is used in the MuZero algorithm. MuZero has been tested successfully on both classic board games and visually complex Atari games. MuZero combines...

Bibliographic Details
Main Author: Leinonen, Hertta
Other Authors: Informaatioteknologian tiedekunta, Faculty of Information Technology, Informaatioteknologia, Information Technology, Jyväskylän yliopisto, University of Jyväskylä
Format: Bachelor's thesis
Language: Finnish
Published: 2021
Subjects: MuZero, deep learning, model-based reinforcement learning, Monte Carlo tree search, DeepMind
Online Access: https://jyx.jyu.fi/handle/123456789/75464
datasource_str_mv jyx
description This thesis examines what model-based reinforcement learning is and how the MuZero algorithm uses it. MuZero has been tested successfully on both classic board games and visually complex Atari games. It combines deep model-based reinforcement learning with Monte Carlo tree search, which lets it play very different games without knowing their rules in advance.
first_indexed 2021-05-12T20:02:59Z
format Bachelor's thesis
free_online_boolean 1
fullrecord [{"key": "dc.contributor.advisor", "value": "Annala, Leevi", "language": "", "element": "contributor", "qualifier": "advisor", "schema": "dc"}, {"key": "dc.contributor.author", "value": "Leinonen, Hertta", "language": "", "element": "contributor", "qualifier": "author", "schema": "dc"}, {"key": "dc.date.accessioned", "value": "2021-05-12T05:51:42Z", "language": null, "element": "date", "qualifier": "accessioned", "schema": "dc"}, {"key": "dc.date.available", "value": "2021-05-12T05:51:42Z", "language": null, "element": "date", "qualifier": "available", "schema": "dc"}, {"key": "dc.date.issued", "value": "2021", "language": "", "element": "date", "qualifier": "issued", "schema": "dc"}, {"key": "dc.identifier.uri", "value": "https://jyx.jyu.fi/handle/123456789/75464", "language": null, "element": "identifier", "qualifier": "uri", "schema": "dc"}, {"key": "dc.description.abstract", "value": "Tutkielmassa pyrit\u00e4\u00e4n selvitt\u00e4m\u00e4\u00e4n, mit\u00e4 mallipohjainen vahvistusoppiminen tarkoittaa, ja kuinka sit\u00e4 hy\u00f6dynnet\u00e4\u00e4n MuZero-nimisen teko\u00e4lyn algoritmissa. MuZeroa on testattu menestyksekk\u00e4\u00e4sti sek\u00e4 klassisissa lautapeleiss\u00e4, ett\u00e4 visuaalisesti monimutkaisissa Atari \u2013peleiss\u00e4. MuZero yhdist\u00e4\u00e4 toiminnassaan syv\u00e4n mallipohjaisen vahvistusoppimisen, sek\u00e4 Monte Carlo -puuhaun, saavuttaen kyvyn suoriutua kesken\u00e4\u00e4n hyvin erilaisista peleist\u00e4 tuntematta niiden s\u00e4\u00e4nt\u00f6j\u00e4 entuudestaan.", "language": "fi", "element": "description", "qualifier": "abstract", "schema": "dc"}, {"key": "dc.description.abstract", "value": "The aim of this thesis is to find out what model-based reinforcement learning is and how it is utilized in MuZero\u2019s algorithm. MuZero has been successfully tested in both classic board games and visually complex Atari games. 
MuZero combines deep model-based reinforcement learning with Monte Carlo tree search, achieving the ability to play different games without knowing their rules.", "language": "en", "element": "description", "qualifier": "abstract", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Submitted by Paivi Vuorio (paelvuor@jyu.fi) on 2021-05-12T05:51:42Z\nNo. of bitstreams: 0", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Made available in DSpace on 2021-05-12T05:51:42Z (GMT). No. of bitstreams: 0\n Previous issue date: 2021", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.format.extent", "value": "26", "language": "", "element": "format", "qualifier": "extent", "schema": "dc"}, {"key": "dc.language.iso", "value": "fin", "language": null, "element": "language", "qualifier": "iso", "schema": "dc"}, {"key": "dc.rights", "value": "In Copyright", "language": "en", "element": "rights", "qualifier": null, "schema": "dc"}, {"key": "dc.subject.other", "value": "MuZero", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "syv\u00e4oppiminen", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "mallipohjainen vahvistusoppiminen", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "Monte Carlo -puuhaku", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "DeepMind", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.title", "value": "MuZero ja mallipohjainen vahvistusoppiminen", "language": "", "element": "title", "qualifier": null, "schema": "dc"}, {"key": "dc.type", "value": "bachelor thesis", "language": null, "element": "type", "qualifier": 
null, "schema": "dc"}, {"key": "dc.identifier.urn", "value": "URN:NBN:fi:jyu-202105122744", "language": "", "element": "identifier", "qualifier": "urn", "schema": "dc"}, {"key": "dc.type.ontasot", "value": "Bachelor's thesis", "language": "en", "element": "type", "qualifier": "ontasot", "schema": "dc"}, {"key": "dc.type.ontasot", "value": "Kandidaatinty\u00f6", "language": "fi", "element": "type", "qualifier": "ontasot", "schema": "dc"}, {"key": "dc.contributor.faculty", "value": "Informaatioteknologian tiedekunta", "language": "fi", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.faculty", "value": "Faculty of Information Technology", "language": "en", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.department", "value": "Informaatioteknologia", "language": "fi", "element": "contributor", "qualifier": "department", "schema": "dc"}, {"key": "dc.contributor.department", "value": "Information Technology", "language": "en", "element": "contributor", "qualifier": "department", "schema": "dc"}, {"key": "dc.contributor.organization", "value": "Jyv\u00e4skyl\u00e4n yliopisto", "language": "fi", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.contributor.organization", "value": "University of Jyv\u00e4skyl\u00e4", "language": "en", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.subject.discipline", "value": "Tietotekniikka", "language": "fi", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "dc.subject.discipline", "value": "Mathematical Information Technology", "language": "en", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "yvv.contractresearch.funding", "value": "0", "language": "", "element": "contractresearch", "qualifier": "funding", "schema": "yvv"}, {"key": "dc.type.coar", "value": "http://purl.org/coar/resource_type/c_7a1f", "language": null, "element": 
"type", "qualifier": "coar", "schema": "dc"}, {"key": "dc.rights.accesslevel", "value": "openAccess", "language": null, "element": "rights", "qualifier": "accesslevel", "schema": "dc"}, {"key": "dc.type.publication", "value": "bachelorThesis", "language": null, "element": "type", "qualifier": "publication", "schema": "dc"}, {"key": "dc.subject.oppiainekoodi", "value": "602", "language": "", "element": "subject", "qualifier": "oppiainekoodi", "schema": "dc"}, {"key": "dc.subject.yso", "value": "teko\u00e4ly", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "algoritmit", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "Monte Carlo -menetelm\u00e4t", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "tietotekniikka", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "pelit", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "koneoppiminen", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "lautapelit", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.rights.url", "value": "https://rightsstatements.org/page/InC/1.0/", "language": null, "element": "rights", "qualifier": "url", "schema": "dc"}]
id jyx.123456789_75464
language fin
last_indexed 2025-02-18T10:54:42Z
main_date 2021-01-01T00:00:00Z
main_date_str 2021
online_boolean 1
online_urls_str_mv {"url":"https:\/\/jyx.jyu.fi\/bitstreams\/717cbe4e-e25f-4942-8538-24306c078534\/download","text":"URN:NBN:fi:jyu-202105122744.pdf","source":"jyx","mediaType":"application\/pdf"}
publishDate 2021
record_format qdc
source_str_mv jyx
title MuZero ja mallipohjainen vahvistusoppiminen
topic MuZero; deep learning; model-based reinforcement learning; Monte Carlo tree search; DeepMind; Mathematical Information Technology; 602; artificial intelligence; algorithms; Monte Carlo methods; information technology; games; machine learning; board games
url https://jyx.jyu.fi/handle/123456789/75464 http://www.urn.fi/URN:NBN:fi:jyu-202105122744