Curiosity-driven algorithm for reinforcement learning

One problem of current Reinforcement Learning algorithms is finding a balance between exploitation of existing knowledge and exploration for new experience. A curiosity-based exploration bonus has been proposed to address this problem, but current implementations are vulnerable to stochastic noise inside the environment. The new approach presented in this thesis uses an exploration bonus based on the predicted novelty of the next state, which protects exploration from noise issues during training. This work also introduces a new way of combining extrinsic and intrinsic rewards. Both improvements help to overcome a number of long-standing problems in Reinforcement Learning.
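The abstract compresses the method into two ideas: an intrinsic reward computed from how poorly a learned forward model predicts (features of) the next state, and a rule for mixing that bonus with the environment's extrinsic reward. The sketch below illustrates this general pattern only; it is not the thesis's actual architecture. The fixed random encoder (in the spirit of Random Network Distillation), the layer sizes, and the additive mixing weight beta are assumptions made for the example.

    # Minimal illustrative sketch of a curiosity-style exploration bonus.
    # NOT the thesis's architecture: the random encoder, layer sizes, and
    # the additive reward mix below are assumptions for illustration.
    import torch
    import torch.nn as nn

    class CuriosityBonus(nn.Module):
        def __init__(self, obs_dim, act_dim, feat_dim=32):
            super().__init__()
            # Fixed random encoder: prediction targets that do not chase
            # environment noise (in the spirit of RND-style random features).
            self.encoder = nn.Sequential(
                nn.Linear(obs_dim, feat_dim), nn.ReLU(),
                nn.Linear(feat_dim, feat_dim))
            for p in self.encoder.parameters():
                p.requires_grad_(False)
            # Forward model: predict the next state's features from (s, a).
            self.forward_model = nn.Sequential(
                nn.Linear(obs_dim + act_dim, 64), nn.ReLU(),
                nn.Linear(64, feat_dim))

        def intrinsic_reward(self, obs, act, next_obs):
            pred = self.forward_model(torch.cat([obs, act], dim=-1))
            target = self.encoder(next_obs)
            # Per-sample squared prediction error = "novelty" of next state.
            return ((pred - target) ** 2).mean(dim=-1)

    def combined_reward(r_ext, r_int, beta=0.1):
        # A common baseline mix: weighted sum. The thesis proposes its own
        # combination scheme; this additive form is only a placeholder.
        return r_ext + beta * r_int

    # Usage (hypothetical dimensions): batch of 8 transitions.
    bonus = CuriosityBonus(obs_dim=4, act_dim=2)
    obs, act, next_obs = torch.randn(8, 4), torch.randn(8, 2), torch.randn(8, 4)
    r_int = bonus.intrinsic_reward(obs, act, next_obs)  # shape (8,)

During training, the forward model would be fit on agent experience, so frequently visited transitions become predictable and their bonus decays, while genuinely novel next states keep a high bonus.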

Bibliographic information
Main author: Tsybulko, Vitalii
Advisor: Terziyan, Vagan
Other contributors: Faculty of Information Technology, Information Technology, University of Jyväskylä
Material type: Master's thesis (pro gradu)
Language: English
Published: 2019
Extent: 63 pages
Subjects: reinforcement learning; proximal policy optimisation; curiosity-driven exploration bonus; artificial intelligence; machine learning; rewarding
Discipline: Mathematical Information Technology
Rights: In Copyright (open access)
Links: https://jyx.jyu.fi/handle/123456789/64268 ; http://www.urn.fi/URN:NBN:fi:jyu-201905292863