Curiosity-driven algorithm for reinforcement learning

One problem of current Reinforcement Learning algorithms is finding a balance between exploitation of existing knowledge and exploration for new experience. A curiosity-based exploration bonus has been proposed to address this problem, but current implementations are vulnerable to stochastic noise inside the environment. The new approach presented in this thesis uses an exploration bonus based on the predicted novelty of the next state, which protects exploration from noise issues during training. This work also introduces a new way of combining extrinsic and intrinsic rewards. Both improvements help to overcome a number of long-standing problems in Reinforcement Learning.
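The abstract compresses the method into two ideas: an intrinsic reward computed from how poorly a learned forward model predicts (features of) the next state, and a rule for mixing that bonus with the environment's extrinsic reward. The sketch below illustrates this general pattern only; it is not the thesis's actual architecture. The fixed random encoder (in the spirit of Random Network Distillation), the layer sizes, and the additive mixing weight beta are assumptions made for the example.

    # Minimal illustrative sketch of a curiosity-style exploration bonus.
    # NOT the thesis's architecture: the random encoder, layer sizes, and
    # the additive reward mix below are assumptions for illustration.
    import torch
    import torch.nn as nn

    class CuriosityBonus(nn.Module):
        def __init__(self, obs_dim, act_dim, feat_dim=32):
            super().__init__()
            # Fixed random encoder: prediction targets that do not chase
            # environment noise (in the spirit of RND-style random features).
            self.encoder = nn.Sequential(
                nn.Linear(obs_dim, feat_dim), nn.ReLU(),
                nn.Linear(feat_dim, feat_dim))
            for p in self.encoder.parameters():
                p.requires_grad_(False)
            # Forward model: predict the next state's features from (s, a).
            self.forward_model = nn.Sequential(
                nn.Linear(obs_dim + act_dim, 64), nn.ReLU(),
                nn.Linear(64, feat_dim))

        def intrinsic_reward(self, obs, act, next_obs):
            pred = self.forward_model(torch.cat([obs, act], dim=-1))
            target = self.encoder(next_obs)
            # Per-sample squared prediction error = "novelty" of next state.
            return ((pred - target) ** 2).mean(dim=-1)

    def combined_reward(r_ext, r_int, beta=0.1):
        # A common baseline mix: weighted sum. The thesis proposes its own
        # combination scheme; this additive form is only a placeholder.
        return r_ext + beta * r_int

    # Usage (hypothetical dimensions): batch of 8 transitions.
    bonus = CuriosityBonus(obs_dim=4, act_dim=2)
    obs, act, next_obs = torch.randn(8, 4), torch.randn(8, 2), torch.randn(8, 4)
    r_int = bonus.intrinsic_reward(obs, act, next_obs)  # shape (8,)

During training, the forward model would be fit on agent experience, so frequently visited transitions become predictable and their bonus decays, while genuinely novel next states keep a high bonus.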

Bibliographic information
Main author: Tsybulko, Vitalii
Advisor: Terziyan, Vagan
Other contributors: Faculty of Information Technology, Information Technology, University of Jyväskylä
Material type: Master's thesis (pro gradu)
Language: English
Published: 2019
Extent: 63 pages
Subjects: reinforcement learning; proximal policy optimisation; curiosity-driven exploration bonus; artificial intelligence; machine learning; rewarding
Discipline: Mathematical Information Technology
Rights: In Copyright (open access)
Links: https://jyx.jyu.fi/handle/123456789/64268 ; http://www.urn.fi/URN:NBN:fi:jyu-201905292863