Lokien lajittelu koneoppivien menetelmien avulla

Anomaly detection and security incident management rely on event and log data gathered from systems. The growing use and complexity of information systems increase the volume of logs produced, and new methods are needed to collect, organize, and analyze them. The study analyzed...

Bibliographic Details
Main Author: Hiltunen, Jouni
Other Authors: Informaatioteknologian tiedekunta, Faculty of Information Technology, Informaatioteknologia, Information Technology, Jyväskylän yliopisto, University of Jyväskylä
Format: Master's thesis
Language: fin
Published: 2022
Subjects:
Online Access: https://jyx.jyu.fi/handle/123456789/81192
_version_ 1826225756008611840
author Hiltunen, Jouni
author2 Informaatioteknologian tiedekunta Faculty of Information Technology Informaatioteknologia Information Technology Jyväskylän yliopisto University of Jyväskylä
author_facet Hiltunen, Jouni Informaatioteknologian tiedekunta Faculty of Information Technology Informaatioteknologia Information Technology Jyväskylän yliopisto University of Jyväskylä
author_sort Hiltunen, Jouni
datasource_str_mv jyx
description Anomaly detection and security incident management rely on event and log data gathered from systems. The growing use and complexity of information systems increase the volume of logs produced, and new methods are needed to collect, organize, and analyze them. This study analyzed the use cases of logs and sought ways to use machine learning to organize logs, following the design science research method. The aim was to find methods and tools for grouping heterogeneous log entries from diverse log sources into sets belonging to the same event, before anomaly detection and incident management. The study found that audit logs required by data protection regulations, event logs used in system maintenance, and security logs monitored for anomaly detection each place their own requirements on the log management system. In security monitoring, log entries can be treated as itemsets and therefore analyzed with data mining methods. Frequent Pattern Mining and Frequent Pattern Tree in particular were found useful for identifying recurring patterns in the log mass, and these patterns can then be fed to machine learning systems to identify security incidents. Log collection, preprocessing, and storage can share the same computing resources in the form of a distributed data lake, which offers a scalable and cost-effective solution for processing large volumes of data.
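The idea of treating log entries as itemsets can be illustrated with a minimal sketch. This is not the thesis's implementation: the log lines, the key=value token format, and the function names to_items and frequent_itemsets are all hypothetical, and instead of building a Frequent Pattern Tree the sketch simply counts every small combination of items, which yields the same frequent itemsets on small inputs but does not scale the way an FP-tree does.

```python
from collections import Counter
from itertools import combinations

# Hypothetical log lines; real logs would need source-specific parsing.
LOGS = [
    "host=web1 service=sshd event=login_failed user=root",
    "host=web1 service=sshd event=login_failed user=admin",
    "host=web2 service=sshd event=login_failed user=root",
    "host=web1 service=nginx event=request status=200",
]


def to_items(line):
    """Turn one log line into a set of items (whitespace-separated tokens)."""
    return frozenset(line.split())


def frequent_itemsets(transactions, min_support, max_size=2):
    """Count every item combination up to max_size and keep the frequent ones.

    Exhaustive counting is an assumption made for brevity; it stands in for
    FP-tree based mining and is only practical for small inputs.
    """
    counts = Counter()
    for items in transactions:
        ordered = sorted(items)
        for size in range(1, max_size + 1):
            for combo in combinations(ordered, size):
                counts[frozenset(combo)] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


transactions = [to_items(line) for line in LOGS]
patterns = frequent_itemsets(transactions, min_support=3)

# Print the patterns, most frequent and largest first.
for itemset, n in sorted(patterns.items(), key=lambda kv: (-kv[1], -len(kv[0]))):
    print(n, sorted(itemset))
```

With a support threshold of three, the pair service=sshd and event=login_failed surfaces as a recurring structure across hosts. A pattern like this could serve as one grouped input feature for a downstream anomaly detector, in the spirit of the pipeline the abstract describes.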
first_indexed 2022-05-20T20:00:37Z
format Pro gradu
free_online_boolean 1
fullrecord [{"key": "dc.contributor.advisor", "value": "Lehto, Martti", "language": "", "element": "contributor", "qualifier": "advisor", "schema": "dc"}, {"key": "dc.contributor.author", "value": "Hiltunen, Jouni", "language": "", "element": "contributor", "qualifier": "author", "schema": "dc"}, {"key": "dc.date.accessioned", "value": "2022-05-20T06:52:22Z", "language": null, "element": "date", "qualifier": "accessioned", "schema": "dc"}, {"key": "dc.date.available", "value": "2022-05-20T06:52:22Z", "language": null, "element": "date", "qualifier": "available", "schema": "dc"}, {"key": "dc.date.issued", "value": "2022", "language": "", "element": "date", "qualifier": "issued", "schema": "dc"}, {"key": "dc.identifier.uri", "value": "https://jyx.jyu.fi/handle/123456789/81192", "language": null, "element": "identifier", "qualifier": "uri", "schema": "dc"}, {"key": "dc.description.abstract", "value": "Poikkeamantunnistus ja tietoturvapoikkeamien hallinta perustuu j\u00e4rjestelmist\u00e4\nker\u00e4tt\u00e4v\u00e4\u00e4n tapahtuma- ja lokitietoon. Tietoj\u00e4rjestelmien kasvava k\u00e4ytt\u00f6 ja monimutkaisuus kasvattaa samalla kertyv\u00e4\u00e4 lokia ja sen ker\u00e4\u00e4miseen, j\u00e4rjestelyyn ja analysointiin tarvitaan uusia menetelmi\u00e4. Tutkimuksessa analysoitiin lokien\nk\u00e4ytt\u00f6kohteita ja pyrittiin l\u00f6yt\u00e4m\u00e4\u00e4n keinoja hy\u00f6dynt\u00e4\u00e4 koneoppivia j\u00e4rjestelmi\u00e4 lokien j\u00e4rjest\u00e4miseksi suunnittelututkimuksen menetelmin. Pyrkimyksen\u00e4 oli l\u00f6yt\u00e4\u00e4 menetelm\u00e4t ja ty\u00f6kalut, joiden avulla monimuotoisista lokil\u00e4hteist\u00e4 kertyv\u00e4t erimuotoiset lokimerkinn\u00e4t voidaan ryhmitell\u00e4 samaan tapahtumaan liittyviksi joukoiksi ennen poikkeamantunnistusta ja sen hallintaa. 
Tutkimuksessa havaittiin, ett\u00e4 tietosuojas\u00e4\u00e4nn\u00f6sten vaatima loki, tietoj\u00e4rjestelmien yll\u00e4pidossa k\u00e4ytetty loki ja poikkeamantunnistuksessa seurattu loki asettavat kukin omat\nvaatimuksensa lokien k\u00e4sittelyss\u00e4 k\u00e4ytetylle j\u00e4rjestelm\u00e4lle. Tietoturvan seurannassa lokimerkint\u00f6j\u00e4 voidaan k\u00e4sitell\u00e4 alkiojoukkoina, joiden analysoinnissa tiedonlouhintamenetelm\u00e4t erityisesti Frequent Pattern Mining ja Frequent Pattern\nTree ovat k\u00e4ytt\u00f6kelpoisia lokimassan toistuvien rakenteiden tunnistamisessa, jotka voidaan sy\u00f6tt\u00e4\u00e4 koneoppiville j\u00e4rjestelmille tietoturvapoikkeamien tunnistamiseksi. Lokien ker\u00e4ykseen, esik\u00e4sittelyyn ja varastointiin voidaan hy\u00f6dynt\u00e4\u00e4 samoja laskentaresursseja hajautetun tietoaltaan muodossa, joka tarjoaa skaalautuvan ja kustannustehokkaan ratkaisun suurien datamassojen k\u00e4sittelyyn", "language": "fi", "element": "description", "qualifier": "abstract", "schema": "dc"}, {"key": "dc.description.abstract", "value": "Anomaly detection and Security Incident Management is done with event and log data gathered from systems. Constantly growing size, use and complexity in information systems grows the amount of logs formed and new methods are needed to gather, index and analyze them. This study analyzes the log use cases and attempts to find ways to utilize machine learning in log indexing using the design study method. Target was to find methods and tools to group hetereogenerous log entries from different systems by event for anomaly detection and incident management purposes. It was found that audit logs required by regulations, security logs used for Information Security Management and event logs used for system maintenance set different demands for log management system. It was found that log entries can be considered as group of items and therefore analyzed using data mining methods. 
Frequent Pattern Mining and Frequent Pattern Tree were found useful in identifying recurring pat-terns in logs. Frequent patterns can be subsequently used as an input for machine learning systems to identify security incidents. Distributed datalake was found practical in gathering, preprocessing and mining logs in large masses.", "language": "en", "element": "description", "qualifier": "abstract", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Submitted by Paivi Vuorio (paelvuor@jyu.fi) on 2022-05-20T06:52:22Z\nNo. of bitstreams: 0", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Made available in DSpace on 2022-05-20T06:52:22Z (GMT). No. of bitstreams: 0\n Previous issue date: 2022", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.format.extent", "value": "41", "language": "", "element": "format", "qualifier": "extent", "schema": "dc"}, {"key": "dc.format.mimetype", "value": "application/pdf", "language": null, "element": "format", "qualifier": "mimetype", "schema": "dc"}, {"key": "dc.language.iso", "value": "fin", "language": null, "element": "language", "qualifier": "iso", "schema": "dc"}, {"key": "dc.rights", "value": "In Copyright", "language": "en", "element": "rights", "qualifier": null, "schema": "dc"}, {"key": "dc.subject.other", "value": "lokienhallinta", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "poikkeamantunnistus", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.title", "value": "Lokien lajittelu koneoppivien menetelmien avulla", "language": "", "element": "title", "qualifier": null, "schema": "dc"}, {"key": "dc.type", "value": "master thesis", "language": null, "element": "type", "qualifier": null, "schema": "dc"}, {"key": "dc.identifier.urn", "value": "URN:NBN:fi:jyu-202205202824", 
"language": "", "element": "identifier", "qualifier": "urn", "schema": "dc"}, {"key": "dc.type.ontasot", "value": "Pro gradu -tutkielma", "language": "fi", "element": "type", "qualifier": "ontasot", "schema": "dc"}, {"key": "dc.type.ontasot", "value": "Master\u2019s thesis", "language": "en", "element": "type", "qualifier": "ontasot", "schema": "dc"}, {"key": "dc.contributor.faculty", "value": "Informaatioteknologian tiedekunta", "language": "fi", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.faculty", "value": "Faculty of Information Technology", "language": "en", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.department", "value": "Informaatioteknologia", "language": "fi", "element": "contributor", "qualifier": "department", "schema": "dc"}, {"key": "dc.contributor.department", "value": "Information Technology", "language": "en", "element": "contributor", "qualifier": "department", "schema": "dc"}, {"key": "dc.contributor.organization", "value": "Jyv\u00e4skyl\u00e4n yliopisto", "language": "fi", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.contributor.organization", "value": "University of Jyv\u00e4skyl\u00e4", "language": "en", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.subject.discipline", "value": "Kyberturvallisuus", "language": "fi", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "dc.subject.discipline", "value": "Kyberturvallisuus", "language": "en", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "yvv.contractresearch.funding", "value": "0", "language": "", "element": "contractresearch", "qualifier": "funding", "schema": "yvv"}, {"key": "dc.type.coar", "value": "http://purl.org/coar/resource_type/c_bdcc", "language": null, "element": "type", "qualifier": "coar", "schema": "dc"}, {"key": "dc.rights.accesslevel", "value": "openAccess", 
"language": null, "element": "rights", "qualifier": "accesslevel", "schema": "dc"}, {"key": "dc.type.publication", "value": "masterThesis", "language": null, "element": "type", "qualifier": "publication", "schema": "dc"}, {"key": "dc.subject.oppiainekoodi", "value": "601", "language": "", "element": "subject", "qualifier": "oppiainekoodi", "schema": "dc"}, {"key": "dc.subject.yso", "value": "lokitiedostot", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "tiedonlouhinta", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "analyysi", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "tietojenk\u00e4sittely", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "teko\u00e4ly", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "menetelm\u00e4t", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.format.content", "value": "fulltext", "language": null, "element": "format", "qualifier": "content", "schema": "dc"}, {"key": "dc.rights.url", "value": "https://rightsstatements.org/page/InC/1.0/", "language": null, "element": "rights", "qualifier": "url", "schema": "dc"}, {"key": "dc.type.okm", "value": "G2", "language": null, "element": "type", "qualifier": "okm", "schema": "dc"}]
id jyx.123456789_81192
language fin
last_indexed 2025-02-18T10:55:39Z
main_date 2022-01-01T00:00:00Z
main_date_str 2022
online_boolean 1
online_urls_str_mv {"url":"https:\/\/jyx.jyu.fi\/bitstreams\/8ac3e4df-5758-4ee8-b7b1-72acb13ce08e\/download","text":"URN:NBN:fi:jyu-202205202824.pdf","source":"jyx","mediaType":"application\/pdf"}
publishDate 2022
record_format qdc
source_str_mv jyx
spellingShingle Hiltunen, Jouni Lokien lajittelu koneoppivien menetelmien avulla lokienhallinta poikkeamantunnistus Kyberturvallisuus 601 lokitiedostot tiedonlouhinta analyysi tietojenkäsittely tekoäly menetelmät
title Lokien lajittelu koneoppivien menetelmien avulla
title_full Lokien lajittelu koneoppivien menetelmien avulla
title_fullStr Lokien lajittelu koneoppivien menetelmien avulla
title_full_unstemmed Lokien lajittelu koneoppivien menetelmien avulla
title_short Lokien lajittelu koneoppivien menetelmien avulla
title_sort lokien lajittelu koneoppivien menetelmien avulla
title_txtP Lokien lajittelu koneoppivien menetelmien avulla
topic lokienhallinta poikkeamantunnistus Kyberturvallisuus 601 lokitiedostot tiedonlouhinta analyysi tietojenkäsittely tekoäly menetelmät
topic_facet 601 Kyberturvallisuus analyysi lokienhallinta lokitiedostot menetelmät poikkeamantunnistus tekoäly tiedonlouhinta tietojenkäsittely
url https://jyx.jyu.fi/handle/123456789/81192 http://www.urn.fi/URN:NBN:fi:jyu-202205202824
work_keys_str_mv AT hiltunenjouni lokienlajittelukoneoppivienmenetelmienavulla