Machine learning in intrusion detection: topics from scientific literature

Due to the traits of machine learning, many of its techniques are used in intrusion detection. The current literature on machine learning in intrusion detection lacks a good overview of the research landscape. Given the amount of existing data, traditional meth...

Bibliographic Details
Main Author: Peronius, Elina
Other Authors: Informaatioteknologian tiedekunta, Faculty of Information Technology, Informaatioteknologia, Information Technology, Jyväskylän yliopisto, University of Jyväskylä
Format: Master's thesis
Language: eng
Published: 2020
Subjects: intrusion detection, topic modelling, cyber attacks, machine learning
Online Access: https://jyx.jyu.fi/handle/123456789/69743
_version_ 1826225754842595328
author Peronius, Elina
author2 Informaatioteknologian tiedekunta Faculty of Information Technology Informaatioteknologia Information Technology Jyväskylän yliopisto University of Jyväskylä
author_facet Peronius, Elina Informaatioteknologian tiedekunta Faculty of Information Technology Informaatioteknologia Information Technology Jyväskylän yliopisto University of Jyväskylä
author_sort Peronius, Elina
datasource_str_mv jyx
description Due to the traits of machine learning, many of its techniques are used in intrusion detection. The current literature on machine learning in intrusion detection lacks a good overview of the research landscape. Given the amount of existing data, using traditional methods to make sense of the literature would be laborious and ineffective. This study approaches the problem by using an automated text analysis method called dynamic topic modelling. Dynamic topic modelling can capture the evolution of topics over time, which makes it a good modelling option for a document collection that reflects evolving content. Using the model, 21 topics were acquired, of which 15 were deemed interpretable. The interpretable topics were labelled, though the labelling reflects the opinion of only one person. The main contribution of this study is the mapping of the current research landscape. The machine learning techniques used are a well-studied area, which makes identifying the different contexts in which those techniques are applied the more interesting part of the findings. Several limitations can be identified in data collection, data pre-processing, model evaluation and topic interpretation. This means that the validity of the results needs to be questioned to a degree. Due to the nature of the selected text analysis method, the results lack the richness often associated with traditional research methods. For this reason, the suggestions for further research present topics that aim to address this shortcoming. For this area of research, understanding the future evolution of the topics and identifying emerging topics would also be valuable.
first_indexed 2024-09-11T08:49:19Z
format Pro gradu
free_online_boolean 1
fullrecord [{"key": "dc.contributor.advisor", "value": "Lehto, Martti", "language": "", "element": "contributor", "qualifier": "advisor", "schema": "dc"}, {"key": "dc.contributor.author", "value": "Peronius, Elina", "language": "", "element": "contributor", "qualifier": "author", "schema": "dc"}, {"key": "dc.date.accessioned", "value": "2020-06-05T08:10:29Z", "language": null, "element": "date", "qualifier": "accessioned", "schema": "dc"}, {"key": "dc.date.available", "value": "2020-06-05T08:10:29Z", "language": null, "element": "date", "qualifier": "available", "schema": "dc"}, {"key": "dc.date.issued", "value": "2020", "language": "", "element": "date", "qualifier": "issued", "schema": "dc"}, {"key": "dc.identifier.uri", "value": "https://jyx.jyu.fi/handle/123456789/69743", "language": null, "element": "identifier", "qualifier": "uri", "schema": "dc"}, {"key": "dc.description.abstract", "value": "Koneoppimisen ominaisuudet ovat tehneet monista sen menetelmist\u00e4 k\u00e4ytettyj\u00e4 hy\u00f6kk\u00e4ysten havaitsemisessa. Nykyinen kirjallisuus, joka k\u00e4sittelee koneoppimista hy\u00f6kk\u00e4ysten havaitsemissa, on vailla hyv\u00e4\u00e4 yleiskatsausta koko aihealueen kirjallisuuteen. Olemassa olevan datam\u00e4\u00e4r\u00e4n vuoksi perinteisten metodien k\u00e4ytt\u00f6 data analyysiss\u00e4 olisi ty\u00f6l\u00e4st\u00e4 ja tehotonta. T\u00e4m\u00e4 tutkimus l\u00e4hestyy haastetta k\u00e4ytt\u00e4m\u00e4ll\u00e4 automaattista tekstianalyysimenetelm\u00e4\u00e4 nimelt\u00e4 dynaaminen aihemallinnus. Dynaaminen aihemallinnus kykenee tunnistamaan aiheiden kehittymisen ajan my\u00f6t\u00e4, mik\u00e4 tekee siit\u00e4 hyv\u00e4n mallinnusvaihtoehdon k\u00e4ytett\u00e4v\u00e4ksi dokumentteihin, jotka kuvaava kehittyv\u00e4\u00e4 sis\u00e4lt\u00f6\u00e4. Dynaamisella aihemallinnuksella l\u00f6ydettiin 21 aihetta, joista 15 oli tulkittavia. Tulkittavat aiheet nimettiin, tosin nime\u00e4misess\u00e4 heijastuu vain yhden henkil\u00f6n mielipide. 
T\u00e4m\u00e4n tutkimuksen t\u00e4rkeimm\u00e4t tuotokset ovat nykyisen kirjallisuuden kartoitus. K\u00e4ytetyt koneoppimisen metodit ovat hyvin tutkittu alue, joka tekee niiden kontekstien, joissa n\u00e4it\u00e4 menetelmi\u00e4 k\u00e4ytet\u00e4\u00e4n tunnistamisesta mielenkiintoisemman osan l\u00f6yd\u00f6ksist\u00e4. Useita puutteita tunnistettiin datan ker\u00e4yksess\u00e4, datan prosessoinnissa, mallin evaluoinnissa ja aiheiden tulkinnassa. T\u00e4m\u00e4n vuoksi tulosten validiteetti pit\u00e4\u00e4 joissain m\u00e4\u00e4rin kyseenalaistaa. Valitun tekstianalyysimenetelm\u00e4n ominaisuuksien vuoksi tuloksista puuttuu rikkaus, joka yleens\u00e4 liitet\u00e4\u00e4n perinteisiin tutkimusmenetelmiin. T\u00e4m\u00e4n vuoksi lis\u00e4tutkimuksien aiheiksi ehdotetaan aiheita, jotka pyrkiv\u00e4t korjaamaan t\u00e4m\u00e4n puutoksen. T\u00e4lle aihealueelle l\u00f6ydettyjen aiheiden tulevaisuuden kehittyminen ja uusien aiheiden ilmaantumisen tunnistaminen olisivat my\u00f6s hy\u00f6dyllisi\u00e4.", "language": "fi", "element": "description", "qualifier": "abstract", "schema": "dc"}, {"key": "dc.description.abstract", "value": "Due to the traits of machine learning, many of its techniques are used in intrusion detection. Current literature of machine learning in intrusion detection lacks a good overview of the current research landscape. Due to the amount of existing data, using traditional methods to make sense of the literature would be laborious and ineffective. This study approaches the problem through using automated text analysis method called dynamic topic modelling. Dynamic topic modelling has the ability to capture the evolution of topics, which makes it a good modelling option to use on a document collection reflecting evolving content. Using the model, 21 topics were acquired, where 15 of them were deemed interpretable. Interpretable topics were labelled, though the labelling only reflects the opinion of one person. 
The main contribution of this study is the mapping of current research landscape. Used machine learning techniques is a well-studied area, which makes the identification of different contexts where machine learning techniques are applied in the more interesting part of the findings. Several limitations can be identified in data collection, data pre-processing, model evaluation and topic interpretation. This means that the validity of the results needs to be questioned to a degree. Due to the nature of the selected text analysis method, the results lack the richness often affiliated with traditional research methods. Due to this, suggestions of further research present topics which aim to combat this short falling. For this area of research, understanding of future evolution of topics and the identification of emerging topics would also be valuable.", "language": "en", "element": "description", "qualifier": "abstract", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Submitted by Paivi Vuorio (paelvuor@jyu.fi) on 2020-06-05T08:10:29Z\nNo. of bitstreams: 0", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Made available in DSpace on 2020-06-05T08:10:29Z (GMT). No. 
of bitstreams: 0\n Previous issue date: 2020", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.format.extent", "value": "53", "language": "", "element": "format", "qualifier": "extent", "schema": "dc"}, {"key": "dc.format.mimetype", "value": "application/pdf", "language": null, "element": "format", "qualifier": "mimetype", "schema": "dc"}, {"key": "dc.language.iso", "value": "eng", "language": null, "element": "language", "qualifier": "iso", "schema": "dc"}, {"key": "dc.rights", "value": "In Copyright", "language": "en", "element": "rights", "qualifier": null, "schema": "dc"}, {"key": "dc.subject.other", "value": "intrusion detection", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "topic modelling", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.title", "value": "Machine learning in intrusion detection : topics from scientific literature", "language": "", "element": "title", "qualifier": null, "schema": "dc"}, {"key": "dc.type", "value": "master thesis", "language": null, "element": "type", "qualifier": null, "schema": "dc"}, {"key": "dc.identifier.urn", "value": "URN:NBN:fi:jyu-202006054000", "language": "", "element": "identifier", "qualifier": "urn", "schema": "dc"}, {"key": "dc.type.ontasot", "value": "Pro gradu -tutkielma", "language": "fi", "element": "type", "qualifier": "ontasot", "schema": "dc"}, {"key": "dc.type.ontasot", "value": "Master\u2019s thesis", "language": "en", "element": "type", "qualifier": "ontasot", "schema": "dc"}, {"key": "dc.contributor.faculty", "value": "Informaatioteknologian tiedekunta", "language": "fi", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.faculty", "value": "Faculty of Information Technology", "language": "en", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.department", 
"value": "Informaatioteknologia", "language": "fi", "element": "contributor", "qualifier": "department", "schema": "dc"}, {"key": "dc.contributor.department", "value": "Information Technology", "language": "en", "element": "contributor", "qualifier": "department", "schema": "dc"}, {"key": "dc.contributor.organization", "value": "Jyv\u00e4skyl\u00e4n yliopisto", "language": "fi", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.contributor.organization", "value": "University of Jyv\u00e4skyl\u00e4", "language": "en", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.subject.discipline", "value": "Tietoj\u00e4rjestelm\u00e4tiede", "language": "fi", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "dc.subject.discipline", "value": "Information Systems Science", "language": "en", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "yvv.contractresearch.funding", "value": "0", "language": "", "element": "contractresearch", "qualifier": "funding", "schema": "yvv"}, {"key": "dc.type.coar", "value": "http://purl.org/coar/resource_type/c_bdcc", "language": null, "element": "type", "qualifier": "coar", "schema": "dc"}, {"key": "dc.rights.accesslevel", "value": "openAccess", "language": null, "element": "rights", "qualifier": "accesslevel", "schema": "dc"}, {"key": "dc.type.publication", "value": "masterThesis", "language": null, "element": "type", "qualifier": "publication", "schema": "dc"}, {"key": "dc.subject.oppiainekoodi", "value": "601", "language": "", "element": "subject", "qualifier": "oppiainekoodi", "schema": "dc"}, {"key": "dc.subject.yso", "value": "verkkohy\u00f6kk\u00e4ykset", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "koneoppiminen", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "cyber attacks", "language": null, 
"element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "machine learning", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.format.content", "value": "fulltext", "language": null, "element": "format", "qualifier": "content", "schema": "dc"}, {"key": "dc.rights.url", "value": "https://rightsstatements.org/page/InC/1.0/", "language": null, "element": "rights", "qualifier": "url", "schema": "dc"}, {"key": "dc.type.okm", "value": "G2", "language": null, "element": "type", "qualifier": "okm", "schema": "dc"}]
id jyx.123456789_69743
language eng
last_indexed 2025-02-18T10:56:16Z
main_date 2020-01-01T00:00:00Z
main_date_str 2020
online_boolean 1
online_urls_str_mv {"url":"https:\/\/jyx.jyu.fi\/bitstreams\/b67ceb23-8de9-439e-973a-77b2357fc00c\/download","text":"URN:NBN:fi:jyu-202006054000.pdf","source":"jyx","mediaType":"application\/pdf"}
publishDate 2020
record_format qdc
source_str_mv jyx
spellingShingle Peronius, Elina Machine learning in intrusion detection : topics from scientific literature intrusion detection topic modelling Tietojärjestelmätiede Information Systems Science 601 verkkohyökkäykset koneoppiminen cyber attacks machine learning
title Machine learning in intrusion detection : topics from scientific literature
title_full Machine learning in intrusion detection : topics from scientific literature
title_fullStr Machine learning in intrusion detection : topics from scientific literature
title_full_unstemmed Machine learning in intrusion detection : topics from scientific literature
title_short Machine learning in intrusion detection
title_sort machine learning in intrusion detection topics from scientific literature
title_sub topics from scientific literature
title_txtP Machine learning in intrusion detection : topics from scientific literature
topic intrusion detection topic modelling Tietojärjestelmätiede Information Systems Science 601 verkkohyökkäykset koneoppiminen cyber attacks machine learning
topic_facet 601 Information Systems Science Tietojärjestelmätiede cyber attacks intrusion detection koneoppiminen machine learning topic modelling verkkohyökkäykset
url https://jyx.jyu.fi/handle/123456789/69743 http://www.urn.fi/URN:NBN:fi:jyu-202006054000
work_keys_str_mv AT peroniuselina machinelearninginintrusiondetectiontopicsfromscientificliterature