Study of various machine learning approaches to predict default behavior of a borrower based on transactional dataset

Predicting the ‘default’ behavior of borrowers is challenging and time-consuming, yet financial institutions need faster and more reliable decisions on loan applications to survive in a competitive market. Availability of huge amount of data makes the work of current credit scoring system...


Bibliographic Details
Main Author: Hossain, Mohammad Farhad
Other Authors: Informaatioteknologian tiedekunta, Faculty of Information Technology, Informaatioteknologia, Information Technology, Jyväskylän yliopisto, University of Jyväskylä
Format: Master's thesis
Language: English
Published: 2021
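Subjects: deep learning; credit scoring; transaction data; default behavior; loan application; machine learning; financial institutions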
Online Access: https://jyx.jyu.fi/handle/123456789/75073
_version_ 1826225737932210176
datasource_str_mv jyx
description Predicting the ‘default’ behavior of borrowers is challenging and time-consuming, yet financial institutions need faster and more reliable decisions on loan applications to survive in a competitive market. The availability of huge amounts of data makes the work of current credit scoring systems harder. To deal with this situation, machine learning engineers are trying to build systems that can predict the default behavior of a borrower by analyzing application and transaction data. In this study, we applied different machine learning models, such as decision tree, logistic regression, gradient boosting, XGBoost, support vector machine, and KNeighbors (k-nearest neighbors), to a transactional dataset to find which model performed better. We also applied a deep neural network to the datasets. To extend the study further, we created new features both manually and with unsupervised machine learning to observe whether they boost performance. In addition, we used feature selection to see how it affected the predictions. Due to the small dataset, we achieved 70% accuracy with 72% AUC on the aggregated dataset using Random Forest. The dataset created using unsupervised machine learning showed 62% accuracy with 68% AUC. Manually created ratio-based features and feature selection did not yield any significant difference in the results. Deep learning also performed worse than the other models, probably because of the small dataset.
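The accuracy and AUC figures quoted in the description can be illustrated with a minimal sketch. The Python below is not taken from the thesis: the labels, scores, the 0.5 decision threshold, and the function names are all hypothetical, and the code only shows how these two metrics are commonly computed for a binary default / non-default classifier.

```python
def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen defaulter (label 1)
    gets a higher score than a randomly chosen non-defaulter (label 0).
    Tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = borrower defaulted, 0 = borrower repaid.
y_true = [1, 0, 1, 0, 0, 1, 0, 1]
# Hypothetical model scores (estimated probability of default).
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.8, 0.1, 0.7]
# Assumed decision threshold of 0.5 to turn scores into labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"AUC      = {roc_auc(y_true, scores):.3f}")
```

Accuracy depends on the chosen threshold, whereas AUC is computed from the ranking of the scores alone, which is one reason the two numbers are usually reported together.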
first_indexed 2021-04-16T20:01:26Z
format Pro gradu (Master's thesis)
free_online_boolean 1
fullrecord
dc.contributor.advisor Khriyenko, Oleksiy
dc.contributor.advisor Karimova, Rahima
dc.contributor.advisor Fredström, Ashkan
dc.contributor.author Hossain, Mohammad Farhad
dc.date.accessioned 2021-04-16T05:25:03Z
dc.date.available 2021-04-16T05:25:03Z
dc.date.issued 2021
dc.identifier.uri https://jyx.jyu.fi/handle/123456789/75073
dc.description.provenance Submitted by Paivi Vuorio (paelvuor@jyu.fi) on 2021-04-16T05:25:03Z. No. of bitstreams: 0
dc.description.provenance Made available in DSpace on 2021-04-16T05:25:03Z (GMT). No. of bitstreams: 0. Previous issue date: 2021
dc.format.extent 56
dc.format.mimetype application/pdf
dc.language.iso eng
dc.rights In Copyright
dc.subject.other deep learning; credit scoring; transaction data; default behavior; loan application
dc.title Study of various machine learning approaches to predict default behavior of a borrower based on transactional dataset
dc.type master thesis
dc.identifier.urn URN:NBN:fi:jyu-202104162383
dc.type.ontasot Pro gradu -tutkielma (fi); Master’s thesis (en)
dc.contributor.faculty Informaatioteknologian tiedekunta (fi); Faculty of Information Technology (en)
dc.contributor.department Informaatioteknologia (fi); Information Technology (en)
dc.contributor.organization Jyväskylän yliopisto (fi); University of Jyväskylä (en)
dc.subject.discipline Tietotekniikka (fi); Mathematical Information Technology (en)
yvv.contractresearch.collaborator finance
yvv.contractresearch.funding 0
yvv.contractresearch.initiative business
dc.type.coar http://purl.org/coar/resource_type/c_bdcc
dc.rights.accesslevel openAccess
dc.type.publication masterThesis
dc.subject.oppiainekoodi 602
dc.subject.yso koneoppiminen; rahoituslaitokset; machine learning; financial institutions
dc.format.content fulltext
dc.rights.url https://rightsstatements.org/page/InC/1.0/
dc.type.okm G2
id jyx.123456789_75073
language eng
last_indexed 2025-02-18T10:54:05Z
main_date 2021-01-01T00:00:00Z
main_date_str 2021
online_boolean 1
online_urls_str_mv {"url":"https:\/\/jyx.jyu.fi\/bitstreams\/693bd12a-3b38-46ff-ba26-9b3e1d3a942e\/download","text":"URN:NBN:fi:jyu-202104162383.pdf","source":"jyx","mediaType":"application\/pdf"}
publishDate 2021
record_format qdc
source_str_mv jyx
title Study of various machine learning approaches to predict default behavior of a borrower based on transactional dataset
topic deep learning; credit scoring; transaction data; default behavior; loan application; Tietotekniikka; Mathematical Information Technology; 602; koneoppiminen; rahoituslaitokset; machine learning; financial institutions
url https://jyx.jyu.fi/handle/123456789/75073 http://www.urn.fi/URN:NBN:fi:jyu-202104162383