Data Privacy in the age of LLM-based services in Education: Current Challenges, Improvement Guidelines and Future Directions

The research objective of this Master's thesis is to clarify what privacy and data protection challenges, and what practices for addressing them, are seen now and in the future as generative AI is used in the education sector in Finland. Based on the earlier research and stud...

Full details

Bibliographic details
Main author:	Piri, Christina
Other authors:	Faculty of Information Technology, Informaatioteknologian tiedekunta, University of Jyväskylä, Jyväskylän yliopisto
Material type:	Master's thesis (pro gradu)
Language:	eng
Published:	2024
Subjects:	Master's Degree Programme in Information Systems Tietojärjestelmätieteen maisteriohjelma
Links:	https://jyx.jyu.fi/handle/123456789/97306

_version_	1833407622374490112
author	Piri, Christina
author2	Faculty of Information Technology Informaatioteknologian tiedekunta University of Jyväskylä Jyväskylän yliopisto
author_facet	Piri, Christina Faculty of Information Technology Informaatioteknologian tiedekunta University of Jyväskylä Jyväskylän yliopisto
author_sort	Piri, Christina
datasource_str_mv	jyx
description	The research objective of this Master's thesis is to clarify what privacy and data protection challenges, and what practices for addressing them, are seen now and in the future as generative AI is used in the education sector in Finland. Earlier research and this study's interview data point to a growing concern about how much sensitive personal information LLM-based applications and services collect and for what purposes these data are eventually used. It also remains to be seen to what extent current legislation can address the issues of collecting and processing personal data in the context of rapidly developing AI technology. This thesis aims to answer the research question: What guidelines and practices exist for enhancing individuals' privacy and data protection as the use of LLM-based applications becomes more common in the educational sector? Alongside the results of earlier research literature, empirical data was collected through semi-structured interviews and analyzed using qualitative content analysis as the research method. Several themes identified in earlier studies supported the results of the interviews, and new themes also emerged from the interview data. Concerns about sufficient data protection in the context of generative AI are realistic. The results of this study offer practices and guidelines for improving individuals' privacy and data protection in the educational sector. Continuous education for students and educators, together with the implementation of practices and guidelines, is needed to promote the responsible use of generative AI. AI developer organizations should safeguard users' personal data throughout service development, starting with designing and developing their services to comply with data protection legislation.
Since generative AI will keep developing, its impact on data privacy and data protection will remain significant in the future. The development of data protection regulation may therefore be essential for tackling the privacy challenges AI poses. Keywords: data privacy, data protection, artificial intelligence (AI), generative AI in education, large language models (LLMs), ChatGPT, Microsoft Copilot
Finnish abstract (in English translation): The research objective of this Master's thesis was to determine what kinds of data protection risks and related development opportunities are seen now and in the future as generative AI is increasingly used in the education sector in Finland. Based on earlier research findings and this study's interview data, there is concern about how much sensitive personal data applications and services based on large language models (LLMs) collect and for what purposes these data are ultimately used. It is also unclear to what extent current legislation can respond to the challenges of collecting and processing personal data in the context of generative AI. This thesis seeks to answer the following research question: what are the guidelines and practices for improving users' privacy and data protection as the use of LLM-based applications and services becomes more common in the education sector? The empirical research data was collected through semi-structured interviews, using qualitative content analysis as the research method. Themes that supported the interview results were identified from earlier studies and their findings. In addition, new themes emerged from the interview data. In summary, concerns about sufficient protection of users' privacy in the context of generative AI are realistic.
In response, the results of this study offer practices and guidelines for improving individuals' privacy and data protection in the education sector. Continuous training of students and teaching staff, together with the rollout of updated guidelines and practices, helps promote the responsible use of AI. AI developer organizations should be responsible for protecting users' personal data throughout the development process, starting with designing and developing the service in compliance with data protection legislation. As AI expands and develops, its effects on personal data protection will remain significant, so the development of data protection regulation may be essential for responding to the data protection challenges AI brings. Keywords: data protection, protection of personal data, artificial intelligence, generative AI in education, large language models (LLMs), ChatGPT, Microsoft Copilot
first_indexed	2024-10-01T20:00:30Z
format	Pro gradu
free_online_boolean	1
fullrecord	[{"key": "dc.contributor.advisor", "value": "Sepp\u00e4nen, Ville", "language": null, "element": "contributor", "qualifier": "advisor", "schema": "dc"}, {"key": "dc.contributor.author", "value": "Piri, Christina", "language": null, "element": "contributor", "qualifier": "author", "schema": "dc"}, {"key": "dc.date.accessioned", "value": "2024-09-30T13:20:09Z", "language": null, "element": "date", "qualifier": "accessioned", "schema": "dc"}, {"key": "dc.date.available", "value": "2024-09-30T13:20:09Z", "language": null, "element": "date", "qualifier": "available", "schema": "dc"}, {"key": "dc.date.issued", "value": "2024", "language": null, "element": "date", "qualifier": "issued", "schema": "dc"}, {"key": "dc.identifier.uri", "value": "https://jyx.jyu.fi/handle/123456789/97306", "language": null, "element": "identifier", "qualifier": "uri", "schema": "dc"}, {"key": "dc.description.abstract", "value": "The research objective of this Master's Thesis is to clarify what kind of privacy and data protection challenges and development practices for improving them are seen now and in the future while generative AI is utilized in the education sector in Finland. Based on the earlier research and studies alongside this study's interview data, a growing concern exists about how much sensitive personal information LLM-based applications and services collect and for what purposes these data are eventually used. It also remains to be seen to what extent the current legislation can address the issues concerning collecting and processing personal data in the context of rapidly developing AI technology. This thesis aims to answer the research question: What guidelines and practices exist for enhancing individuals' privacy and data protection as using LLM-based applications becomes more common in the educational sector? 
Alongside the results from earlier research literature, the empirical research data was collected through semi-structured interviews utilizing qualitative content analysis as a research theory in this study. Based on the results of earlier studies, several themes were recognized that supported the results of the interviews. In addition, new themes were brought up from the interview data. Concerns related to sufficient data protection in the context of generative AI are realistic. The results of this study offer practices and guidelines to improve individuals' privacy and data protection in the educational sector. It is necessary to highlight the importance of continuous education for students and educators and implement practices and guidelines to enhance the responsible use of generative AI. AI developer organizations may focus on safeguarding users' personal data throughout service development, starting from designing and developing their services to comply with data protection legislation. Since generative AI will keep developing, its impacts on data privacy and protection will also be significant in the future. Therefore, the development of data protection regulation may be essential to tackle the privacy challenges AI poses.\n\nKeywords: data privacy, data protection, artificial intelligence (AI), generative AI in education, large language models (LLMs), ChatGPT, Microsoft Copilot", "language": "en", "element": "description", "qualifier": "abstract", "schema": "dc"}, {"key": "dc.description.abstract", "value": "T\u00e4m\u00e4n pro gradu -tutkielman tutkimustavoitteena oli selvitt\u00e4\u00e4, millaisia tietosuojariskej\u00e4 ja niihin liittyvi\u00e4 kehitysmahdollisuuksia n\u00e4hd\u00e4\u00e4n nyt ja tulevaisuudessa, kun generatiivista teko\u00e4ly\u00e4 hy\u00f6dynnet\u00e4\u00e4n kasvavissa m\u00e4\u00e4rin koulutussektorilla Suomessa. 
Aiempien tutkimustulosten ja t\u00e4m\u00e4n tutkimuksen haastatteluaineiston perusteella her\u00e4\u00e4 huoli siit\u00e4, kuinka paljon eri arkaluonteista henkil\u00f6tietoa laajoihin kielimalleihin (LLM) pohjautuvat sovellukset ja palvelut ker\u00e4\u00e4v\u00e4t ja mihin tarkoituksiin n\u00e4it\u00e4 tietoja lopulta k\u00e4ytet\u00e4\u00e4n. Lis\u00e4ksi on ep\u00e4selv\u00e4\u00e4, miss\u00e4 m\u00e4\u00e4rin nykyinen lains\u00e4\u00e4d\u00e4nt\u00f6 pystyy vastaamaan henkil\u00f6tietojen ker\u00e4\u00e4miseen ja k\u00e4sittelyyn liittyviin haasteisiin generatiivisen teko\u00e4lyn kontekstissa. T\u00e4m\u00e4 tutkielma pyrkii vastaamaan seuraavaan tutkimuskysymykseen: mitk\u00e4 ovat ne ohjeistukset ja k\u00e4yt\u00e4nn\u00f6t k\u00e4ytt\u00e4jien yksityisyyden ja tietosuojan parantamiseksi, kun laajoihin kielimalleihin pohjautuvien sovellusten ja palveluiden k\u00e4ytt\u00f6 koulutussektorilla yleistyy tulevaisuudessa? Empiirinen tutkimusaineisto ker\u00e4ttiin puolistrukturoiduilla haastatteluilla, hy\u00f6dynt\u00e4en tutkimusmenetelm\u00e4n\u00e4 laadullista sis\u00e4ll\u00f6nanalyysi\u00e4. Aiempien tutkimusten ja niiden tulosten pohjalta tunnistettiin teemoja, jotka tukivat haastattelujen tuloksia. N\u00e4iden lis\u00e4ksi haastatteluaineistosta nousi esiin uusia teemoja. Yhteenvetona voidaan todeta, ett\u00e4 huolenaiheet k\u00e4ytt\u00e4jien riitt\u00e4v\u00e4st\u00e4 yksityisyyden suojasta generatiivisen teko\u00e4lyn kontekstissa on realistinen. Ratkaisuna t\u00e4h\u00e4n, t\u00e4m\u00e4n tutkimuksen tulokset tarjoavat k\u00e4yt\u00e4nt\u00f6j\u00e4 ja ohjeita henkil\u00f6iden yksityisyyden ja tietosuojan parantamiseen koulutussektorilla. Opiskelijoiden ja opetushenkil\u00f6kunnan jatkuva koulutus sek\u00e4 p\u00e4ivitettyjen ohjeiden ja k\u00e4yt\u00e4nt\u00f6jen jalkauttaminen osaltaan edist\u00e4v\u00e4t teko\u00e4lyn vastuullista k\u00e4ytt\u00f6\u00e4. 
Teko\u00e4lyn kehitt\u00e4j\u00e4organisaatioiden tulisi vastata k\u00e4ytt\u00e4jien henkil\u00f6tietojen suojaamisesta koko kehitysprosessin ajan alkaen siit\u00e4, ett\u00e4 palvelun suunnittelu ja kehitys toteutetaan tietosuojalains\u00e4\u00e4d\u00e4nn\u00f6n mukaisesti. Teko\u00e4lyn laajentuessa ja kehittyess\u00e4 sen vaikutukset henkil\u00f6tietosuojaan ovat jatkossakin merkitt\u00e4vi\u00e4, joten tietosuojas\u00e4\u00e4ntelyn kehitys voi olla olennaista, jotta voidaan vastata teko\u00e4lyn tuomiin tietosuoja haasteisiin.\n\nAvainsanat: tietosuoja, henkil\u00f6tietojen suoja, teko\u00e4ly, generatiivinen teko\u00e4ly koulutuksessa, suuret kielimallit (LLM), ChatGPT, Microsoft Copilot", "language": "fi", "element": "description", "qualifier": "abstract", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Submitted by jyx lomake-julkaisija (jyx-julkaisija.group@korppi.jyu.fi) on 2024-09-30T13:20:09Z\nNo. of bitstreams: 0", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Made available in DSpace on 2024-09-30T13:20:09Z (GMT). No. 
of bitstreams: 0", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.format.extent", "value": "72", "language": null, "element": "format", "qualifier": "extent", "schema": "dc"}, {"key": "dc.format.mimetype", "value": "application/pdf", "language": null, "element": "format", "qualifier": "mimetype", "schema": "dc"}, {"key": "dc.language.iso", "value": "eng", "language": null, "element": "language", "qualifier": "iso", "schema": "dc"}, {"key": "dc.rights", "value": "CC BY-NC-ND 4.0", "language": "en", "element": "rights", "qualifier": null, "schema": "dc"}, {"key": "dc.title", "value": "Data Privacy in the age of LLM-based services in Education: Current Challenges, Improvement Guidelines and Future Directions", "language": null, "element": "title", "qualifier": null, "schema": "dc"}, {"key": "dc.type", "value": "master thesis", "language": null, "element": "type", "qualifier": null, "schema": "dc"}, {"key": "dc.identifier.urn", "value": "URN:NBN:fi:jyu-202409306177", "language": null, "element": "identifier", "qualifier": "urn", "schema": "dc"}, {"key": "dc.contributor.faculty", "value": "Faculty of Information Technology", "language": "en", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.faculty", "value": "Informaatioteknologian tiedekunta", "language": "fi", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.organization", "value": "University of Jyv\u00e4skyl\u00e4", "language": "en", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.contributor.organization", "value": "Jyv\u00e4skyl\u00e4n yliopisto", "language": "fi", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.subject.discipline", "value": "Master's Degree Programme in Information Systems", "language": "en", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "dc.subject.discipline", 
"value": "Tietoj\u00e4rjestelm\u00e4tieteen maisteriohjelma", "language": "fi", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "dc.type.coar", "value": "http://purl.org/coar/resource_type/c_bdcc", "language": null, "element": "type", "qualifier": "coar", "schema": "dc"}, {"key": "dc.rights.copyright", "value": "\u00a9 The Author(s)", "language": null, "element": "rights", "qualifier": "copyright", "schema": "dc"}, {"key": "dc.rights.accesslevel", "value": "openAccess", "language": null, "element": "rights", "qualifier": "accesslevel", "schema": "dc"}, {"key": "dc.type.publication", "value": "masterThesis", "language": null, "element": "type", "qualifier": "publication", "schema": "dc"}, {"key": "dc.format.content", "value": "fulltext", "language": null, "element": "format", "qualifier": "content", "schema": "dc"}, {"key": "dc.rights.url", "value": "https://creativecommons.org/licenses/by-nc-nd/4.0/", "language": null, "element": "rights", "qualifier": "url", "schema": "dc"}, {"key": "dc.description.accessibilityfeature", "value": "unknown accessibility", "language": "en", "element": "description", "qualifier": "accessibilityfeature", "schema": "dc"}, {"key": "dc.description.accessibilityfeature", "value": "ei tietoa saavutettavuudesta", "language": "fi", "element": "description", "qualifier": "accessibilityfeature", "schema": "dc"}]
id	jyx.123456789_97306
language	eng
last_indexed	2025-05-21T20:06:40Z
main_date	2024-01-01T00:00:00Z
main_date_str	2024
online_boolean	1
online_urls_str_mv	{"url":"https:\/\/jyx.jyu.fi\/bitstreams\/c252de19-7158-469a-a5ca-03d486ff65da\/download","text":"URN:NBN:fi:jyu-202409306177.pdf","source":"jyx","mediaType":"application\/pdf"}
publishDate	2024
record_format	qdc
source_str_mv	jyx
spellingShingle	Piri, Christina Data Privacy in the age of LLM-based services in Education: Current Challenges, Improvement Guidelines and Future Directions Master's Degree Programme in Information Systems Tietojärjestelmätieteen maisteriohjelma
title	Data Privacy in the age of LLM-based services in Education: Current Challenges, Improvement Guidelines and Future Directions
title_full	Data Privacy in the age of LLM-based services in Education: Current Challenges, Improvement Guidelines and Future Directions
title_fullStr	Data Privacy in the age of LLM-based services in Education: Current Challenges, Improvement Guidelines and Future Directions
title_full_unstemmed	Data Privacy in the age of LLM-based services in Education: Current Challenges, Improvement Guidelines and Future Directions
title_short	Data Privacy in the age of LLM-based services in Education: Current Challenges, Improvement Guidelines and Future Directions
title_sort	data privacy in the age of llm based services in education current challenges improvement guidelines and future directions
title_txtP	Data Privacy in the age of LLM-based services in Education: Current Challenges, Improvement Guidelines and Future Directions
topic	Master's Degree Programme in Information Systems Tietojärjestelmätieteen maisteriohjelma
topic_facet	Master's Degree Programme in Information Systems Tietojärjestelmätieteen maisteriohjelma
url	https://jyx.jyu.fi/handle/123456789/97306 http://www.urn.fi/URN:NBN:fi:jyu-202409306177
work_keys_str_mv	AT pirichristina dataprivacyintheageofllmbasedservicesineducationcurrentchallengesimprovementguidel
