Real-time sentiment analysis of Twitter public stream

Sentiment analysis of the Twitter public stream has recently been a topic of research. Several non-commercial libraries and software packages have been developed to perform sentiment analysis; however, none of them performs the analysis on Twitter data in real time. Performing the same task in real time can give u...

Full details

Bibliographic details
Main author: Akhavan Rahnama, Amir
Other authors: Informaatioteknologian tiedekunta, Faculty of Information Technology, Tietotekniikan laitos, Department of Mathematical Information Technology, University of Jyväskylä, Jyväskylän yliopisto
Material type: Master's thesis (Pro gradu)
Language: eng
Published: 2015
Subjects:
Links: https://jyx.jyu.fi/handle/123456789/45352
_version_ 1826225756560162816
author Akhavan Rahnama, Amir
author2 Informaatioteknologian tiedekunta Faculty of Information Technology Tietotekniikan laitos Department of Mathematical Information Technology University of Jyväskylä Jyväskylän yliopisto
author_facet Akhavan Rahnama, Amir Informaatioteknologian tiedekunta Faculty of Information Technology Tietotekniikan laitos Department of Mathematical Information Technology University of Jyväskylä Jyväskylän yliopisto
author_sort Akhavan Rahnama, Amir
datasource_str_mv jyx
description Sentiment analysis of the Twitter public stream has recently been a topic of research. Several non-commercial libraries and software packages have been developed to perform sentiment analysis; however, none of them performs the analysis on Twitter data in real time. Performing the same task in real time can give us insight into Twitter users' public opinions on events that are recent at the time of the analysis. In this thesis, we propose a full-stack architecture, with a software prototype, that performs real-time sentiment analysis on the Twitter public stream. We address the problem using large-scale online learning, specifically online parallel decision trees. Large-scale learning is used because social media websites such as Twitter produce data at high volume (around 5800 tweets per second in 2014) and because real-time analytics imposes a tight time constraint (up to seconds) on learning, processing, and query response time. Moreover, Twitter stream data arrives instance by instance, so we use online learning, which offers incremental, per-instance learning. SAMOA is a framework that supports a set of scalable online learning algorithms, such as the Vertical Hoeffding Tree (VHT). We use SAMOA's VHT learner with Apache Storm as our stream processing engine. However, VHT and Apache Storm alone cannot solve the problem at hand. Therefore, we also developed an open-source Java library called Sentinel, which completes our architecture by providing real-time Twitter stream reading, in-memory pre-processing computations and data structures, feature selection, frequent miner algorithms, and more. Chapter 3 presents the architecture of our solution, and Chapter 4 demonstrates its applicability and usefulness.
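The description above relies on Hoeffding trees for per-instance learning. As a rough illustration only, the following Java sketch shows the standard Hoeffding bound that such trees use to decide, after seeing n instances, whether the best split attribute is reliably better than the runner-up. It is not SAMOA's or Sentinel's API: the class name, method names, and the sample gain values are hypothetical and chosen purely for illustration.

```java
public class HoeffdingBoundSketch {

    // Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2n)),
    // where R is the range of the split metric, delta the allowed
    // error probability, and n the number of instances seen so far.
    public static double hoeffdingBound(double range, double delta, long n) {
        return Math.sqrt(range * range * Math.log(1.0 / delta) / (2.0 * n));
    }

    // Split once the observed gap between the best and second-best
    // attribute exceeds the bound.
    public static boolean shouldSplit(double bestGain, double secondGain,
                                      double range, double delta, long n) {
        return bestGain - secondGain > hoeffdingBound(range, delta, n);
    }

    public static void main(String[] args) {
        double range = 1.0;   // assumed: information gain with two classes (log2(2) = 1)
        double delta = 1e-7;  // assumed confidence parameter
        double bestGain = 0.30;    // hypothetical observed gains
        double secondGain = 0.20;

        for (long n : new long[] {100, 1000, 10000}) {
            System.out.printf("n=%d epsilon=%.4f split=%b%n",
                    n,
                    hoeffdingBound(range, delta, n),
                    shouldSplit(bestGain, secondGain, range, delta, n));
        }
    }
}
```

The bound shrinks as more tweets arrive, so a leaf can commit to a split after a finite number of instances without storing or revisiting the stream, which is what makes this family of learners a plausible fit for instance-by-instance data.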
first_indexed 2024-09-11T08:51:45Z
format Master's thesis (Pro gradu)
free_online_boolean 1
fullrecord [{"key": "dc.contributor.advisor", "value": "Veijalainen, Jari", "language": "", "element": "contributor", "qualifier": "advisor", "schema": "dc"}, {"key": "dc.contributor.author", "value": "Akhavan Rahnama, Amir", "language": null, "element": "contributor", "qualifier": "author", "schema": "dc"}, {"key": "dc.date.accessioned", "value": "2015-02-18T08:55:59Z", "language": "", "element": "date", "qualifier": "accessioned", "schema": "dc"}, {"key": "dc.date.available", "value": "2015-02-18T08:55:59Z", "language": "", "element": "date", "qualifier": "available", "schema": "dc"}, {"key": "dc.date.issued", "value": "2015", "language": null, "element": "date", "qualifier": "issued", "schema": "dc"}, {"key": "dc.identifier.other", "value": "oai:jykdok.linneanet.fi:1466490", "language": null, "element": "identifier", "qualifier": "other", "schema": "dc"}, {"key": "dc.identifier.uri", "value": "https://jyx.jyu.fi/handle/123456789/45352", "language": "", "element": "identifier", "qualifier": "uri", "schema": "dc"}, {"key": "dc.description.abstract", "value": "Sentiment analysis on Twitter public stream has been a topic of research recently. Several non-commercial libraries and software were developed to perform sentiment analysis, however none of them performed the analytics in real-time for Twitter data. Performing the same task in real-time can gives us insight of Twitter users public opinions regarding recent happenings of the time that analysis was made. In this thesis work, we propose a full-stack architecture with a software prototype that performs real- time sentiment analysis on Twitter public stream. We address the problem using large- scale online learning and specifically online parallel decision trees. 
Large-scale learning is utilized due to the fact that social media website such as Twitter produce data with high volume (around 5800 tweets per second in 2014) and in addition, there is a high time constraint (up to seconds) in real-time analytics in both learning, processing and query response time. Moreover, Twitter stream data arrives instance-by-instance and therefore we have utilized online learning with incremental and per-instance learning flexibility. SAMOA is a framework that provides support for a set of scalable online learning algorithms such as Vertical Hoeffding Tree. We use SAMOA\u2019s VHT learner with Apache Storm as our Stream Processing Engine. However, utilizing only VHT and Apache Storm cannot solve the problem at hand. Therefore, we also developed an open- source Java library called Sentinel that enables real-time Twitter stream reading, in- memory pre-processing computations and data structures, feature selection, frequent miner algorithms and etc. that completes our architecture. In Chapter 3, we show the architecture of our solution and its applicability and usefulness is shown in chapter 4.", "language": "en", "element": "description", "qualifier": "abstract", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Submitted using Plone Publishing form by Amir AkhavanRahnama (amhoakha) on 2015-02-18 08:55:56.674790. Form: Master's Thesis publishing form (https://kirjasto.jyu.fi/publish-and-buy/publishing-forms/masters-thesis-publishing-form). JyX data: [jyx_publishing-allowed (fi) =True]", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Submitted by jyx lomake-julkaisija (jyx-julkaisija@noreply.fi) on 2015-02-18T08:55:59Z\r\nNo. 
of bitstreams: 2\r\nURN:NBN:fi:jyu-201502181337.pdf: 6865766 bytes, checksum: 17292900563d81268d594618ad81eff4 (MD5)\r\nlicense.html: 4283 bytes, checksum: 287da11561cbdc3f12c93d24c6608f40 (MD5)", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.description.provenance", "value": "Made available in DSpace on 2015-02-18T08:55:59Z (GMT). No. of bitstreams: 2\r\nURN:NBN:fi:jyu-201502181337.pdf: 6865766 bytes, checksum: 17292900563d81268d594618ad81eff4 (MD5)\r\nlicense.html: 4283 bytes, checksum: 287da11561cbdc3f12c93d24c6608f40 (MD5)\r\n Previous issue date: 2015", "language": "en", "element": "description", "qualifier": "provenance", "schema": "dc"}, {"key": "dc.format.extent", "value": "1 verkkoaineisto (62 sivua)", "language": null, "element": "format", "qualifier": "extent", "schema": "dc"}, {"key": "dc.format.mimetype", "value": "application/pdf", "language": null, "element": "format", "qualifier": "mimetype", "schema": "dc"}, {"key": "dc.language.iso", "value": "eng", "language": null, "element": "language", "qualifier": "iso", "schema": "dc"}, {"key": "dc.rights", "value": "In Copyright", "language": "en", "element": "rights", "qualifier": null, "schema": "dc"}, {"key": "dc.subject.other", "value": "sentiment analysis", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "real-time analytics", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "social media mining", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "twitter", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "large- scale learning", "language": "", "element": "subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.subject.other", "value": "parallel decision tree", "language": "", "element": 
"subject", "qualifier": "other", "schema": "dc"}, {"key": "dc.title", "value": "Real-time sentiment analysis of Twitter public stream", "language": null, "element": "title", "qualifier": null, "schema": "dc"}, {"key": "dc.type", "value": "master thesis", "language": null, "element": "type", "qualifier": null, "schema": "dc"}, {"key": "dc.identifier.urn", "value": "URN:NBN:fi:jyu-201502181337", "language": null, "element": "identifier", "qualifier": "urn", "schema": "dc"}, {"key": "dc.type.ontasot", "value": "Pro gradu -tutkielma", "language": "fi", "element": "type", "qualifier": "ontasot", "schema": "dc"}, {"key": "dc.type.ontasot", "value": "Master\u2019s thesis", "language": "en", "element": "type", "qualifier": "ontasot", "schema": "dc"}, {"key": "dc.contributor.faculty", "value": "Informaatioteknologian tiedekunta", "language": "fi", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.faculty", "value": "Faculty of Information Technology", "language": "en", "element": "contributor", "qualifier": "faculty", "schema": "dc"}, {"key": "dc.contributor.department", "value": "Tietotekniikan laitos", "language": "fi", "element": "contributor", "qualifier": "department", "schema": "dc"}, {"key": "dc.contributor.department", "value": "Department of Mathematical Information Technology", "language": "en", "element": "contributor", "qualifier": "department", "schema": "dc"}, {"key": "dc.contributor.organization", "value": "University of Jyv\u00e4skyl\u00e4", "language": "en", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.contributor.organization", "value": "Jyv\u00e4skyl\u00e4n yliopisto", "language": "fi", "element": "contributor", "qualifier": "organization", "schema": "dc"}, {"key": "dc.subject.discipline", "value": "Tietotekniikka", "language": "fi", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "dc.subject.discipline", "value": "Mathematical Information 
Technology", "language": "en", "element": "subject", "qualifier": "discipline", "schema": "dc"}, {"key": "dc.date.updated", "value": "2015-02-18T08:56:00Z", "language": "", "element": "date", "qualifier": "updated", "schema": "dc"}, {"key": "yvv.contractresearch.collaborator", "value": "finance", "language": "", "element": "contractresearch", "qualifier": "collaborator", "schema": "yvv"}, {"key": "yvv.contractresearch.funding", "value": "0", "language": "", "element": "contractresearch", "qualifier": "funding", "schema": "yvv"}, {"key": "yvv.contractresearch.initiative", "value": "student", "language": "", "element": "contractresearch", "qualifier": "initiative", "schema": "yvv"}, {"key": "dc.type.coar", "value": "http://purl.org/coar/resource_type/c_bdcc", "language": null, "element": "type", "qualifier": "coar", "schema": "dc"}, {"key": "dc.rights.accesslevel", "value": "openAccess", "language": null, "element": "rights", "qualifier": "accesslevel", "schema": "dc"}, {"key": "dc.type.publication", "value": "masterThesis", "language": null, "element": "type", "qualifier": "publication", "schema": "dc"}, {"key": "dc.subject.oppiainekoodi", "value": "602", "language": null, "element": "subject", "qualifier": "oppiainekoodi", "schema": "dc"}, {"key": "dc.subject.yso", "value": "Twitter", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "sosiaalinen media", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.subject.yso", "value": "tiedonlouhinta", "language": null, "element": "subject", "qualifier": "yso", "schema": "dc"}, {"key": "dc.format.content", "value": "fulltext", "language": null, "element": "format", "qualifier": "content", "schema": "dc"}, {"key": "dc.rights.url", "value": "https://rightsstatements.org/page/InC/1.0/", "language": null, "element": "rights", "qualifier": "url", "schema": "dc"}, {"key": "dc.type.okm", "value": "G2", "language": null, "element": 
"type", "qualifier": "okm", "schema": "dc"}]
id jyx.123456789_45352
language eng
last_indexed 2025-02-18T10:56:41Z
main_date 2015-01-01T00:00:00Z
main_date_str 2015
online_boolean 1
online_urls_str_mv {"url":"https:\/\/jyx.jyu.fi\/bitstreams\/e151ce73-8016-479b-a35f-157307fa5f91\/download","text":"URN:NBN:fi:jyu-201502181337.pdf","source":"jyx","mediaType":"application\/pdf"}
publishDate 2015
record_format qdc
source_str_mv jyx
title Real-time sentiment analysis of Twitter public stream
topic sentiment analysis; real-time analytics; social media mining; twitter; large-scale learning; parallel decision tree; Tietotekniikka (information technology); Mathematical Information Technology; 602; Twitter; sosiaalinen media (social media); tiedonlouhinta (data mining)
topic_facet 602; Mathematical Information Technology; Tietotekniikka (information technology); Twitter; large-scale learning; parallel decision tree; real-time analytics; sentiment analysis; social media mining; sosiaalinen media (social media); tiedonlouhinta (data mining); twitter
url https://jyx.jyu.fi/handle/123456789/45352 http://www.urn.fi/URN:NBN:fi:jyu-201502181337
work_keys_str_mv AT akhavanrahnamaamir realtimesentimentanalysisoftwitterpublicstream