Emotional speech from machine

Emotional speech is the expressiveness of speech, conveyed through changes in pitch, loudness, timbre, speech rate and pauses. Although current TTS technology can convert a given text into speech, the output sounds monotonous and lacks emotion and naturalness...

Bibliographic Details
Main Author: Amatya, Bipika
Other Authors: Faculty of Information Technology, Information Technology, University of Jyväskylä
Format: Master's thesis
Language: English
Published: 2020
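Subjects: Emotional Text to speech, IBM Watson Tone Analyser, emotional speech, EmotionML, SSML, MARY, IBM Watson Text to Speech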
Online Access: https://jyx.jyu.fi/handle/123456789/69368
Description: Emotional speech is the expressiveness of speech, conveyed through changes in pitch, loudness, timbre, speech rate and pauses. Although current TTS technology can convert a given text into speech, the output sounds monotonous and lacks emotion and naturalness. Adding emotion is therefore regarded as highly valuable for improving artificial voices. In this thesis, we create a system that uses a speech mark-up language to produce emotion in speech by analysing the tone of the given text. For this purpose, we combine the IBM tone analyser with a TTS engine that accepts the speech mark-up language. We perform an empirical study of two experimental implementations that use two TTS engines and two speech mark-up languages: the first combines IBM TTS with SSML, and the second combines MARY TTS with EmotionML. In EmotionML, the mark-ups are predefined for four major emotions, namely anger, fear, joy and sadness; for SSML, prosody values from a previous study are used. The study describes the two implementations and evaluates the emotional speech they synthesise, which is then compared with the human voice to assess its quality.
Extent: 50
File format: application/pdf
Rights: In Copyright (https://rightsstatements.org/page/InC/1.0/)
Access level: openAccess
Discipline: Mathematical Information Technology
YSO subjects: speech (phenomena), emotions, speech technology
URN: URN:NBN:fi:jyu-202006023626 (http://www.urn.fi/URN:NBN:fi:jyu-202006023626)
Full text: https://jyx.jyu.fi/bitstreams/86d8fbfb-dfde-47a8-983d-d240f9cb4e73/download
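
The pipeline outlined in the description (detect the dominant emotion of a text, then wrap the text in speech mark-up before passing it to a TTS engine) can be sketched roughly as follows. This is a minimal illustration only, not the thesis's implementation: the function names `pick_emotion` and `to_ssml` are hypothetical, the prosody settings in `PROSODY` are placeholder values rather than the values taken from the previous study, and the emotion scores are assumed to arrive as a plain dictionary instead of coming from a real tone-analysis service.

```python
from xml.sax.saxutils import escape

# Hypothetical prosody settings per emotion. These are placeholders for
# illustration, NOT the prosody values used in the thesis.
PROSODY = {
    "anger": {"pitch": "high", "rate": "fast"},
    "fear": {"pitch": "high", "rate": "x-fast"},
    "joy": {"pitch": "high", "rate": "medium"},
    "sadness": {"pitch": "low", "rate": "slow"},
}


def pick_emotion(scores):
    """Return the highest-scoring of the four emotions, or None if none is present.

    `scores` is assumed to be a dict such as {"joy": 0.2, "sadness": 0.7};
    a real tone analyser would return a richer structure.
    """
    known = {name: value for name, value in scores.items() if name in PROSODY}
    if not known:
        return None
    return max(known, key=known.get)


def to_ssml(text, emotion):
    """Wrap `text` in an SSML <prosody> element chosen by `emotion`.

    Unknown or missing emotions fall back to plain, unmarked speech.
    """
    body = escape(text)  # escape &, < and > so the mark-up stays well-formed
    settings = PROSODY.get(emotion)
    if settings is None:
        return f"<speak>{body}</speak>"
    pitch = settings["pitch"]
    rate = settings["rate"]
    return f'<speak><prosody pitch="{pitch}" rate="{rate}">{body}</prosody></speak>'


if __name__ == "__main__":
    scores = {"joy": 0.2, "sadness": 0.7}
    print(to_ssml("I lost my keys again.", pick_emotion(scores)))
```

The resulting string would then be sent to an SSML-capable TTS engine; an EmotionML-based variant would swap the `<prosody>` wrapper for the corresponding emotion mark-up.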