GOKCE IYMEN | Koç University

GOKCE IYMEN

Papers by GOKCE IYMEN

An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era

arXiv (Cornell University), Oct 6, 2022

Speech is the fundamental mode of human communication, and its synthesis has long been a core priority in human-computer interaction research. In recent years, machines have managed to master the art of generating speech that is understandable by humans. But the linguistic content of an utterance encompasses only a part of its meaning. Affect, or expressivity, has the capacity to turn speech into a medium capable of conveying intimate thoughts, feelings, and emotions, aspects that are essential for engaging and naturalistic interpersonal communication. While the goal of imparting expressivity to synthesised utterances has so far remained elusive, following recent advances in text-to-speech synthesis, a paradigm shift is well under way in the fields of affective speech synthesis and conversion as well. Deep learning, as the technology which underlies most of the recent advances in artificial intelligence, is spearheading these efforts. In the present overview, we outline ongoing trends and summarise state-of-the-art approaches in an attempt to provide a comprehensive overview of this exciting field.

An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era

Proceedings of the IEEE

Speech is the fundamental mode of human communication, and its synthesis has long been a core priority in human-computer interaction research. In recent years, machines have managed to master the art of generating speech that is understandable by humans. But the linguistic content of an utterance encompasses only a part of its meaning. Affect, or expressivity, has the capacity to turn speech into a medium capable of conveying intimate thoughts, feelings, and emotions, aspects that are essential for engaging and naturalistic interpersonal communication. While the goal of imparting expressivity to synthesised utterances has so far remained elusive, following recent advances in text-to-speech synthesis, a paradigm shift is well under way in the fields of affective speech synthesis and conversion as well. Deep learning, as the technology which underlies most of the recent advances in artificial intelligence, is spearheading these efforts. In the present overview, we outline ongoing trends and summarise state-of-the-art approaches in an attempt to provide a comprehensive overview of this exciting field.

Selected Applications of Generative Adversarial Networks: Mini Review

COJ Robotics & Artificial Intelligence, Aug 6, 2020

Generative adversarial networks (GANs) have become increasingly popular since they were first introduced in 2014. Many variants of GANs have been developed over the years and employed in a range of applications from computer vision to audio generation and medical imaging. As their applications in computer vision have been widely explored by the artificial intelligence community, here we focus on more specific applications of GANs, namely audio generation and medical image synthesis. In the age of big data, these two fields still struggle with the scarcity of labelled data; hence, they benefit greatly from the capabilities of GANs.

Artificial intelligence-based identification of butter variations as a model study for detecting food adulteration

Innovative Food Science & Emerging Technologies
