Self supervised learning for robust voice cloning

Voice Cloning Using Transfer Learning with Audio Samples

Usman Nawaz, Usman Ahmed Raza

UMT Artificial Intelligence Review (UMT-AIR), 2023

Voice Cloning: a Multi-Speaker Text-to-Speech Synthesis Approach based on Transfer Learning

Vincent Pollet

ArXiv, 2021

Autotuned voice cloning enabling multilingualism

IRJET Journal

Speaker verification-derived loss and data augmentation for DNN-based multispeaker speech synthesis

Mircea Giurgiu

2021 29th European Signal Processing Conference (EUSIPCO), 2021

Multi-speaker TTS with Deep Learning

Ivan Carapinha

2020

Preliminary experiments toward automatic generation of new TTS voices from recorded speech alone

Masafumi Nishimura

Interspeech 2007, 2007

Data Efficient Voice Cloning from Noisy Samples with Domain Adversarial Training

Guanglu Wan

2020

Waveform-Based Speaker Representations for Speech Synthesis

Moquan Wan

Interspeech 2018

Introducing Prosodic Speaker Identity for a Better Expressive Speech Synthesis Control

Aghilas SINI

10th International Conference on Speech Prosody 2020, 2020

Combining Statistical Parameteric Speech Synthesis and Unit-Selection for Automatic Voice Cloning

Matthew Aylett

ITAcotron 2: the Power of Transfer Learning in Expressive TTS Synthesis

Roberto Tedesco

Analysis and Application of Natural Language and Speech Processing, 2023

Hearing Faces: Target Speaker Text-to-Speech Synthesis from a Face

Leyuan Qu

2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)

Integrated speaker-adaptive speech synthesis

Moquan Wan

2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)

Development of a Genre-Dependent TTS System with Cross-Speaker Speaking-Style Transplantation

Ruben Gonzalo Hernandez

Learning Speaker Embedding from Text-to-Speech

Jesus Villalba

Interspeech 2020, 2020

An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis

Mircea Giurgiu

Procedia Computer Science, 2021

Learning Robust Latent Representations for Controllable Speech Synthesis

Jithin Pradeep

Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 2021

Self-Supervised Speaker Embeddings

Themos Stafylakis

Interspeech 2019

High Fidelity Speech Regeneration with Application to Speech Enhancement

Yossi Adi

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Code-Switching Speech Synthesis Based on Self-Supervised Learning and Domain Adaptive Speaker Encoder

Bima Prihasto

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Multi-speaker Multi-style Text-to-speech Synthesis With Single-speaker Single-style Training Data Scenarios

Guanglu Wan

ArXiv, 2021

Karaoker: Alignment-free singing voice synthesis with speech training data

Panos Kakoulidis, Gunu Jho

Interspeech 2022

ASVspoof 2019: a large-scale public database of synthetic, converted and replayed speech

Yu Tsao

2020

Explicit Prosodic Modelling and Deep Speaker Embedding Learning for Non-standard Voice Conversion

Helen Meng

arXiv: Audio and Speech Processing, 2020

Voice Cloning Applied to Voice Disorders: a Study of Extreme Phonetic Content in Speaker Embeddings

Damien Lolive

Proceedings of the Canadian Conference on Artificial Intelligence

Principal Style Components: Expressive Style Control and Cross-Speaker Transfer in Neural TTS

Ron Hoory

Interspeech 2020, 2020

The voice synthesis business: 2022 update

Robert Dale

Natural Language Engineering

Improving multi-speaker TTS prosody variance with a residual encoder and normalizing flows

Roberto Barra Chicote

ArXiv, 2021

SYNTACC : Synthesizing Multi-Accent Speech By Weight Factorization

Alexander Waibel

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech

Tomi Kinnunen

2020

Phone-Level Embeddings for Unit Selection Speech Synthesis

Damien Lolive

Statistical Language and Speech Processing, 2018
