Miroslava Mrvova - Profile on Academia.edu
Papers by Miroslava Mrvova
Communications - Scientific letters of the University of Zilina, 2011
This contribution deals with the quality of synthesized speech. It introduces the principles and approaches used to create this type of speech and the basic methods and techniques used to assess its quality. The article also offers a short overview of relevant experimental studies on this kind of speech and its quality assessment. Finally, it investigates the effect of the newest coding approaches (e.g. Speex, iLBC, EVRC-B) on the quality of naturally produced speech and synthesized speech (generated by diphone and unit-selection synthesizers), as predicted by two different objective models and as measured in subjective tests.
Communications - Scientific letters of the University of Zilina, 2014
Acta Acustica united with Acustica, 2009
In this work, we experimentally study how the behaviour of PESQ predictions varies with a reference signal characteristic. In particular, we investigate the impact of different Active-Speech-Ratios on speech quality prediction in a simulated VoIP environment from both the objective and the subjective testing point of view. This reference signal characteristic is defined very broadly by ITU-T Recommendation P.862.3, which is the reason to investigate its impact on speech quality prediction in more depth. We assess the variability of PESQ's predictions with respect to Active-Speech-Ratio and network conditions, as well as their accuracy, by comparing the predictions with subjective assessments.
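To make the Active-Speech-Ratio concrete, the following is a minimal sketch that estimates it as the fraction of frames carrying speech energy. It is not the procedure used in the paper and not the ITU-T active-speech measurement method; the function name `active_speech_ratio`, the 160-sample frame length and the -40 dB threshold relative to the loudest frame are all illustrative assumptions.

```python
import math


def active_speech_ratio(samples, frame_len=160, threshold_db=-40.0):
    """Estimate the share of active frames with a simple energy detector.

    Assumption: a frame counts as active when its RMS level is within
    threshold_db of the loudest frame in the signal.
    """
    # Split the signal into non-overlapping frames, dropping a short tail.
    frames = [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    if not frames:
        return 0.0

    # RMS level of every frame.
    rms = [math.sqrt(sum(s * s for s in f) / frame_len) for f in frames]
    peak = max(rms)
    if peak == 0.0:
        return 0.0  # pure silence

    # Convert the relative dB threshold into a linear RMS floor.
    floor = peak * 10 ** (threshold_db / 20.0)
    active = sum(1 for r in rms if r >= floor)
    return active / len(frames)


if __name__ == "__main__":
    # Hypothetical test signal: 0.1 s of a 200 Hz tone, then 0.1 s of silence (8 kHz).
    tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(800)]
    print(active_speech_ratio(tone + [0.0] * 800))
```

A reference signal with longer pauses between utterances yields a lower value, which is the characteristic the study varies.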
A systematic study of PESQ's behavior in simulated VoIP environment (from reference signal characteristics perspective)
Proceedings of …, 2008
Authors: Peter Počta (1), Miroslava Mrvová (1), Peter Korti (1), Peter Palúch (2), Martin Vaculík (1). (1) Dept. of Telecommunications ...
Derivation of Speech Activity Parameter Values in the Context of Speech Quality Testing
As many scientific papers have shown, time-varying impairments play a crucial role in VoIP applications. The reference signals used for speech quality assessment are, in turn, characterized by two parameters: the length of the signal and the speech activity parameter. The activity parameter is one of the important characteristics of reference signals for objective speech quality measurement defined in Section 7 of ITU-T Recommendation P.862.3 (and also considered in the brand-new ITU-T Recommendation P.863), and it has been shown to be a crucial input parameter for subjective and objective speech quality assessment in the presence of packet loss. Nevertheless, exact values of this parameter for different conversation scenarios are still missing. This study addresses this shortcoming by deriving a formula for computing the activity parameter of an arbitrary conversation scenario, and it demonstrates the usability of the proposed formula. Finally, it points out other issues related to creating reference speech samples for speech quality assessment (number of speech utterances, sample pattern, etc.) and potential application areas of the derived formula.
Novel parameter-based models estimating quality of synthesized speech transmitted over IP network based on Genetic Programming approach
2013 23rd International Conference Radioelektronika (RADIOELEKTRONIKA), 2013
In this paper, Genetic Programming (GP) based on a symbolic regression approach [1] is used to design parameter-based speech quality estimation models. In particular, the models are designed to estimate the quality of synthesized speech transmitted over an IP channel. The idea is to apply an appropriate set of quality-affecting parameters (e.g. parameters characterizing the packet loss process, the speech codec type and the type of synthesized speech) as the input of the designed estimation models. These parameters, together with the corresponding speech quality values predicted by PESQ (Perceptual Evaluation of Speech Quality) [2], are used to train the models, i.e. to define the relationship between the quality-affecting parameters and the speech quality values. Regarding the use of PESQ as the source of speech quality values, the experiments presented in [3] have shown that PESQ provides accurate predictions of the quality of synthesized speech subjected to the impairments used in this study. The study shows that all designed models provide accurate estimates of the quality of synthesized speech transmitted over an IP network. The accuracy of the estimates is quantified in terms of the Pearson correlation coefficient R, the root mean square error (rmse) and the epsilon-insensitive root mean square error (rmse*). The developed models can be useful to network operators and service providers in the planning phase or early development stage of telecommunication services based on synthesized speech.
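The three accuracy measures named in the abstract can be sketched as follows. The abstract does not give their definitions, so this assumes the forms commonly used when evaluating objective quality models: the error sums are divided by N - d degrees of freedom, and rmse* ignores the part of each error that falls inside a per-condition tolerance `eps` (often the 95% confidence interval of the reference score). The function names, the `dof` parameter and the example values are illustrative assumptions, not data from the paper.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def rmse(reference, estimate, dof=1):
    """Root mean square error with N - dof in the denominator (assumed form)."""
    n = len(reference)
    squared = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return math.sqrt(squared / (n - dof))


def rmse_star(reference, estimate, eps, dof=1):
    """Epsilon-insensitive rmse: errors smaller than eps[i] count as zero."""
    n = len(reference)
    # Only the part of each absolute error that exceeds the tolerance is kept.
    excess = [max(0.0, abs(r - e) - t) for r, e, t in zip(reference, estimate, eps)]
    return math.sqrt(sum(p * p for p in excess) / (n - dof))


if __name__ == "__main__":
    # Hypothetical quality scores: reference values vs. model estimates.
    ref = [1.0, 2.0, 3.0, 4.0]
    est = [1.1, 1.9, 3.2, 3.8]
    tol = [0.15] * 4  # assumed tolerance per condition
    print(pearson_r(ref, est), rmse(ref, est), rmse_star(ref, est, tol))
```

Because the tolerance can only shrink each error, rmse* never exceeds rmse for the same data and degrees of freedom.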
Quality estimation of synthesized speech transmitted over IP channel using genetic programming approach
The International Conference on Digital Technologies 2013, 2013
In this article, an evolutionary algorithm known as Genetic Programming (GP) is used to design a parametric speech quality estimation model. GP is currently one of the machine learning techniques employed in quality estimation. A set of quality-affecting parameters serves as the input to the GP-based estimation model, which estimates the quality of synthesized speech transmitted over an IP channel (VoIP environment). The performance results obtained by the designed model confirm the good properties of genetic programming, namely good accuracy and generalization ability, which make it a promising approach to estimating the quality of this type of speech in this environment. The developed model can be helpful to network operators and service providers in the planning phase or early development stage of telecommunication services based on synthesized speech.
This paper investigates PESQ's behavior under independent and dependent loss conditions from an Active-Speech-Ratio perspective in the presence of receiver-side comfort noise. This reference signal characteristic is defined very broadly by ITU-T Recommendation P.862.3, which is the reason to investigate its impact on speech quality prediction in more depth. We assess the variability of PESQ's predictions with respect to Active-Speech-Ratio and loss conditions, as well as their accuracy, by comparing the predictions with subjective assessments. Our results show that increasing the amount of speech in the reference signal (expressed by the Active-Speech-Ratio) may increase the sensitivity of the reference signal to changes in packet loss. Interestingly, we found two additional effects in this case: higher Active-Speech-Ratios may lead to a negative shift in the MOS domain and to a decline in the accuracy of PESQ's predictions. The prediction accuracy could improve with higher packet loss.
MESAQIN 2009, Jun 2009
This paper investigates PESQ's behavior under independent and dependent loss conditions from an Active-Speech-Ratio perspective. This reference signal characteristic is defined very broadly by ITU-T Recommendation P.862.3, which is the reason to investigate its impact on speech quality prediction in more depth. The ITU-T G.729AB encoding scheme is used in this study. We assess the variability of PESQ's predictions with respect to Active-Speech-Ratio and loss conditions. Our results show that increasing the amount of speech in the reference signal (expressed by the Active-Speech-Ratio) may increase the sensitivity of the reference signal to changes in packet loss and may also improve the accuracy of PESQ's predictions. The prediction accuracy could improve even further with higher packet loss.
Novel parameter-based models estimating quality of synthesized speech transmitted over IP network based on Genetic Programming approach
Radioelektronika 2013, Apr 2013
In this paper, Genetic Programming (GP) based on a symbolic regression approach [1] is used to design parameter-based speech quality estimation models. In particular, the models are designed to estimate the quality of synthesized speech transmitted over an IP channel. The idea is to apply an appropriate set of quality-affecting parameters (e.g. parameters characterizing the packet loss process, the speech codec type and the type of synthesized speech) as the input of the designed estimation models. These parameters, together with the corresponding speech quality values predicted by PESQ (Perceptual Evaluation of Speech Quality) [2], are used to train the models, i.e. to define the relationship between the quality-affecting parameters and the speech quality values. Regarding the use of PESQ as the source of speech quality values, the experiments presented in [3] have shown that PESQ provides accurate predictions of the quality of synthesized speech subjected to the impairments used in this study. The study shows that all designed models provide accurate estimates of the quality of synthesized speech transmitted over an IP network. The accuracy of the estimates is quantified in terms of the Pearson correlation coefficient R, the root mean square error (rmse) and the epsilon-insensitive root mean square error (rmse*). The developed models can be useful to network operators and service providers in the planning phase or early development stage of telecommunication services based on synthesized speech.
Digital Technologies (DT 2013), May 2013
In this article, an evolutionary algorithm known as Genetic Programming (GP) is used to design a parametric speech quality estimation model. GP is currently one of the machine learning techniques employed in quality estimation. A set of quality-affecting parameters serves as the input to the GP-based estimation model, which estimates the quality of synthesized speech transmitted over an IP channel (VoIP environment). The performance results obtained by the designed model confirm the good properties of genetic programming, namely good accuracy and generalization ability, which make it a promising approach to estimating the quality of this type of speech in this environment. The developed model can be helpful to network operators and service providers in the planning phase or early development stage of telecommunication services based on synthesized speech.
MESAQIN 2011, Jun 2011
As many scientific papers have shown, time-varying impairments play a crucial role in VoIP applications. The reference signals used for speech quality assessment are, in turn, characterized by two parameters: the length of the signal and the speech activity parameter. The activity parameter is one of the important characteristics of reference signals for objective speech quality measurement defined in Section 7 of ITU-T Recommendation P.862.3 (and also considered in the brand-new ITU-T Recommendation P.863), and it has been shown to be a crucial input parameter for subjective and objective speech quality assessment in the presence of packet loss. Nevertheless, exact values of this parameter for different conversation scenarios are still missing. This study addresses this shortcoming by deriving a formula for computing the activity parameter of an arbitrary conversation scenario, and it demonstrates the usability of the proposed formula. Finally, it points out other issues related to creating reference speech samples for speech quality assessment (number of speech utterances, sample pattern, etc.) and potential application areas of the derived formula.
This paper presents the design of parametric models that estimate the quality of synthesized speech transmitted through IP networks. Two machine learning techniques, Genetic Programming and Random Neural Networks, were used to design the models. A set of quality-affecting parameters serves as the input to the designed parametric estimation models, which estimate the quality of synthesized speech transmitted over IP networks (VoIP environment). The performance results obtained for the designed models validate both genetic programming and random neural networks as powerful techniques that deliver good accuracy and generalization ability, which makes them promising candidates for estimating the quality of this type of speech in this environment. The developed parametric models can be helpful to network operators and service providers in the planning phase or early development stage of telecommunication services based on synthesized speech.
In this work, we experimentally study how the behaviour of PESQ predictions varies with a reference signal characteristic. In particular, we investigate the impact of different Active-Speech-Ratios on speech quality prediction in a simulated VoIP environment from both the objective and the subjective testing point of view. This reference signal characteristic is defined very broadly by ITU-T Recommendation P.862.3, which is the reason to investigate its impact on speech quality prediction in more depth. We assess the variability of PESQ's predictions with respect to Active-Speech-Ratio and network conditions, as well as their accuracy, by comparing the predictions with subjective assessments. PACS no. 43.71.Gv, 43.72.Kb. Acta Acustica united with Acustica, Vol. 95 (2009).
MESAQIN 2008, Jun 2008
In this work, we experimentally study how the behaviour of the PESQ estimate varies with reference signal characteristics. In particular, we investigate the impact of different reference signal lengths and active speech ratios on speech quality estimation in a simulated VoIP environment. These two reference signal characteristics are defined very broadly by ITU-T Recommendation P.862.3, which is the reason to investigate their impact on speech quality estimation in more depth. We assess the variability of PESQ estimates with respect to the reference signal characteristics and network conditions, and finally offer some proposals for more accurate and reliable speech quality assessment in IP networks from the point of view of these reference signal characteristics.