[Hackathon 8th No.9] Reproduce the losses needed for DAC training in PaddleSpeech by cchenhaifeng · Pull Request #3988 · PaddlePaddle/PaddleSpeech

@cchenhaifeng

@paddle-bot

Thanks for your contribution!

@CLAassistant

CLA assistant check
All committers have signed the CLA.

@cchenhaifeng

image
These are the numerical results I got from testing locally; please check.

@cchenhaifeng

zxcd

"DDSP: Differentiable Digital Signal Processing."
International Conference on Learning Representations. 2019.
Implementation copied from: https://github.com/descriptinc/lyrebird-audiotools/blob/961786aa1a9d628cca0c0486e5885a457fe70c1a/audiotools/metrics/spectral.py

Why not use the develop branch?

def test_multi_scale_stft_loss():
    a = np.linspace(0, 199.98, 10000)

Suggest using a real wav file to test the loss.

def test_sisdr_loss():
    a = np.linspace(0, 199.98, 10000)

Same as above.

def __call__(self, x):
    return x * (-0.2)
a = np.linspace(0, 199.98, 10000)

Same as above.

weight : float, optional
Weight of this loss, defaults to 1.0.
Implementation copied from: https://github.com/descriptinc/lyrebird-audiotools/blob/961786aa1a9d628cca0c0486e5885a457fe70c1a/audiotools/metrics/distance.py

Same as above.

pow: float=2.0,
weight: float=1.0,
match_stride: bool=False,
window_type: str=None, ):

When the default value is None, suggest using the Optional type.
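The suggestion can be sketched roughly as follows. The helper name `describe_window` is hypothetical and only mirrors the `window_type` parameter from the quoted signature; it is not code from this PR.

```python
from typing import Optional, get_type_hints


def describe_window(window_type: Optional[str] = None) -> str:
    """Hypothetical helper: report which window a loss would use."""
    # Optional[str] tells readers and type checkers that None is a valid value.
    if window_type is None:
        return "default window"
    return f"{window_type} window"


# The annotation is now visible to tooling as Optional[str].
hints = get_type_hints(describe_window)
```

With a plain `str` annotation, a type checker would have to treat the `None` default as a special case; spelling out `Optional[str]` makes the intent explicit.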

def __init__(
        self,
        scaling: int=True,

int=True seems somewhat ambiguous
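A small sketch of why the annotation is ambiguous, assuming the flag is only used as an on/off switch. The function `scale_factor` is a hypothetical stand-in, not the loss constructor from the PR.

```python
def scale_factor(scaling: bool = True) -> float:
    """Hypothetical helper: the flag only toggles a behaviour on or off."""
    return 0.5 if scaling else 1.0


# bool is a subclass of int, which is why `scaling: int = True` is accepted
# by Python and by type checkers even though it reads as an integer parameter.
bool_is_int = issubclass(bool, int)
```

Annotating the parameter as `bool` matches the default value and tells callers that values such as `2` or `-1` carry no extra meaning.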

x: Union[AudioSignal, paddle.Tensor],
y: Union[AudioSignal, paddle.Tensor]):
eps = 1e-8
# nb, nc, nt

If you mean the tensor shape, suggest using (B, C, T).

I have modified it as requested. Please check it.

@zxcd

Are you interested in participating in the training of the DAC model (Hackathon 8th No.5) as well?

@cchenhaifeng

Let me take a look?

@cchenhaifeng

It seems Task 5 can only be started after Task 9 is finished? So should I wait until Task 9 is merged before starting? @zxcd

@luotao1

@cchenhaifeng

Yes, please complete Task 9 first.

Ah, Task 9 is already complete and ready for review.

@luotao1

zxcd

def get_input():
    x = np.array([

zxcd

@cchenhaifeng

I think we should ask the author of the audiotools PR to double-check it; there seems to be a problem with its imports. @luotao1

@luotao1

You can join the 如流 group to discuss.
image

@DrRyanHuang, please also take a look when you have time.

@DrRyanHuang

@cchenhaifeng
Currently, since audiotools exists independently within PaddleSpeech, its installation and testing are also conducted separately. If you wish to use audiotools, you can install it yourself by following the requirements specified in paddlespeech/audiotools/requirements.txt.

This is demonstrated in the testing of audiotools:
https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/tests/unit/audiotools/test_audiotools.sh#L1

I previously encountered the issue of cyclic imports as well. In this PR, you can fix it using deferred imports; I see that you’ve already done so. If there are any other issues, we can communicate at any time.
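A minimal, self-contained sketch of the deferred-import pattern mentioned above. The module names `mod_a` and `mod_b` are hypothetical and are written to a temporary directory only so the example can run on its own; they do not correspond to files in PaddleSpeech.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical module: imports mod_b at the top level.
MOD_A = textwrap.dedent('''
    from mod_b import describe

    NAME = "mod_a"

    def run():
        return describe()
''')

# Hypothetical module: needs mod_a, but imports it lazily inside the function.
MOD_B = textwrap.dedent('''
    def describe():
        # Deferred import: resolved at call time, after mod_a has finished
        # loading, so mod_b never sees a half-initialised mod_a.
        from mod_a import NAME
        return "called from " + NAME
''')


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "mod_a.py").write_text(MOD_A)
        Path(tmp, "mod_b.py").write_text(MOD_B)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            return importlib.import_module("mod_a").run()
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            sys.modules.pop("mod_a", None)
            sys.modules.pop("mod_b", None)


result = demo()
```

If `mod_b` instead ran `from mod_a import NAME` at module level, importing `mod_a` would fail because `NAME` is not yet defined when `mod_b` is executed. Moving the import into the function body postpones it until both modules are fully loaded.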

@cchenhaifeng

@cchenhaifeng

Okay, I'll get back to you if I have any other questions

@cchenhaifeng

image
Is it normal for the test to get stuck here all the time?

@DrRyanHuang

Well, it's not normal. Does the code pass the unit tests for utils.py on your local machine? If it does, then the issue might be with the CI system.

@cchenhaifeng

I haven't tried that; I've been testing test_losses.py. But I only changed how the packages are imported, not the logic, so in theory it should pass.

@cchenhaifeng

The local results are correct

@DrRyanHuang

The CI environment uses:

* Python 3.8

* paddlepaddle-gpu==2.5

Have you aligned your PaddlePaddle version with the CI environment?
We need to support both Paddle 2.5 and Paddle 2.6.

@cchenhaifeng

My Python is 3.12

@cchenhaifeng

No problem locally

@cchenhaifeng

@cchenhaifeng

@cchenhaifeng

image
All the test code has been verified. I'm re-running it now; please check. @zxcd

@cchenhaifeng

All the test code has been verified, please check @zxcd

zxcd

from .audio_signal import AudioSignal
from .audio_signal import STFTParams
from .loudness import Meter
from paddlespeech.t2s.modules import fft_conv1d

Use either relative or absolute import paths consistently within one file.

Done

function main(){
    set -ex
    speech_ci_path=`pwd`
    pip install ffmpeg flatten_dict ffmpy

test_audiotools.sh already installs these packages, so I don't think they need to be added twice?

Done

@@ -0,0 +1,61 @@
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.

2025

Done

x, y = get_input()
loss = MultiScaleSTFTLoss()
pd_loss = loss(x, y)
np.allclose(pd_loss.numpy(), 7.5622)

Can the accuracy be aligned to 1e-6?

Done
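To see what tightening the tolerance changes, here is a small arithmetic sketch. The helper `within_tolerance` is a hypothetical scalar stand-in that follows the comparison rule documented for `np.allclose`, `|a - b| <= atol + rtol * |b|`, with NumPy's default `rtol=1e-05` and `atol=1e-08`. The value 7.562150 is assumed here to be the full-precision loss.

```python
def within_tolerance(a: float, b: float,
                     rtol: float = 1e-05, atol: float = 1e-08) -> bool:
    """Hypothetical scalar version of the documented np.allclose rule."""
    return abs(a - b) <= atol + rtol * abs(b)


computed = 7.562150  # assumed full-precision loss value
rounded = 7.5622     # four-decimal reference from the quoted test line

# Difference is about 5e-5; the default tolerance is about 7.6e-5.
loose = within_tolerance(computed, rounded)

# With rtol=1e-06 the tolerance shrinks to about 7.6e-6.
strict = within_tolerance(computed, rounded, rtol=1e-06)
```

Under these assumptions the four-decimal reference passes with the default tolerance but not at `rtol=1e-06`, so aligning to 1e-6 also requires a reference value with more digits.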

noise = (e_res**2).sum(axis=1)
sdr = -10 * paddle.log10(signal / noise + eps)
if self.clip_min is not None:

Should we also add a check for self.clip_min != 'None'?

Done
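For context, the lines quoted in this thread are the tail of a scale-invariant SDR computation. Below is a NumPy sketch of that computation, assuming (B, C, T) inputs and the usual SI-SDR definition; the function name `sisdr_loss` and its defaults are illustrative and this is not the Paddle implementation from the PR.

```python
import numpy as np


def sisdr_loss(references, estimates, scaling=True, zero_mean=True,
               clip_min=None, eps=1e-8):
    """Negative SI-SDR in dB, averaged over the batch (illustrative sketch)."""
    nb = references.shape[0]
    # Flatten channels and time into one sample axis: (B, C*T, 1).
    ref = references.reshape(nb, -1, 1)
    est = estimates.reshape(nb, -1, 1)
    if zero_mean:
        ref = ref - ref.mean(axis=1, keepdims=True)
        est = est - est.mean(axis=1, keepdims=True)
    # Project the estimate onto the reference to find the optimal scale.
    ref_energy = (ref ** 2).sum(axis=1, keepdims=True) + eps
    cross = (est * ref).sum(axis=1, keepdims=True) + eps
    scale = cross / ref_energy if scaling else 1.0
    e_true = scale * ref
    e_res = est - e_true
    signal = (e_true ** 2).sum(axis=1)
    noise = (e_res ** 2).sum(axis=1)
    sdr = -10 * np.log10(signal / noise + eps)
    if clip_min is not None:
        # Stop the loss from decreasing below clip_min.
        sdr = np.maximum(sdr, clip_min)
    return float(sdr.mean())


t = np.linspace(0.0, 1.0, 1000)
reference = np.sin(2 * np.pi * 5 * t).reshape(1, 1, -1)  # (B, C, T)
estimate = reference + 0.5 * np.cos(2 * np.pi * 50 * t)  # noisy copy
loss = sisdr_loss(reference, estimate)
```

Because the estimate is projected onto the reference before the residual is measured, multiplying `estimate` by a constant leaves the loss essentially unchanged. `clip_min` only applies when it is not `None`, which is the branch discussed in the review comment.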

@cchenhaifeng

@cchenhaifeng

@cchenhaifeng

CI has fully passed. Circular imports would cause errors, so I used deferred imports. @zxcd

zxcd

x, y = get_input()
loss = MultiScaleSTFTLoss()
pd_loss = loss(x, y)
np.allclose(pd_loss.numpy(), 7.562150, rtol=1e-06)

Can you put the audiotools code that generated 7.562150 in a comment for easy verification?

Done

Where are the code comments?

image
Here

@cchenhaifeng

@cchenhaifeng

@cchenhaifeng

image
As you can see, test_util gets stuck in a loop after the deletion. @zxcd

zxcd

x, y = get_input()
loss = GANLoss(My_discriminator0())
pd_loss0, pd_loss1 = loss(x, y)
np.allclose(pd_loss0.numpy(), -0.102722, rtol=1e-06)

Use self.assertEqual, or wrap np.allclose in an assert.

Done
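The point of this review comment can be illustrated with a short sketch. The variable `value` is a stand-in for `pd_loss.numpy()`; `np.allclose` and `np.testing.assert_allclose` are standard NumPy functions.

```python
import numpy as np

value = np.array(7.562150)  # stand-in for pd_loss.numpy()

# A bare call only returns a bool. The result is discarded in a test body,
# so a wrong value would go unnoticed.
ignored = np.allclose(value, 0.0, rtol=1e-06)

# Wrapping the call in assert makes the test fail on a mismatch.
assert np.allclose(value, 7.562150, rtol=1e-06)

# np.testing.assert_allclose raises AssertionError with a descriptive message.
try:
    np.testing.assert_allclose(value, 0.0, rtol=1e-06)
    raised = False
except AssertionError:
    raised = True
```

Either the `assert` form or `np.testing.assert_allclose` turns the comparison into a real check; the latter also reports the mismatching values when it fails.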

@cchenhaifeng

zxcd

zxcd approved these changes Feb 26, 2025

LGTM

@zxcd zxcd mentioned this pull request

Feb 27, 2025

@luotao1

hi, @cchenhaifeng

@luotao1

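Hello @cchenhaifeng, the prizes for the 8th Hackathon are being paid out. Please contact us as soon as possible! Add our operations staff on WeChat (Emma718009) and include your GitHub ID in the note.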
