[Hackathon 8th No.9] Reproduce the losses needed for DAC training in PaddleSpeech by cchenhaifeng · Pull Request #3988 · PaddlePaddle/PaddleSpeech
Thanks for your contribution!
All committers have signed the CLA.
| "DDSP: Differentiable Digital Signal Processing." |
|---|
| International Conference on Learning Representations. 2019. |
| Implementation copied from: https://github.com/descriptinc/lyrebird-audiotools/blob/961786aa1a9d628cca0c0486e5885a457fe70c1a/audiotools/metrics/spectral.py |
Why not use the develop branch?
| def test_multi_scale_stft_loss(): |
| a = np.linspace(0, 199.98, 10000) |
I suggest testing the loss with a real WAV file.
| def test_sisdr_loss(): |
| a = np.linspace(0, 199.98, 10000) |
Same as above.
| def __call__(self, x): |
|---|
| return x * (-0.2) |
| a = np.linspace(0, 199.98, 10000) |
Same as above.
| weight : float, optional |
|---|
| Weight of this loss, defaults to 1.0. |
| Implementation copied from: https://github.com/descriptinc/lyrebird-audiotools/blob/961786aa1a9d628cca0c0486e5885a457fe70c1a/audiotools/metrics/distance.py |
Same as above.
| pow: float=2.0, |
|---|
| weight: float=1.0, |
| match_stride: bool=False, |
| window_type: str=None, ): |
When the default value is None, I suggest annotating the parameter with the Optional type.
| def __init__( |
| self, |
| scaling: int=True, |
int=True seems somewhat ambiguous
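The two typing remarks in this part of the review (Optional for a None default, and a less ambiguous annotation than `int=True`) could be addressed roughly as in the sketch below. The class `LossOptions` is a hypothetical container invented for this illustration; it merges parameters that belong to different losses in the PR and is not the PR's actual code.

```python
from typing import Optional, get_type_hints


class LossOptions:
    """Hypothetical container, used only to illustrate the annotations."""

    def __init__(
            self,
            pow: float = 2.0,
            weight: float = 1.0,
            match_stride: bool = False,
            # None is an accepted value, so the annotation says so explicitly.
            window_type: Optional[str] = None,
            # A True/False switch is annotated as bool rather than int.
            scaling: bool = True, ):
        self.pow = pow
        self.weight = weight
        self.match_stride = match_stride
        self.window_type = window_type
        self.scaling = scaling


# The resolved hints can be inspected at runtime.
hints = get_type_hints(LossOptions.__init__)
print(hints["window_type"], hints["scaling"])
```

Writing the annotation as `Optional[str]` documents that callers may pass None, and `bool` makes it clear that `scaling` is a switch rather than a numeric factor.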
| x: Union[AudioSignal, paddle.Tensor], |
|---|
| y: Union[AudioSignal, paddle.Tensor]): |
| eps = 1e-8 |
| # nb, nc, nt |
If this refers to the tensor shape, I suggest writing (B, C, T).
I have modified it as requested. Please check it.
Are you interested in participating in the training of the DAC model (Hackathon 8th No.5) as well?
Let me take a look?
Task 5 seems to require Task 9 to be finished first? So should I wait until Task 9 is merged before starting? @zxcd
Yes, please finish Task 9 first.
Ah, Task 9 is already finished and ready for review.
| def get_input(): |
| x = np.array([ |
I think the person who made the audiotools PR should double-check it; the imports there seem to have a problem. @luotao1
@DrRyanHuang, please also take a look when you have time.
@cchenhaifeng
Currently, since audiotools exists independently within PaddleSpeech, its installation and testing are also conducted separately. If you wish to use audiotools, you can install it yourself by following the requirements specified in paddlespeech/audiotools/requirements.txt.
This is demonstrated in the testing of audiotools:
https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/tests/unit/audiotools/test_audiotools.sh#L1
I previously encountered the issue of cyclic imports as well. In this PR, you can fix it using deferred imports; I see that you’ve already done so. If there are any other issues, we can communicate at any time.
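For readers unfamiliar with the deferred-import fix mentioned here, the following is a minimal sketch. The module names `demo_a` and `demo_b` are hypothetical and are written to a temporary directory only for the demonstration; they are not the modules touched by this PR.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# demo_a imports demo_b at module level.
MOD_A = '''
from demo_b import helper


def scale(x):
    return 2 * x


def run(x):
    return helper(x)
'''

# demo_b needs demo_a as well. A top-level "from demo_a import scale" here
# would run while demo_a is only partially initialised and raise ImportError.
# Moving the import into the function defers it until the first call.
MOD_B = '''
def helper(x):
    from demo_a import scale  # deferred import
    return scale(x) + 1
'''


def run_demo():
    """Write both modules to a temporary directory, import them, call run()."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo_a.py").write_text(MOD_A)
        Path(tmp, "demo_b.py").write_text(MOD_B)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            demo_a = importlib.import_module("demo_a")
            return demo_a.run(3)
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            sys.modules.pop("demo_a", None)
            sys.modules.pop("demo_b", None)


print(run_demo())
```

The trade-off is that the dependency is hidden inside the function body, so a short comment at the import site helps later readers.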
Okay, I'll get back to you if I have any other questions

Is it normal for the test to get stuck here all the time?
Well, it's not normal. Does the code pass the unit tests for utils.py on your local machine? If it does, then the issue might be with the CI system.
I haven't tried that. I'm testing test_losses.py, but I only changed how the packages are imported, not the logic, so in theory it should pass.
The local results are correct
The CI environment uses:
- Python 3.8
- paddlepaddle-gpu==2.5
Have you aligned your PaddlePaddle version with the CI environment?
We need to support both Paddle 2.5 and Paddle 2.6.
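One way to compare a local setup with the CI settings quoted above is sketched below. The helper names are made up for this example, and the assumption that Paddle is installed under the distribution name `paddlepaddle-gpu` or `paddlepaddle` should be checked against the local installation.

```python
import sys
from importlib import metadata


def describe_environment(packages=("paddlepaddle-gpu", "paddlepaddle")):
    """Return the Python version and the installed version of each package."""
    report = {"python": "{}.{}".format(*sys.version_info[:2])}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # not installed in this environment
    return report


def matches_ci(report, python="3.8", paddle_prefix="2.5"):
    """True when the report matches the CI settings quoted in the thread."""
    paddle = report.get("paddlepaddle-gpu") or report.get("paddlepaddle")
    return (report["python"] == python and bool(paddle) and
            paddle.startswith(paddle_prefix))


report = describe_environment()
print(report, matches_ci(report))
```

A mismatch does not prove that the version difference is the cause of a CI failure, but it narrows down where to look.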
My Python is 3.12
No problem locally

All the test code has been verified and I am re-running CI. Please check. @zxcd
| from .audio_signal import AudioSignal |
|---|
| from .audio_signal import STFTParams |
| from .loudness import Meter |
| from paddlespeech.t2s.modules import fft_conv1d |
Please use either relative or absolute imports consistently within one file.
Done
| function main(){ |
|---|
| set -ex |
| speech_ci_path=`pwd` |
| pip install ffmpeg flatten_dict ffmpy |
test_audiotools.sh already installs these packages, so I don't think they need to be installed a second time here.
Done
| @@ -0,0 +1,61 @@ |
|---|
| # Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved. |
This should be 2025.
Done
| x, y = get_input() |
|---|
| loss = MultiScaleSTFTLoss() |
| pd_loss = loss(x, y) |
| np.allclose(pd_loss.numpy(), 7.5622) |
Can the result be matched to a tolerance of 1e-6?
Done
| noise = (e_res**2).sum(axis=1) |
|---|
| sdr = -10 * paddle.log10(signal / noise + eps) |
| if self.clip_min is not None: |
Should this also check self.clip_min != 'None'?
Done
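For context on the snippet under review, here is a NumPy sketch of a scale-invariant SDR loss that ends with the same `signal`, `noise`, `sdr`, and `clip_min` steps. The steps before those lines, the function name, the parameter defaults, and the mean reduction are assumptions made for this illustration; this is not the Paddle code from the PR.

```python
import numpy as np


def sisdr_loss(reference, estimate, scaling=True, zero_mean=True,
               clip_min=None, eps=1e-8):
    """Negative SI-SDR in dB, averaged over the batch. Inputs are (B, C, T)."""
    nb = reference.shape[0]
    # Flatten channels and time so that samples lie on axis 1: (B, N).
    ref = reference.reshape(nb, -1).astype(np.float64)
    est = estimate.reshape(nb, -1).astype(np.float64)
    if zero_mean:
        ref = ref - ref.mean(axis=1, keepdims=True)
        est = est - est.mean(axis=1, keepdims=True)

    # Project the estimate onto the reference to obtain the optimal scale.
    ref_energy = (ref**2).sum(axis=1, keepdims=True) + eps
    cross = (est * ref).sum(axis=1, keepdims=True) + eps
    scale = cross / ref_energy if scaling else 1.0

    e_true = scale * ref  # target component
    e_res = est - e_true  # residual (distortion) component

    signal = (e_true**2).sum(axis=1)
    noise = (e_res**2).sum(axis=1)
    sdr = -10 * np.log10(signal / noise + eps)
    if clip_min is not None:
        # Lower-bound the loss so very clean examples stop dominating.
        sdr = np.maximum(sdr, clip_min)
    return sdr.mean()


# A sine reference plus an orthogonal cosine at one tenth of the amplitude:
# the energy ratio is 100, so the loss is approximately -20.
t = np.linspace(0, 20 * np.pi, 16000, endpoint=False)
x = np.sin(t).reshape(1, 1, -1)
y = x + 0.1 * np.cos(t).reshape(1, 1, -1)
print(sisdr_loss(x, y))
print(sisdr_loss(x, y, clip_min=-10.0))
```

Because `clip_min` defaults to None in this sketch, the `is not None` test is what decides whether clipping is applied.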
CI has now fully passed. Cyclic imports cause errors, so deferred imports are used instead. @zxcd
| x, y = get_input() |
|---|
| loss = MultiScaleSTFTLoss() |
| pd_loss = loss(x, y) |
| np.allclose(pd_loss.numpy(), 7.562150, rtol=1e-06) |
Can you put the audiotools code that produced 7.562150 in a comment, so the value is easy to verify?
Done
Where are the code comments?

As you can see, test_util gets stuck in a loop after the deletion. @zxcd
| x, y = get_input() |
|---|
| loss = GANLoss(My_discriminator0()) |
| pd_loss0, pd_loss1 = loss(x, y) |
| np.allclose(pd_loss0.numpy(), -0.102722, rtol=1e-06) |
Use self.assertEqual, or wrap np.allclose in an assert.
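The reason for this request is that a bare `np.allclose(...)` call only returns a boolean, so the test cannot fail. A small sketch of the difference follows; the helper `check_loss` and the value 7.5621504 are made up for the illustration and stand in for `pd_loss.numpy()`.

```python
import numpy as np


def check_loss(value, expected, rtol=1e-6):
    """Fail with AssertionError when value is not within rtol of expected."""
    assert np.allclose(value, expected, rtol=rtol), (value, expected)


computed = np.array(7.5621504)  # hypothetical stand-in for pd_loss.numpy()

# Returns True or False but checks nothing, so a mismatch goes unnoticed.
np.allclose(computed, 7.562150, rtol=1e-06)

# Either of these makes the test fail on a mismatch.
check_loss(computed, 7.562150)
np.testing.assert_allclose(computed, 7.562150, rtol=1e-06)
```

`np.testing.assert_allclose` also reports the mismatching values in its error message, which can make CI logs easier to read.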
Done
zxcd approved these changes Feb 26, 2025
LGTM
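Hi, @cchenhaifeng
- Thank you very much for your contribution to PaddlePaddle. We run an organization called PFCC, the PaddlePaddle open-source contributor club. Only developers who have had code merged into PaddlePaddle can join. The club holds a regular meeting every two weeks (attendance is optional) and occasionally organizes offline meetups. See the profile page at https://github.com/luotao1 for details.
- If you are interested in PFCC, please send an email to ext_paddle_oss@baidu.com and we will invite you to join.
@cchenhaifeng Hello, the prize money for the 8th Hackathon is being paid out, so please contact us as soon as possible. Add our operations staff on WeChat (Emma718009) and include your GitHub ID in the note.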


