Adding WavLM implementation by jiamingkong · Pull Request #3242 · PaddlePaddle/PaddleSpeech

[TTS]add Diffsinger with opencpop dataset (#3005)
Update requirements.txt
fix vits reduce_sum's input/output dtype, test=tts (#3028)
[TTS] add opencpop PWGAN example (#3031)
add opencpop voc, test=tts
soft link
Update textnorm_test_cases.txt
[TTS] add opencpop HIFIGAN example (#3038)
add opencpop voc, test=tts
soft link
add opencpop hifigan, test=tts
update
fix dtype diff of last expand_v2 op of VITS (#3041)
[ASR]add squeezeformer model (#2755)
add squeezeformer model
change CodeStyle, test=asr
change CodeStyle, test=asr
fix subsample rate error, test=asr
merge classes as required, test=asr
change CodeStyle, test=asr
fix missing code, test=asr
split code to new file, test=asr
remove rel_shift, test=asr
Update README.md
Update README_cn.md
Update README.md
Update README_cn.md
Update README.md
fix input dtype of elementwise_mul op from bool to int64 (#3054)
[TTS] add svs frontend (#3062)
[TTS]clean starganv2 vc model code and add docstring (#2987)
clean code
add docstring
[Doc] change define asr server config to chunk asr config, test=doc (#3067)
Update README.md
Update README_cn.md
get music score, test=doc (#3070)
[TTS]fix elementwise_floordiv's fill_constant (#3075)
fix elementwise_floordiv's fill_constant
add float converter for min_value in attention
fix paddle2onnx's install version, install the newest paddle2onnx in run.sh (#3084)
[TTS] update svs_music_score.md (#3085)
rm unused dep, test=tts (#3097)
Update bug-report-tts.md (#3120)
[TTS]Fix VITS lite infer (#3098)
[TTS]add starganv2 vc trainer (#3143)
add starganv2 vc trainer
fix StarGANv2VCUpdater and losses
fix StarGANv2VCEvaluator
add some typehint
[TTS]【Hackathon + No.190】 + Model reproduction: iSTFTNet (#3006)
iSTFTNet implementation based on hifigan, not affect the function and execution of HIFIGAN
modify the comment in iSTFT.yaml
add the comments in hifigan
add iSTFTNet.md
modify the format of iSTFTNet.md
modify iSTFT.yaml and hifigan.py
Format code using pre-commit
modify hifigan.py,delete the unused self.istft_layer_id , move the self.output_conv behind else, change conv_post to output_conv
update iSTFTNet_csmsc_ckpt.zip download link
modify iSTFTNet.md
modify hifigan.py and iSTFT.yaml
modify iSTFTNet.md
add function for generating srt file (#3123)
add function for generating srt file

Building on the original websocket_client.py, added the ability to generate an srt subtitle file from a wav or mp3 audio file
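
As a rough illustration of the subtitle step described in this commit, the sketch below turns timed text segments into SRT blocks. It is not the code added to websocket_client.py: the function names `format_srt_time` and `to_srt` and the `(start, end, text)` segment tuples are hypothetical, chosen only to show the SRT layout (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line).

```python
def format_srt_time(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples.

    The tuple layout is an assumption of this sketch, not the format
    returned by the streaming ASR server.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_srt_time(start)} --> {format_srt_time(end)}\n"
            f"{text}\n"
        )
    # SRT entries are separated by a blank line.
    return "\n".join(blocks)
```

Writing the returned string to a `.srt` file with UTF-8 encoding would give a file that common players accept, assuming the segment times come from the recognizer.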

keep origin websocket_client.py

Restore the original websocket_client.py file

add generating subtitle function into README
add generate subtitle function into README
add subtitle generation function
add subtitle generation function
fix example/aishell local/train.sh if condition bug, test=asr (#3146)
fix some preprocess bugs (#3155)
add amp for U2 conformer.
fix scaler save
fix scaler save and load.
mv scaler.unscale_ below grad_clip.
[TTS]add StarGANv2VC preprocess (#3163)
[TTS] [Hackathon] Add JETS (#3109)
Update quick_start.md (#3175)
[BUG] Fix progress bar unit. (#3177)
Update quick_start_cn.md (#3176)
[TTS]StarGANv2 VC fix some trainer bugs, add reset_parameters (#3182)
VITS learning rate revised, test=tts
VITS learning rate revised, test=tts
[s2t] mv dataset into paddlespeech.dataset (#3183)
mv dataset into paddlespeech.dataset
add aidatatang
fix import
Fix some typos. (#3178)
[s2t] move s2t data preprocess into paddlespeech.dataset (#3189)
move s2t data preprocess into paddlespeech.dataset
avg model, compute wer, format rsl into paddlespeech.dataset
fix format rsl
fix avg ckpts
Update pretrained model in README (#3193)
[TTS]Fix losses of StarGAN v2 VC (#3184)
VITS learning rate revised, test=tts
VITS learning rate revised, test=tts
add new aishell model for better CER.
add readme
[s2t] fix cli args to config (#3194)
fix cli args to config
fix train cli
Update README.md
[ASR] Support Hubert, fine-tuned on the librispeech dataset (#3088)
librispeech hubert, test=asr
librispeech hubert, test=asr
hubert decode
review
copyright, notes, example related
hubert cli
pre-commit format
fix conflicts
fix conflicts
doc related
doc and train config
librispeech.py
support hubert cli
[ASR] fix asr 0-d tensor. (#3214)
Update README.md
Update README.md
fix: 🐛 Fix server-side python ASREngine being unable to use the conformer_talcs model (#3230)
fix: 🐛 fix python ASREngine not pass codeswitch
docs: 📝 Update Docs
Change how the model type is determined
Adding WavLM implementation
fix model m5s
Code clean up according to comments in #3242
fix error in tts/st
Changed the path for the uploaded weight
Update phonecode.py

Fix the landline phone number regex error

Following https://github.com/speechio/chinese_text_normalization/blob/master/python/cn_tn.py, the landline regex is: pattern = re.compile(r"\D((0(10|2[1-3]|[3-9]\d{2})-?)?[1-9]\d{6,7})\D")
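
To make the behavior of the pattern quoted above easier to check, here is a minimal sketch that applies it to a few strings. Only the regex itself comes from the commit message; the helper name `find_landline` and the space padding are assumptions of this sketch and do not describe how phonecode.py calls the pattern.

```python
import re

# Pattern quoted in the commit message above.
LANDLINE = re.compile(r"\D((0(10|2[1-3]|[3-9]\d{2})-?)?[1-9]\d{6,7})\D")


def find_landline(text):
    """Return the first landline-looking number in `text`, or None."""
    # The pattern requires a non-digit character on both sides of the
    # number, so this sketch pads the text with spaces (an assumption)
    # to also catch numbers at the very start or end of the string.
    match = LANDLINE.search(f" {text} ")
    return match.group(1) if match else None
```

Under these assumptions, `find_landline("Call 010-62345678 now")` yields `"010-62345678"`, while an 11-digit mobile number such as `"13812345678"` yields `None` because the digit run is too long to be bounded by non-digits.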

Adapted wavlmASR model to pretrained weights and CLI
Changed the MD5 of the pretrained tar file due to bug fixes
Deleted examples/librispeech/asr5/format_rsl.py
Update released_model.md
Code clean up for CIs
Fixed the transpose usages ignored before
Update setup.py
refactor mfa scripts
Final cleaning; Modified SSL/infer.py and README for wavlm inclusion in model options
updating readme and readme_cn
remove tsinghua pypi
Update setup.py (#3294)
Update setup.py
refactor rhy
fix ckpt
add dtype param for arange API. (#3302)
add scripts for tts code switch
add t2s assets
more comment on tts frontend
fix librosa==0.8.1 numpy==1.23.5 for paddleaudio align with this version
move ssl into t2s.frontend; fix spk_id for 0-D tensor;
add ssml unit test
add en_frontend file
add mix frontend test
fix long text oom using ssml; filter comma; update polyphonic
remove print
hotfix english G2P
en frontend unit test
fix profiler (#3323)
old grad clip has 0d tensor problem, fix it (#3334)
update to py3.8
remove fluid.
add roformer
fix bugs
add roformer result
support position interpolation for longer attention context window length.
RoPE with position interpolation
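
For readers unfamiliar with the technique named in these commits, the following is a generic sketch of rotary position embedding (RoPE) with position interpolation, written in plain Python. It is not the PaddleSpeech implementation: the function names, the interleaved pairing of features, the `base` of 10000 and the `train_len / target_len` scaling rule are assumptions of this sketch.

```python
import math


def apply_rope(vec, position, base=10000.0, scale=1.0):
    """Rotate consecutive feature pairs of `vec` by position-dependent angles.

    `scale` multiplies the position before the angles are computed:
    scale=1.0 is plain RoPE, scale<1.0 is position interpolation.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even feature dimension")
    out = [0.0] * dim
    for i in range(dim // 2):
        # Each pair (2i, 2i+1) uses its own rotation frequency.
        theta = (position * scale) / (base ** (2 * i / dim))
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[2 * i], vec[2 * i + 1]
        out[2 * i] = x * cos_t - y * sin_t
        out[2 * i + 1] = x * sin_t + y * cos_t
    return out


def interpolation_scale(train_len, target_len):
    """Shrink positions so `target_len` steps map into the trained range."""
    return min(1.0, train_len / target_len)
```

With a scale of 0.5, position 8 is rotated exactly as position 4 would be without interpolation, which is the point of the method: longer inputs reuse the angle range seen during training instead of extrapolating beyond it.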
rope for streaming decoding
update result
fix rotary embedding
Update README.md
fix weight decay
fix develop view conflict with model's
Add XPU support for SpeedySpeech (#3502)
Add XPU support for SpeedySpeech
fix typos
update description of nxpu
Add XPU support for FastSpeech2 (#3514)
Add XPU support for FastSpeech2
optimize
Update ge2e_clone.py (#3517)

Fix the extra-whitespace error on Windows

Fix Readme. (#3527)
Update README.md
Update README_cn.md
Update README_cn.md
Update README.md
FIX: Added missing imports
FIX: Fixed the implementation of a special method
【benchmark】add max_mem_reserved for benchmark (#3604)
fix profiler
add max_mem_reserved for benchmark
fix develop bug function:view to reshape (#3633)
【benchmark】fix gpu_mem unit (#3634)
fix profiler
add max_mem_reserved for benchmark
fix benchmark
Specify the file encoding when reading files (#3606)

Fixed #3605

bugfix: audio_len should be 1D, not 0D; a 0D value raises a list index out of range error in the following decode process (#3490)

Co-authored-by: Luzhenhui luzhenhui@mqsz.com

Update README.md (#3532)

Fixed a typo

fixed version for paddlepaddle. (#3701)
fixed version for paddlepaddle.
fix code style
【Fix Speech Issue No.5】issue 3444 transformation import error (#3779)
fix paddlespeech.s2t.transform.transformation import error
fix paddlespeech.s2t.transform import error
【Fix Speech Issue No.8】issue 3652 merge_yi function has a bug (#3786)
【Fix Speech Issue No.8】issue 3652 merge_yi function has a bug
【Fix Speech Issue No.8】issue 3652 merge_yi function has a bug
【test】add cli test readme (#3784)
add cli test readme
fix code style
【test】fix test cli bug (#3793)
add cli test readme
fix code style
fix bug
Update setup.py (#3795)
adapt view behavior change, fix KeyError. (#3794)
adapt view behavior change, fix KeyError.
fix readme demo run error.
fixed opencc version

Co-authored-by: liangym 34430015+lym0302@users.noreply.github.com
Co-authored-by: TianYuan white-sky@qq.com
Co-authored-by: 夜雨飘零 yeyupiaoling@foxmail.com
Co-authored-by: zxcd 228587199@qq.com
Co-authored-by: longRookie 68834517+longRookie@users.noreply.github.com
Co-authored-by: twoDogy 128727742+twoDogy@users.noreply.github.com
Co-authored-by: lemondy lemondy9@gmail.com
Co-authored-by: ljhzxc 33015549+ljhzxc@users.noreply.github.com
Co-authored-by: PiaoYang 495384481@qq.com
Co-authored-by: WongLaw mailoflawrence@gmail.com
Co-authored-by: Hui Zhang zhtclz@foxmail.com
Co-authored-by: Shuangchi He 34329208+Yulv-git@users.noreply.github.com
Co-authored-by: TianHao Zhang 32243340+Zth9730@users.noreply.github.com
Co-authored-by: guanyc guanyc@gmail.com
Co-authored-by: jiamingkong kinetical@live.com
Co-authored-by: zoooo0820 zoooo0820@qq.com
Co-authored-by: shuishu 990941859@qq.com
Co-authored-by: LixinGuo 18510030324@126.com
Co-authored-by: gmm 38800877+mmglove@users.noreply.github.com
Co-authored-by: Wang Huan wanghuan29@baidu.com
Co-authored-by: Kai Song 50285351+USTCKAY@users.noreply.github.com
Co-authored-by: skyboooox zcj924@gmail.com
Co-authored-by: fazledyn-or ataf@openrefactory.com
Co-authored-by: luyao-cv 1367355728@qq.com
Co-authored-by: Color_yr 402067010@qq.com
Co-authored-by: JeffLu luzhenhui@gmail.com
Co-authored-by: Luzhenhui luzhenhui@mqsz.com
Co-authored-by: satani99 42287151+satani99@users.noreply.github.com
Co-authored-by: mjxs 52824616+kk-2000@users.noreply.github.com
Co-authored-by: Mattheliu leonliuzx@outlook.com
