Releases · Sharrnah/whispering

v1.3.18.10

v1.3.18.9

v1.3.18.6

v1.3.18.4

Important:

Running this release directly requires a lot of configuration. The recommended way is to use the UI application, https://github.com/Sharrnah/whispering-ui, which downloads this release automatically.

Standalone Release File (5.0 GB):

Download Server:

Changelog (v1.3.18.4)

[FEATURE] Add zonos hybrid model support
[FEATURE] Add triton jit compiler support
[FEATURE] Add microphone passthrough
[TASK] Remove debug output in orpheus TTS
[TASK] Add call to fix possible CUDNN_STATUS_SUBLIBRARY_VERSION_MISMATCH error
[TASK] Send activity notification every 2 seconds while speaking
[TASK] Print used cuDNN version on startup
[TASK] Simplify realtime translation condition
[TASK] Send timestamps setting to nemo_canary model
[TASK] Optimize sentence splitting for kokoro tts
[TASK] Update transformers library
[TASK] Unify function arguments for kokoro
[TASK] Add mamba_ssm dependency
[TASK] Enhance Zonos TTS with queue, streaming improvements & text splitting
[TASK] Update requirements to working versions
[TASK] Phi4 code improvements
[TASK] Additional cleanup of Phi-4 translation results
[TASK] Check if cuda is available before loading triton backend for zonos tts
[TASK] Add osc_force_activity_indication setting
[TASK] Keep line-breaks when splitting osc messages due to length
[TASK] Add returning of probabilities for language detections
[TASK] Add language code converter
[TASK] Save and load tts specific settings
[BUGFIX] Add unidic dependency for kokoro
[BUGFIX] Run kokoro tts on configured ai device
[BUGFIX] Run orpheus tts on configured ai device
[BUGFIX] Orpheus tts compatibility with newest transformers version
[BUGFIX] Add missing libraries espeakng_loader and unidic_lite
[BUGFIX] espeak usage with Zonos
[BUGFIX] Use configured seed for streamed playback in zonos tts
[BUGFIX] Implement Phi4 patch for transformers versions >= 4.50
[BUGFIX] Correct torchaudio dependency for pytorch version
[BUGFIX] Exit child processes on exit
[BUGFIX] Fix text translation realtime sync
[BUGFIX] Correctly save and load modified dictionaries in settings

Full Changelog: v1.3.17.3...v1.3.18.4
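
Illustration for "[TASK] Keep line-breaks when splitting osc messages due to length": a minimal sketch of one way to split a long message into length-limited chunks without losing its line breaks. This is not the project's implementation. The function name split_message, the max_len parameter and its default of 144 characters are assumptions made for this example only.

```python
def split_message(text: str, max_len: int = 144) -> list[str]:
    """Split text into chunks of at most max_len characters.

    Hypothetical helper: line breaks stay inside the chunks, so joining
    the chunks again reproduces the original text exactly.
    """
    chunks: list[str] = []
    current = ""
    # keepends=True keeps the trailing "\n" attached to every line.
    for line in text.splitlines(keepends=True):
        # A single line longer than max_len is cut into max_len-sized pieces.
        pieces = [line[i:i + max_len] for i in range(0, len(line), max_len)]
        for piece in pieces:
            # Start a new chunk when the next piece would exceed the limit.
            if current and len(current) + len(piece) > max_len:
                chunks.append(current)
                current = ""
            current += piece
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for part in split_message("first line\nsecond line\nthird line", max_len=24):
        print(repr(part))
```

Splitting on whole lines first means a chunk boundary falls on a line break whenever one is available, and a line is only cut in the middle when it alone exceeds the limit.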

v1.3.17.3

Important:

Running this release directly requires a lot of configuration. The recommended way is to use the UI application, https://github.com/Sharrnah/whispering-ui, which downloads this release automatically.

Standalone Release File (4.3 GB):

Download Server:

Changelog (v1.3.17.3)

[FEATURE] Update PyTorch and flash-attn to versions that support NVIDIA 50x (Blackwell) GPUs
[FEATURE] Add support for flash and normal canary models
[FEATURE] Add plugin settings reset function.
[FEATURE] Add Orpheus TTS
[FEATURE] Add audio streaming to Orpheus TTS without vllm
[FEATURE] Add delayed start audio streaming
[FEATURE] Implement Voxtral (Speech-to-Text + Text-Translation + LLM)
[TASK] Remove debug output
[TASK] Clean up transformer whisper code
[TASK] Add flash attention 2 to transformer whisper
[TASK] Update zonos library
[TASK] Call plugin_tts_after_audio event for streamed playback
[TASK] Improve kokoro tts split_pattern
[TASK] Code cleanup
[TASK] Add support for parakeet model
[TASK] Implement BitsAndBytes for Transformer whisper again
[TASK] Add large distilled v3.5 english model to faster whisper
[TASK] Add setting to synchronize txt realtime with stt realtime active
[TASK] Update transformers dependency
[BUGFIX] Fix internal event calling
[BUGFIX] Fix canary code for new nemo library version
[BUGFIX] Fix downloading MMS models
[BUGFIX] Fix pyinstaller build spec

Full Changelog: v1.3.16.2...v1.3.17.3

v1.3.17.2-beta (CUDA 12.8)

v1.3.16.2

Important:

Running this release directly requires a lot of configuration. The recommended way is to use the UI application, https://github.com/Sharrnah/whispering-ui, which downloads this release automatically.

Standalone Release File (4.3 GB):

Download Server:

Changelog (v1.3.16.2)

[FEATURE] Add clear transcriptions function
[FEATURE] Add select_textvalue and select_completion widgets
[FEATURE] Add Zonos TTS
[FEATURE] Add Phi-4 multimodal model (For STT, Translation, OCR)
[FEATURE] Add function calling from Phi4 model
[FEATURE] Add Kokoro TTS model
[FEATURE] Add new OCR type (GOT OCR 2.0)
[FEATURE] Add chat websocket message for multimodal LLM models like Phi-4mm
[FEATURE] Add support for custom F5-TTS models
[TASK] Cache voice list for F5-TTS
[TASK] Update desktop+ documentation
[TASK] Allow streamed audio playback for TTS result
[TASK] Display Torch version at startup
[TASK] Update dependencies
[TASK] Add download_model function to downloader
[TASK] Add TTS stop call
[TASK] Add streamed playback for F5-TTS, Kokoro TTS and Zonos TTS
[TASK] Unify OCR window capture across all OCR models
[TASK] Encode image data after result sending
[TASK] Add TTS audio normalization setting
[BUGFIX] Streamed audio playback delaying last buffered audio frame
[BUGFIX] Model loading for float16-only models in other precisions
[BUGFIX] Make TTS model classes singletons
[BUGFIX] Add additional punkt dependencies in build
[BUGFIX] Fix osc message if LLM already supports transcription+translation
[BUGFIX] Error if result object has no "text"
[BUGFIX] Wav export with streamed playback enabled
[BUGFIX] Make sure only one download per file is running
[BUGFIX] Add missing dependency for Phi-4 model

Full Changelog: v1.3.15.4...v1.3.16.2

v1.3.15.4

v1.3.15.2

v1.3.15.1