TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS, Stable Audio, Mars5, F5-TTS, ParlerTTS)
Updated Apr 25, 2025 - TypeScript
(Windows/Linux/macOS) Local WebUI with neural network models (Text, Image, Video, 3D, Audio) in Python (Gradio interface). Translated into 3 languages
ONNX-compatible Fast SeamlessM4T—Massively Multilingual & Multimodal Machine Translation
SeamlessM4t-Translator: Uses Facebook's SeamlessM4T model in the backend to provide translation, including S2ST (speech-to-speech), S2TT (speech-to-text), T2ST (text-to-speech), and T2TT (text-to-text) queries.
Docker image for TTS Generation ALL IN ONE
Turn any LLM into Jarvis
EchoSight is a tool that helps visually impaired individuals by audibly describing images taken with a Raspberry Pi Camera or inputted via image path or URL across different operating systems.
Dockerized, deployable API endpoints for the SeamlessM4Tv2 model that generate text-to-speech and speech-to-speech translation audio files.
How I used SeamlessM4T large to get to the top 5 of the Mozilla Common Voice competition hosted on Zindi
Automatic speech recognition (ASR)
Translation from one language to another without an intermediate speech step
Tafsiri is an AI-powered speech-to-speech translation app that uses SeamlessM4T and Edge TTS models. It is built using FastAPI, React (+ Vite), Nginx, and Tailwind CSS
Just run it as is. Note: after installing the package, remember to restart the kernel