Custom API endpoints for generating text-to-speech and speech-to-speech translation audio files, ready for deployment.
SeamlessM4T-v2 is a collection of models designed to provide high-quality translation, allowing people from different linguistic communities to communicate effortlessly through speech and text.
Begin by cloning the repository onto your host machine:
git clone https://github.com/as9219/SeamlessM4Tv2-API.git
Open your preferred IDE and make sure you have the latest version of Python installed. Next, install all the requirements so the project can run locally:
pip install -r requirements.txt
For containerization, download the latest version of Docker and install it on your machine.
Make sure Docker is running. You can check this with either of the following commands:
ps -ef | grep docker
docker info
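The same check can be scripted. Below is a minimal Python sketch, assuming the docker CLI is on your PATH. The helper name `docker_is_running` is hypothetical and not part of this repository. It relies on `docker info` returning a non-zero exit status when the daemon cannot be reached.

```python
import shutil
import subprocess


def docker_is_running(timeout: float = 10.0) -> bool:
    """Return True if the docker CLI is installed and the daemon responds.

    Hypothetical helper for illustration; not part of this repository.
    """
    # No docker executable on PATH: Docker is not installed or not visible.
    if shutil.which("docker") is None:
        return False
    try:
        # `docker info` exits with a non-zero status if the daemon is unreachable.
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("Docker is running" if docker_is_running() else "Docker is not running")
```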
Now, we are ready to build our first image version! Open a terminal window in the project directory and use the following command:
docker build -t seamlessapi:v1.0 .
Docker will now build the source code into an image. This process takes roughly 12 minutes, depending on your machine. Note that Docker requires image names to be lowercase.
Once the image is built, we can proceed to creating and starting the container using:
docker run [OPTIONS] -p 8080:8080 --name seamlessapi_container seamlessapi:v1.0
The following options are available when running the container:
[OPTIONS] | About |
---|---|
-d | Runs in detached mode, will not display any logs in terminal |
--privileged | Runs container with heightened privileges |
--gpus all | Use if you have GPUs available in your host machine for the container to use |
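If you start the container from a script, the options above can be assembled programmatically. This is a small sketch only: the function `build_run_command` and its parameter names are hypothetical, and the default image tag is written in lowercase because Docker rejects uppercase letters in image names.

```python
import shlex


def build_run_command(
    image: str = "seamlessapi:v1.0",  # assumed tag; must match the one used at build time
    name: str = "seamlessapi_container",
    port: int = 8080,
    detached: bool = False,
    privileged: bool = False,
    gpus: bool = False,
) -> list[str]:
    """Assemble a `docker run` argument list from the options in the table."""
    cmd = ["docker", "run"]
    if detached:
        cmd.append("-d")  # run in the background, no logs in the terminal
    if privileged:
        cmd.append("--privileged")  # heightened privileges
    if gpus:
        cmd += ["--gpus", "all"]  # expose all host GPUs to the container
    cmd += ["-p", f"{port}:{port}", "--name", name, image]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(build_run_command(detached=True, gpus=True)))
```

The returned list can be passed directly to `subprocess.run` once you are happy with the printed command.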
Docker creates the container, and our endpoints are now ready for querying!
The following endpoints are available:
- T2S: text-to-speech, served at /t2s
- S2S: speech-to-speech, served at /s2s
curl -X POST -H "Content-Type: application/json" -d '{"text": "Hello World, I am making a text to speech curl command!", "src_lang": "eng", "tgt_lang": "fra"}' http://localhost:8080/t2s --output path/to/outputdir/output.wav
curl -X POST -H "Content-Type: multipart/form-data" -F "file=@path/to/audio_file.wav" -F "tgt_lang=eng" http://localhost:8080/s2s --output path/to/outputdir/output.wav
Invoke-WebRequest -Uri "http://localhost:8080/t2s" -Method Post `
-ContentType "application/json" `
-Body '{"text": "Hello World, I am making a text to speech cli command!", "src_lang": "eng", "tgt_lang": "fra"}' `
-OutFile "path\to\outputdir\output.wav"
Invoke-WebRequest -Uri "http://localhost:8080/s2s" -Method Post `
-Form @{
file = Get-Item 'path\to\audio_file.wav'
tgt_lang = 'eng'
} `
-OutFile "path\to\outputdir\output.wav"
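The S2S request can also be sent from Python. The sketch below uses only the standard library and builds the multipart body by hand. The field names `file` and `tgt_lang` and the URL come from the examples above; the function names are hypothetical, and the response is assumed to be raw audio bytes, as the output-file usage in the examples suggests.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path

S2S_URL = "http://localhost:8080/s2s"  # host and port taken from the examples above


def build_s2s_request(audio_path: str, tgt_lang: str, url: str = S2S_URL) -> urllib.request.Request:
    """Build a multipart/form-data POST carrying the `file` and `tgt_lang` fields."""
    path = Path(audio_path)
    boundary = uuid.uuid4().hex  # random separator between form parts
    file_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"

    lang_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="tgt_lang"\r\n\r\n'
        f"{tgt_lang}\r\n"
    ).encode("utf-8")
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n"
    ).encode("utf-8")
    closing = f"\r\n--{boundary}--\r\n".encode("utf-8")

    body = lang_part + file_header + path.read_bytes() + closing
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def speech_to_speech(audio_path: str, tgt_lang: str, output_path: str) -> None:
    """Send the request and save the response body (assumed to be audio bytes)."""
    with urllib.request.urlopen(build_s2s_request(audio_path, tgt_lang)) as response:
        Path(output_path).write_bytes(response.read())
```

Splitting request construction from sending makes it possible to inspect the body without a running container. With the container up, a call such as `speech_to_speech("path/to/audio_file.wav", "eng", "path/to/outputdir/output.wav")` mirrors the curl command.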
Name | Importance | Description |
---|---|---|
text | required | Any text you would like to convert |
src_lang | optional | default: eng |
tgt_lang | optional | default: fra |
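The parameters above map directly onto a JSON body. Here is a minimal Python sketch of a T2S client using only the standard library. The function names are hypothetical, and the sketch assumes that leaving out `src_lang` or `tgt_lang` lets the server fall back to the defaults listed in the table, and that the response body is the generated audio.

```python
import json
import urllib.request
from pathlib import Path
from typing import Optional

T2S_URL = "http://localhost:8080/t2s"  # host and port taken from the examples above


def build_t2s_request(
    text: str,
    src_lang: Optional[str] = None,
    tgt_lang: Optional[str] = None,
    url: str = T2S_URL,
) -> urllib.request.Request:
    """Build the JSON POST request for the T2S endpoint."""
    if not text:
        raise ValueError("`text` is required")
    payload = {"text": text}
    # Optional fields are only sent when given, so the server-side
    # defaults (assumed: eng and fra, per the table) apply otherwise.
    if src_lang is not None:
        payload["src_lang"] = src_lang
    if tgt_lang is not None:
        payload["tgt_lang"] = tgt_lang
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def text_to_speech(text: str, output_path: str,
                   src_lang: Optional[str] = None,
                   tgt_lang: Optional[str] = None) -> None:
    """Send the request and save the response body (assumed to be audio bytes)."""
    request = build_t2s_request(text, src_lang, tgt_lang)
    with urllib.request.urlopen(request) as response:
        Path(output_path).write_bytes(response.read())
```

With the container running, `text_to_speech("Hello World", "output.wav", tgt_lang="fra")` corresponds to the curl example for /t2s.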