
This project automates audio processing by removing silence, transcribing speech to text, and storing the output in an SQLite database. It supports multiple audio formats and leverages Google Speech Recognition for high accuracy.


Amin-moniry-pr7/Speech-to-Text-Transcription


🎤🚀 AUDIO PROCESSING & TRANSCRIPTION PROJECT 🎶🔊




📌 PROJECT OVERVIEW

🔥 AUTOMATES AUDIO PROCESSING WITH HIGH ACCURACY 🔥
✔️ REMOVES SILENCE & CLEANLY PROCESSES AUDIO FILES.
✔️ TRANSCRIBES SPEECH USING GOOGLE SPEECH RECOGNITION.
✔️ STORES OUTPUT IN AN SQLITE DATABASE.
✔️ SUPPORTS MULTIPLE AUDIO FORMATS: WAV, MP3, M4A, OGG, FLAC.


FEATURES AT A GLANCE

SILENCE REMOVAL – AUTOMATICALLY DETECTS & REMOVES SILENCE.
SPEECH-TO-TEXT – AI-POWERED TRANSCRIPTION FOR ACCURACY.
SQLITE DATABASE INTEGRATION – STORES PROCESSED FILES, DURATIONS & TRANSCRIPTIONS.
MULTIPLE AUDIO FORMATS SUPPORTED – WAV, MP3, M4A, OGG, FLAC.
MULTITHREADED PROCESSING – FASTER AUDIO HANDLING.
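
As a rough illustration of the format check and multithreaded handling listed above, here is a minimal sketch using only the Python standard library. The names `is_supported`, `process_file`, and `process_many` are hypothetical and not taken from the project; `process_file` is a placeholder for the real per-file work.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Formats listed in this README.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".ogg", ".flac"}


def is_supported(path):
    """Return True if the file extension is one of the listed formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def process_file(path):
    """Hypothetical placeholder for silence removal and transcription."""
    return f"processed {Path(path).name}"


def process_many(paths, max_workers=4):
    """Process supported files concurrently; results keep the input order."""
    supported = [p for p in paths if is_supported(p)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_file, supported))
```

Threads suit this kind of workload when most of the time is spent waiting on file I/O or a network transcription service rather than on CPU-bound work.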


🏗 PROJECT STRUCTURE

📁 AUDIO_PROCESSING_PROJECT/
│── 🎵 CONVERT_AUDIO_TO_TEXT_AND_REMOVE_SILENCE.py  # MAIN SCRIPT  
│── 📜 DATABASE_AND_PREPARE_AUDIO.py  # DATABASE OPERATIONS  
│── 🔊 REMOVE_SILENCE_AND_MEASURE.py  # SILENCE REMOVAL & DURATION MEASUREMENT  
│── 🎙 SPEECH_AND_TRANSCRIBE.py  # SPEECH-TO-TEXT PROCESSING  
│── 📜 REQUIREMENTS.TXT  # DEPENDENCIES  

INSTALLATION & USAGE

1️⃣ CLONE THE REPOSITORY

git clone https://github.com/Amin-moniry-pr7/Speech-to-Text-Transcription.git
cd Speech-to-Text-Transcription

2️⃣ INSTALL DEPENDENCIES

pip install -r requirements.txt

3️⃣ RUN THE SCRIPT

python CONVERT_AUDIO_TO_TEXT_AND_REMOVE_SILENCE.py

4️⃣ INPUT REQUIREMENTS

🔹 ENTER THE AUDIO FILE PATH (WAV, MP3, M4A, OGG, FLAC).
🔹 SPECIFY LANGUAGE CODE (E.G., EN-US).
🔹 SET MINIMUM SILENCE LENGTH & SILENCE THRESHOLD.
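
To show how the minimum silence length and silence threshold interact, here is a minimal standard-library sketch. It is not the project's implementation: it assumes 16-bit mono samples already decoded into a list of integers, and the function names, default values, and the 10 ms analysis window are all illustrative assumptions.

```python
import math


def dbfs(chunk, full_scale=32768):
    """RMS level of a chunk of 16-bit samples in dBFS (-inf for pure silence)."""
    if not chunk:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
    return 20 * math.log10(rms / full_scale) if rms > 0 else float("-inf")


def remove_silence(samples, frame_rate, min_silence_len_ms=500,
                   silence_thresh_db=-40.0, chunk_ms=10):
    """Drop quiet stretches that last at least min_silence_len_ms."""
    chunk_size = max(1, frame_rate * chunk_ms // 1000)
    min_chunks = max(1, min_silence_len_ms // chunk_ms)
    kept, quiet_run = [], []
    for start in range(0, len(samples), chunk_size):
        chunk = samples[start:start + chunk_size]
        if dbfs(chunk) < silence_thresh_db:
            quiet_run.append(chunk)  # still inside a quiet stretch
            continue
        if len(quiet_run) < min_chunks:
            kept.extend(quiet_run)   # short pause: keep it
        quiet_run = []
        kept.append(chunk)
    if len(quiet_run) < min_chunks:
        kept.extend(quiet_run)       # short trailing pause: keep it
    return [s for chunk in kept for s in chunk]


def duration_seconds(samples, frame_rate):
    """Duration of a mono sample list in seconds."""
    return len(samples) / frame_rate
```

A lower (more negative) threshold treats fewer passages as silent, and a longer minimum length preserves natural pauses between words.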


📂 GENERATED FILES

🎵 ORIGINAL AUDIO: AMIN_1.WAV
🔇 PROCESSED AUDIO (NO SILENCE): AMIN_1_NO_SILENCE.WAV
📜 TRANSCRIPTION OUTPUT: STORED IN AMIN_TEXT
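
The naming pattern shown above (original name plus a `_NO_SILENCE` suffix) could be derived as in the following sketch. The helper name `no_silence_path` is hypothetical, and the project's actual script may build the name differently.

```python
from pathlib import Path


def no_silence_path(input_path):
    """Return the input path with _NO_SILENCE inserted before the extension."""
    p = Path(input_path)
    return p.with_name(f"{p.stem}_NO_SILENCE{p.suffix}")
```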


🗃 DATABASE STRUCTURE (SQLITE - PODCAST.DB)

ID | INPUT FILE | PROCESSED FILE | LANGUAGE | ORIGINAL DURATION | PROCESSED DURATION | TRANSCRIPTION | TIMESTAMP
1 | AMIN_1.WAV | AMIN_1_NO_SILENCE.WAV | EN-US | 60s | 45s | "HELLO, THIS IS A TEST..." | 2025-02-10
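
A table with these columns could be created and filled with Python's built-in `sqlite3` module, as in the sketch below. The table name `transcriptions`, the column names and types, and the helper functions are assumptions for illustration; the project's actual schema may differ.

```python
import sqlite3

# Hypothetical schema mirroring the columns shown above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS transcriptions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    input_file TEXT NOT NULL,
    processed_file TEXT NOT NULL,
    language TEXT NOT NULL,
    original_duration REAL,
    processed_duration REAL,
    transcription TEXT,
    timestamp TEXT DEFAULT CURRENT_TIMESTAMP
)
"""


def open_db(path=":memory:"):
    """Open the database (in memory by default) and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_result(conn, input_file, processed_file, language,
                original_duration, processed_duration, transcription):
    """Insert one processed file and return the new row id."""
    cur = conn.execute(
        "INSERT INTO transcriptions (input_file, processed_file, language, "
        "original_duration, processed_duration, transcription) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (input_file, processed_file, language,
         original_duration, processed_duration, transcription),
    )
    conn.commit()
    return cur.lastrowid
```

Passing `"PODCAST.DB"` to `open_db` would write to a file on disk instead of memory. The `?` placeholders let SQLite handle quoting, so transcriptions containing apostrophes or quotes are stored safely.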

🎯 FUTURE IMPROVEMENTS

🚀 ADD A USER-FRIENDLY GRAPHICAL INTERFACE (GUI)
📡 SUPPORT REAL-TIME AUDIO STREAMING
🧠 ENHANCE AI-BASED NOISE REDUCTION


📜 LICENSE

🔖 LICENSED UNDER CREATIVE COMMONS ATTRIBUTION-NONCOMMERCIAL 4.0 INTERNATIONAL.

💡 DEVELOPED BY: AMIN MONIRY

🎤 HAPPY CODING & AUDIO PROCESSING! 🚀🎶

I HOPE THIS WILL BE USEFUL TO YOU.
