Speech Hub
Audio-Centric business management platform
Keywords: Automatic Speech Recognition (ASR), Summarizer, Keyword Extraction, Promotional Content Extraction, LLM
Brief Description
SpeechHub is a sophisticated productivity tool powered by Automatic Speech Recognition (ASR) technology, supporting both Bengali and English. Users can upload any audio conversation, recorded or live, and SpeechHub transcribes it into a conversational format. It also summarizes the conversation and detects keywords and speakers.
This application efficiently summarizes business meetings, identifies speakers, and organizes dialogues into a clear conversational format. It also pinpoints important business keywords, making information easy to find.
Key features of SpeechHub:
- Fast audio transcription in Bengali and English (approximately 50 s per hour of audio, roughly 72x faster than real time)
- Speaker Diarization and dialogue-style conversation generation from audio data
- Summarization of meetings and conversations
- Detection of mentioned keywords with frequency counts
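The keyword detection and frequency count feature can be illustrated with a minimal sketch. Since the project code is not public, this is not SpeechHub's implementation; the function name `count_keywords`, the sample transcript, and the keyword list are all hypothetical. It assumes a plain-text transcript and a fixed list of business keywords, and it matches case-insensitively on simple word boundaries, which works for English but may need a dedicated tokenizer for Bengali script.

```python
import re
from collections import Counter


def count_keywords(transcript, keywords):
    """Count case-insensitive whole-word occurrences of each keyword.

    Hypothetical helper for illustration only; not SpeechHub's code.
    """
    counts = Counter()
    for keyword in keywords:
        # Lookarounds keep "budget" from matching inside "budgets".
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        counts[keyword] = len(re.findall(pattern, transcript, flags=re.IGNORECASE))
    return counts


# Made-up transcript and keyword list, used only to show the call.
transcript = (
    "The budget review is due Friday. "
    "Budget cuts affect the Q3 roadmap; the roadmap owner is Rahim."
)
counts = count_keywords(transcript, ["budget", "roadmap", "invoice"])
for keyword, frequency in counts.most_common():
    print(f"{keyword}: {frequency}")
```

Keywords that never occur are kept with a count of zero, so a report can show which expected terms were absent from a meeting.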
Key Technologies used in SpeechHub:
- Whisper-based Automatic Speech Recognition (ASR) model for both Bangla and English audio transcription
- Integration with PyAnnote-based speaker diarization for dialogue-style conversation generation
- BERT-based dialogue summarization pipeline
- Integration of a FastAPI-based DevOps system and a SQL-based database system for seamless usage
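One common way to combine ASR output with speaker diarization into a dialogue is to give each transcribed segment the speaker whose turn overlaps it the most, then merge consecutive segments from the same speaker. The sketch below illustrates that idea only; it is not SpeechHub's actual pipeline, which is not public. The dictionary shapes (`start`, `end`, `text`, `speaker`) and the function names are assumptions chosen for the example, and no ASR or diarization model is called.

```python
from typing import Dict, List, Tuple


def overlap_seconds(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def build_dialogue(
    asr_segments: List[Dict], speaker_turns: List[Dict]
) -> List[Tuple[str, str]]:
    """Label each ASR segment with its best-overlapping speaker and merge runs.

    Hypothetical helper: input formats are assumed, not taken from SpeechHub.
    """
    dialogue: List[Tuple[str, str]] = []
    for seg in asr_segments:
        speaker = "UNKNOWN"  # used when no diarization turn overlaps the segment
        best_overlap = 0.0
        for turn in speaker_turns:
            shared = overlap_seconds(seg["start"], seg["end"], turn["start"], turn["end"])
            if shared > best_overlap:
                best_overlap = shared
                speaker = turn["speaker"]
        text = seg["text"].strip()
        if dialogue and dialogue[-1][0] == speaker:
            # Same speaker as the previous line: extend that line.
            dialogue[-1] = (speaker, dialogue[-1][1] + " " + text)
        else:
            dialogue.append((speaker, text))
    return dialogue


# Made-up segments and turns, used only to show the call.
asr_segments = [
    {"start": 0.0, "end": 2.5, "text": "Good morning everyone."},
    {"start": 2.5, "end": 5.0, "text": "Let's review the sales numbers."},
    {"start": 5.2, "end": 7.0, "text": "Sales grew ten percent."},
]
speaker_turns = [
    {"start": 0.0, "end": 5.1, "speaker": "SPEAKER_00"},
    {"start": 5.1, "end": 7.5, "speaker": "SPEAKER_01"},
]
for speaker, line in build_dialogue(asr_segments, speaker_turns):
    print(f"{speaker}: {line}")
```

Maximum-overlap assignment is simple but coarse: a segment that spans a speaker change is attributed entirely to one speaker, so word-level timestamps would be needed for finer splits.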
Language/Framework: Python 3.9, PyTorch
Simple illustration of the project
N.B.: The code for this project can’t be made public for proprietary reasons.
Collaborators:
1. A F M Mahfuzul Kabir
2. Sawradip Saha