Personalized Voice Cloning Using GANs
Personalized Voice Simulation Using GANs
TEAM MEMBERS:
Sai Akhil K - 1MS16EC087
Vignesh A - 1MS16EC130
A V Phani Koushik - 1MS12EC024
GUIDE:
Dr. S. SETHU SELVI
HOD
Dept. of E&C
RIT, Bangalore
Data sets used
100 Hours of Celebrity voices.
100 hours of clean audio with transcription.
TESS (Toronto Emotional Speech Set): 2 speakers, 200 samples each
VCTK: 109 native English speakers; each speaker reads out about 400 sentences
Flow Chart
Synthesizer:
Uses the embedding vectors obtained from the Encoder to generate a new spectrogram for new text
Vocoder:
Converts the spectrogram into the TTS output audio
Encoder:
"Encodes" Speaker's voice into vector embeddings
1. The Target Speaker's voice is given as input to the Speaker Encoder
2. The text to be converted to voice is given as input to the Encoder of the Synthesizer
3. The Vocoder outputs the given sentence spoken in the Target Speaker's voice
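The three steps above can be sketched as a simple data flow. This is only an illustrative sketch: the function names, the embedding size, the number of spectrogram bands, and all of the arithmetic inside each stage are hypothetical placeholders, not the trained networks used in the project. It shows only what each stage takes in and hands on.

```python
import math

EMBED_DIM = 8   # hypothetical embedding size, chosen only for this sketch
N_BANDS = 4     # hypothetical number of spectrogram bands


def speaker_encoder(waveform):
    """Toy stand-in: fold a waveform into a fixed-size, L2-normalized vector."""
    sums = [0.0] * EMBED_DIM
    for i, sample in enumerate(waveform):
        sums[i % EMBED_DIM] += sample
    norm = math.sqrt(sum(v * v for v in sums)) or 1.0
    return [v / norm for v in sums]


def synthesizer(text, embedding):
    """Toy stand-in: one spectrogram frame per character, shifted by the embedding."""
    frames = []
    for ch in text:
        base = (ord(ch) % 32) / 32.0
        frames.append([base + embedding[b % EMBED_DIM] for b in range(N_BANDS)])
    return frames


def vocoder(spectrogram, samples_per_frame=16):
    """Toy stand-in: turn each frame into a short burst of audio samples."""
    waveform = []
    for frame in spectrogram:
        energy = sum(frame) / len(frame)
        for n in range(samples_per_frame):
            waveform.append(energy * math.sin(2 * math.pi * n / samples_per_frame))
    return waveform


def clone_voice(reference_waveform, text):
    embedding = speaker_encoder(reference_waveform)   # step 1
    spectrogram = synthesizer(text, embedding)        # step 2
    return vocoder(spectrogram)                       # step 3


# Synthetic reference "voice" (a sine wave), used only to exercise the pipeline.
reference = [math.sin(0.05 * n) for n in range(1600)]
audio = clone_voice(reference, "hello world")
print(len(audio))  # 11 characters x 16 samples per frame = 176
```

The point of the split is that the speaker embedding is computed once from reference audio and then reused for any new text, so the Synthesizer and Vocoder never need to see the Target Speaker's recording directly.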
Architecture:
Functionality:
Training:
Architecture:
Flow:
Training:
Training:
Introduction
Generator loss after 140 epochs
Average discriminator loss for every 10 epochs
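A per-10-epoch average like the one named above can be computed from a list of per-epoch loss values. The sketch below assumes one loss value is logged per epoch; the helper name `block_averages` is hypothetical, and the loss values are made-up placeholders, not the project's results.

```python
def block_averages(losses, block=10):
    """Average consecutive groups of `block` values; the last group may be shorter."""
    averages = []
    for start in range(0, len(losses), block):
        chunk = losses[start:start + block]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Placeholder per-epoch discriminator losses for 140 epochs (synthetic data).
dis_losses = [1.0 / (epoch + 1) for epoch in range(140)]
averages = block_averages(dis_losses)
print(len(averages))  # 140 epochs / 10 per block = 14 averages
```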
Pretrained Outputs can be found here:
https://my-voice-8a71f.web.app/