Copy of Copy of Final Year Project: Voice Recognition system using MATLAB

by MANOJ S NAIR on 24 February 2014


Submitted By :-
Deepika Bansal (RIT2010047)
Sudhanshu Pratap Singh (RIT2010066)
Vratika Ghatiya (RIT2010078)



Speaker Identification And Context Based Search
INTRODUCTION
PROPOSED APPROACH
by :
Yusherizan Marshella Binti Yusoh
(EE084076)
LITERATURE SURVEY
Reference
Problem Definition
SPEAKER IDENTIFICATION &
VERIFICATION
Training
Testing
Result And Analysis
Deliverable
Testing and training on database
Analyze accuracy of gender identification and speech recognition system for the collected database.
Mel Frequency Cepstrum Coefficient
Pitch [2]
Mel Frequency Cepstrum Coefficient [1]
Principal Component Analysis [3]
Vector Quantization [1]
K-Means Algorithm
Euclidean Distance
Support Vector Machine (SVM) [5]
Multiclass SVM [5]
Dataset
Guided By :-
Prof. R. C. Tripathi

OVERVIEW
Problem Definition
Literature Survey
Proposed Approach
Result And Analysis
References
[1] Mahdi Shaneh and Azizollah Taheri, "Voice Command Recognition System Based on MFCC and VQ Algorithms", World Academy of Science, Engineering and Technology 33, 2009. Page: 3.

[2] Arundhati S. Mehendale and M. R. Dixit, "Speaker Identification", Signals and Image Processing: An International Journal (SIPIJ), Vol. 2, No. 2, June 2011. Pages: 3, 4, 5.

[3] J. S. Chitode and Anuradha S. Nigade, "Throat Microphone Signals for Isolated Word Recognition Using LPC", International Journal of Advanced Research in Computer Science and Software Engineering, Volume 2, Issue 8, August 2012. ISSN: 2277 128X. Pages: 4, 5.

[4] Lu-Shih Alex Low, et al., "Content Based Clinical Depression Detection in Adolescents", 17th EUSIPCO 2009, Scotland, Aug. 24-28, 2009.

[5] C. Thanawattano and S. Tan-a-ram, "Cardiac arrhythmia detection based on signal variation characteristic", BMEI2008, Hainan, China, 2008. Pages: 2, 4, 5, 6.

Speaker Recognition
The task of speaker recognition is to determine the identity of a speaker by machine. Speaker recognition is divided into two parts:
Identification
In speaker identification, the speaker is identified from his or her voice.
Verification
In speaker verification, the speaker is verified against the database.
Gender Identification
The project also determines the gender of the speaker.
It extracts features from the speech signal to create a fingerprint of each sound file.
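The feature extraction step can be sketched as follows. This is a minimal NumPy illustration, not the project's MATLAB code: it assumes the 8000 Hz sampling rate, 256-sample frames, 2/3 overlap and 20 Mel filters quoted elsewhere in these slides, while the Hamming window, the 12 retained coefficients and all function names are assumptions made for the example.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale approximation
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters spaced evenly on the mel scale.
    Returns an array of shape (n_filters, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / fs).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, fs=8000, frame_size=256, overlap=2 / 3, n_filters=20, n_coeffs=12):
    """Return one row of cepstral coefficients per frame."""
    hop = max(1, int(round(frame_size * (1.0 - overlap))))
    n_frames = 1 + (len(signal) - frame_size) // hop
    # Index matrix: row r selects samples r*hop .. r*hop + frame_size - 1
    idx = np.arange(frame_size)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_size)
    power = np.abs(np.fft.rfft(frames, n=frame_size, axis=1)) ** 2 / frame_size
    fbank = mel_filterbank(n_filters, frame_size, fs)
    log_energy = np.log(power @ fbank.T + 1e-10)  # small offset avoids log(0)
    # DCT-II of the log filterbank energies, skipping the 0th coefficient
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(1, n_coeffs + 1), n + 0.5) / n_filters)
    return log_energy @ basis.T
```

Calling `mfcc(x)` on a one-dimensional NumPy array `x` of at least 256 samples gives a matrix with one feature vector per frame, which is the "fingerprint" that the later steps operate on.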

Training
Take the input speech.
Create a fingerprint of the audio file using the MFCC feature extraction technique.
Apply PCA to the MFCC features to extract their most significant components.
Divide the dataset into two parts: 80% training and 20% testing.
Train the SVM on the training set.
For the Euclidean distance method, cluster the features to form a codebook for each speaker using vector quantization.

Testing
Take the input speech.
Create a fingerprint of the audio file using the MFCC feature extraction technique.
Apply PCA to the MFCC features to extract their most significant components.
Identify the unknown speaker using SVM classification.
Identify the unknown speaker using the Euclidean distance: measure the distortion distance between the speaker's feature vectors and each codebook. The codebook with the minimum distance identifies the unknown speaker.
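The PCA and codebook-matching steps above can be sketched roughly as below. This is an illustrative NumPy version under stated assumptions: the codebook size of 16, the fixed iteration count and every function name are hypothetical, and the SVM branch is left out because the slides do not describe its kernel or parameters.

```python
import numpy as np

def pca_fit(features, n_components):
    """Return the mean and the top principal directions of the feature rows."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_apply(features, mean, components):
    """Project feature rows onto the retained principal directions."""
    return (features - mean) @ components.T

def train_codebook(features, n_codewords=16, n_iter=20, seed=0):
    """Vector quantization via plain k-means: one codebook per speaker."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(features), n_codewords, replace=False)
    codebook = features[start].astype(float)
    for _ in range(n_iter):
        # Euclidean distance from every feature vector to every codeword
        d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
        nearest = d.argmin(axis=1)
        for k in range(n_codewords):
            members = features[nearest == k]
            if len(members):               # leave empty clusters unchanged
                codebook[k] = members.mean(axis=0)
    return codebook

def distortion(features, codebook):
    """Average distance from each feature vector to its nearest codeword."""
    d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return d.min(axis=1).mean()

def identify(features, codebooks):
    """Return the speaker whose codebook gives the minimum distortion."""
    return min(codebooks, key=lambda name: distortion(features, codebooks[name]))
```

Here `codebooks` is assumed to be a dictionary mapping a speaker label to the codebook trained on that speaker's features, so `identify(test_features, codebooks)` returns the label of the closest codebook.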

Content Based Search
GENDER RECOGNITION
Possible Extension
Content-based similarity match of a new audio file against the database.
Speaker Recognition
Gender Detection
Gender Detection
Average male pitch = 238.6145

Average female pitch = 253.2257

Total users = 76
Male classified correctly = 34/45
Female classified correctly = 34/45
Total correctly classified = 56/76
Accuracy = 73.68 %
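Gender detection here rests on the pitch feature [2]. The slides do not say how pitch was estimated, so the sketch below uses a simple autocorrelation estimator as one plausible approach. The search range of 50 to 400 Hz and the function name are assumptions. The last lines reproduce the reported accuracy from the 56/76 count.

```python
import numpy as np

def estimate_pitch(frame, fs=8000, fmin=50.0, fmax=400.0):
    """Estimate the fundamental frequency of one frame by autocorrelation."""
    frame = frame - frame.mean()
    # Keep non-negative lags only
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), len(ac) - 1)
    # The strongest peak inside the allowed lag range marks the pitch period
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return fs / lag

# Accuracy from the reported counts: 56 of 76 speakers classified correctly
accuracy = 100.0 * 56 / 76
print(f"Accuracy = {accuracy:.2f} %")
```

A classifier would then compare the estimated pitch of a recording against a decision threshold learned from the training speakers; that threshold is not given in the slides, so it is left out here.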

Speaker Identification
1) Varying Frame size


2) Varying Frame overlap


3) Varying number of Mel filters
Result Cont...
Number of Mel filters = 20
Frame overlap = 2/3

Analysis cont...
N = 128

High time resolution.
Each frame lasts a very short period of time.
Poor frequency resolution.

N = 256
A compromise between time resolution and frequency resolution.

N = 512
Excellent frequency resolution.
Fewer frames, so the time resolution is strongly reduced.
The number of frames is relatively small, which reduces computing time.
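The trade-off can be made concrete with a little arithmetic. The snippet below assumes the 8000 Hz sampling rate quoted for the signal plot. A frame of N samples lasts N/fs seconds, and its FFT bins are spaced fs/N Hz apart. The function name is made up for the example.

```python
def frame_tradeoff(n, fs=8000):
    """Return (frame duration in ms, FFT bin spacing in Hz) for frame size n."""
    duration_ms = 1000.0 * n / fs
    bin_spacing_hz = fs / n
    return duration_ms, bin_spacing_hz

for n in (128, 256, 512):
    duration_ms, bin_hz = frame_tradeoff(n)
    print(f"N={n}: {duration_ms:.1f} ms per frame, {bin_hz:.3f} Hz per FFT bin")
# N=128: 16.0 ms per frame, 62.500 Hz per FFT bin
# N=256: 32.0 ms per frame, 31.250 Hz per FFT bin
# N=512: 64.0 ms per frame, 15.625 Hz per FFT bin
```

Doubling N halves the bin spacing (finer frequency detail) while doubling the time span each frame averages over, which is why N = 256 sits in the middle.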

Result Cont...
Frame size = 256 samples per frame
Number of Mel filters = 20


Result Cont...
Frame size = 256 samples per frame
Frame overlap = 2/3


Signal Processing
Signal Plot at 8000 Hz, 16-bit
Power Plot Of Frames
Power Spectrum After Mel Cepstrum
Vector Quantization
Continue..
Sphinx

Lucene