Transcript of "Rescisão Direta dos Contratos de Trabalho" (Direct Termination of Employment Contracts)
Speech Analysis (Front-End Processing)
The first stage of any speech processing system. We model the speech analysis system on the human peripheral auditory system.

Speech Data Collection
Without sufficient speech for each accent, we cannot carry out the analysis and find the best models for the considered Palestinian accents.

Speech Categories
- Voiced: periodic frequency present, more energy
- Unvoiced: no periodic frequency present, less energy

Arabic Phoneme Set
A phoneme is the smallest unit in the sound system of a language; a group of phonemes generates a word.

Some Features of Palestinian Accents
- Kashkasha: /K/ becomes /CH/ (example: keef → cheef)
- Pronouncing the end of a word with /IH/ or /AH/ (example: madrasih vs. madrasah)
- /Q/ becomes /K/ (example: qalam → kalam)
- /Q/ becomes /E/ (example: qalam → alam)

Palestinian Accents Recognition
Words were collected from books & dictionaries, social websites, and linguists & volunteers.

Additional Features
- Speakers from Hebron prolong the pronunciation of some characters, adding a short pause.
- Some speakers place emphasis on certain characters.
- Speakers from Jerusalem use a Damma at the beginning and in the middle of some nouns and verbs, while other speakers use a Kasra.

In the Palestinian community, accents reflect the socio-economic status of their speakers, their ethnicity, and their caste or social class. The accents studied come from Nablus, Hebron, Al-Bireh, Qalqilya, Tulkarm, Jerusalem, and Al-Lud.

Speech Data Collection Process
Recording process: speakers read the two scenarios as naturally as possible,
then recorded free speech.

Devices and Tools Used in Recording
"Superior Notes Recording" from Philips. Recordings were saved in WAV format with a bit rate of 1411 kbps and a sample rate of 44.1 kHz.

Our Collected Speech
- 200 speakers (different ages & genders)
- 15 hours of speech
- 300 wave files

Front-End Implementation
Front-end processing converts the speech signal into a representation suitable for applications such as speech recognition and accent recognition. For feature extraction we chose Mel-Frequency Cepstral Coefficients (MFCC). MFCC speech analysis passes through several steps.

Modeling
In the modeling stage, the training data set is used to create a model of each class to be recognized. The modeling process passes through two main steps: clustering using the k-means algorithm, then Expectation-Maximization (EM).
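The MFCC extraction mentioned above follows standard steps (pre-emphasis, framing, windowing, power spectrum, mel filterbank, log, DCT). Below is a minimal numpy-only sketch of that pipeline; frame length, hop, filter count, and coefficient count are assumed illustrative values, not the project's actual settings.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mfcc(signal, sample_rate=44100, frame_len=1024, hop=512,
         n_filters=26, n_ceps=13):
    # 1. Pre-emphasis boosts the high frequencies.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # 2. Split into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    frames = np.stack([emphasized[i * hop:i * hop + frame_len]
                       for i in range(n_frames)])
    frames *= np.hamming(frame_len)
    # 3. Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, frame_len)) ** 2 / frame_len
    # 4. Triangular mel filterbank, evenly spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0), hz_to_mel(sample_rate / 2),
                             n_filters + 2)
    bins = np.floor((frame_len + 1) * mel_to_hz(mel_points)
                    / sample_rate).astype(int)
    fbank = np.zeros((n_filters, frame_len // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fbank[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[m - 1, k] = (right - k) / max(right - center, 1)
    # 5. Log filterbank energies.
    feats = np.log(power @ fbank.T + 1e-10)
    # 6. DCT-II decorrelates the log energies; keep the first n_ceps.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1)
                 / (2 * n_filters))
    return feats @ dct.T

# Example on a synthetic 1-second 440 Hz tone at 44.1 kHz:
tone = np.sin(2 * np.pi * 440 * np.arange(44100) / 44100)
feats = mfcc(tone)
print(feats.shape)  # one 13-coefficient vector per frame
```

In practice, delta and delta-delta coefficients are often appended to each frame's MFCC vector; the sketch stops at the static coefficients.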
Challenges in Speech Data Collection
@ Dealing with people from different cultures.
@ Finding a sufficient number of people for each accent takes a lot of effort.
@ Visiting the cities for the first time.
@ Finding suitable places without noise for the recording process.

Conclusion
* Improve the performance further
* Build a useful application

Planning Our Project

Outline
* Introduction
* Speech data collection
* Feature extraction
* Experiments & results
* Conclusion

Related Work
Intra-language variation due to different accents and dialects is one of the major difficulties that degrade the performance of Automatic Speech Recognition (ASR).

Project Aim
Develop and implement an automatic Palestinian accent recognition system from the speech signal.

Shahada Sammour
Fatima Sabbah
Dr. Abualseoud Hanani

In our initial system we chose the
GMM method for modeling different accents.

Implementation
Our accent recognition system consists of three state-of-the-art subsystems:
Gaussian Mixture Model - Universal Background Model ---> GMM-UBM
Gaussian Mixture Model - Support Vector Machine ---> GMM-SVM
Phone Recognition followed by Language Model ---> PRLM

GMM-UBM Accents Recognition System
We need a large amount of data to build a model for each accent. Because our data are insufficient, we create a UBM and model each speaker in terms of their difference from the UBM.

GMM-SVM Accents Recognition System
Because the acoustic signals for each speaker are not enough to train a GMM directly, we use the supervectors extracted by MAP adaptation of the GMM-UBM parameters as input for SVM learning.

Phonotactic System
The phonotactic approach is an important part of most state-of-the-art language and accent identification systems.
Every spoken language has its own set of sounds (phonemes), which exist in some relationship to each other; each phoneme helps shape the contours and boundaries of its neighbors. The system is composed of several stages, one of which is building an accent model with an SVM.

Accents Recognition Experiments
To evaluate our system, some of the collected speech files were held out and not used in training. These test files are then used to evaluate the system by extracting their feature vectors and scoring them against each accent-specific model.

GMM-UBM Acoustic System
The initial experiment included:
* MFCC extraction for the testing and training files
* Training a UBM model on all training data
* A GMM of 64 mixtures, with initial centers chosen randomly and refined by 7 iterations of the k-means algorithm
* 4 iterations of the EM algorithm
* MAP adaptation for each accent
* Finally, evaluation of each test file

In our initial experiment we did not use any technique to remove noise and silence. Different kinds of features were compared with the GMM-UBM system.
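The modeling recipe above (k-means initialisation, EM refinement of a diagonal-covariance GMM, then MAP adaptation of the means toward each accent's data) can be sketched on toy data. This is an illustration, not the project's code: the mixture count and data here are tiny, whereas the project used 64 mixtures, 7 k-means iterations, and 4 EM iterations on real MFCC features.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, k, iters=7):
    # Pick k random points as initial centers, then refine them.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = ((X[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(0)
    return centers

def log_gauss(X, means, variances):
    # Log density of each frame under each diagonal-covariance component.
    return -0.5 * (np.log(2 * np.pi * variances).sum(1)
                   + (((X[:, None, :] - means[None]) ** 2)
                      / variances[None]).sum(-1))

def responsibilities(X, weights, means, variances):
    lp = log_gauss(X, means, variances) + np.log(weights)
    lp -= lp.max(1, keepdims=True)          # numerical stability
    resp = np.exp(lp)
    return resp / resp.sum(1, keepdims=True)

def train_gmm(X, k=4, em_iters=4):
    means = kmeans(X, k)                    # k-means initialisation
    variances = np.ones_like(means) * X.var(0)
    weights = np.full(k, 1.0 / k)
    for _ in range(em_iters):
        resp = responsibilities(X, weights, means, variances)  # E-step
        nk = resp.sum(0) + 1e-10                               # M-step
        weights = nk / len(X)
        means = (resp.T @ X) / nk[:, None]
        variances = (resp.T @ X ** 2) / nk[:, None] - means ** 2 + 1e-6
    return weights, means, variances

def map_adapt_means(X, weights, means, variances, r=16.0):
    # MAP adaptation: move each UBM mean toward the accent data, in
    # proportion to how much data the mixture claims (relevance factor r).
    resp = responsibilities(X, weights, means, variances)
    nk = resp.sum(0) + 1e-10
    ex = (resp.T @ X) / nk[:, None]
    alpha = (nk / (nk + r))[:, None]
    return alpha * ex + (1 - alpha) * means

# Toy background data around the origin, one "accent" shifted to (2, 2):
ubm_data = rng.normal(0.0, 1.0, size=(500, 2))
accent_data = rng.normal(2.0, 1.0, size=(200, 2))
w, m, v = train_gmm(ubm_data)
adapted = map_adapt_means(accent_data, w, m, v)
print(adapted)  # means shifted from the UBM toward the accent data
```

At test time, each file would be scored as the log-likelihood under the adapted accent model minus the log-likelihood under the UBM.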
Confusion matrix for the acoustic GMM-UBM system: accuracy was 56%.

GMM-SVM Acoustic System
The target and foreground models were found by estimating the GMM means for each file using MAP adaptation of the UBM. After that, we collected the means into one supervector, which is used for SVM training.
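The supervector construction just described can be sketched as follows. The sqrt(weight)/sigma scaling is a common choice from the GMM-SVM literature (Campbell's GSV kernel), not necessarily the exact scaling used in this project; the dimensions and the stand-in means are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n_mix, dim = 64, 13                      # e.g. 64 mixtures of 13-dim MFCCs
weights = np.full(n_mix, 1.0 / n_mix)    # UBM mixture weights
sigmas = np.ones((n_mix, dim))           # UBM diagonal std deviations
adapted_means = rng.normal(size=(n_mix, dim))  # stand-in for MAP-adapted means

# Scale each mean by sqrt(weight)/sigma, then concatenate all mixtures
# into one fixed-length vector: a single SVM training example per file.
scaled = np.sqrt(weights)[:, None] * adapted_means / sigmas
supervector = scaled.reshape(-1)
print(supervector.shape)  # (832,) = 64 mixtures x 13 dimensions
```

The point of the supervector is that every utterance, whatever its length, becomes one fixed-length vector, which is exactly what an SVM needs.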
In this system we use the same features that we used for the UBM. Accuracy was 75%.

Confusion matrix for the acoustic GMM-SVM system.

Phonotactic System
In our phonotactic accent recognition system we used four existing phone recognizers (Czech, Hungarian, English, and Russian).
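As a toy sketch of the phonotactic idea: each recognizer turns a wave file into a phone string, and bigram statistics over that string give a fixed-length feature vector per utterance, which can then feed a classifier. The phone inventory and the utterance below are invented for illustration; real phone recognizers output their own phone sets.

```python
from collections import Counter
from itertools import product

phones = ["a", "k", "ch", "q", "s"]            # toy phone inventory
bigram_index = {bg: i for i, bg in enumerate(product(phones, repeat=2))}

def bigram_features(phone_seq):
    # Count adjacent phone pairs and normalise to relative frequencies,
    # giving a fixed-length vector over all possible bigrams.
    counts = Counter(zip(phone_seq, phone_seq[1:]))
    total = max(sum(counts.values()), 1)
    return [counts.get(bg, 0) / total for bg in bigram_index]

utterance = ["k", "a", "ch", "a", "k", "a"]    # a decoded phone string
vec = bigram_features(utterance)
print(len(vec), vec[bigram_index[("k", "a")]])  # 25 0.4
```

Each accent's characteristic phone sequences (e.g. Kashkasha producing /CH/ where other accents produce /K/) show up as shifted bigram frequencies.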
First, phones were extracted from each wave file for each accent. Then, for each accent, a bi-gram language model was built and used with an SVM.

Confusion matrix for the Czech phonotactic system: accuracy was 60%.

PRLMs accuracy.

Fusing
Accent recognition systems typically use several acoustic and phonotactic subsystems.
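Score-level fusion of such subsystems can be sketched as a weighted sum of each subsystem's per-accent scores. The accents, scores, and weights below are invented for illustration; in practice the weights are tuned on held-out data.

```python
import numpy as np

accents = ["Nablus", "Hebron", "Jerusalem"]

# Per-accent scores for one test file, one row per subsystem.
scores = np.array([
    [0.2, 0.5, 0.3],   # GMM-UBM
    [0.1, 0.7, 0.2],   # GMM-SVM
    [0.3, 0.4, 0.3],   # PRLM
])
weights = np.array([0.3, 0.5, 0.2])  # assumed, tuned on development data

# Weighted sum of subsystem scores, then pick the best accent.
fused = weights @ scores
decision = accents[int(fused.argmax())]
print(decision)  # Hebron
```

Fusion helps because the acoustic and phonotactic subsystems make different kinds of errors, so their combined score is more reliable than any single subsystem.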
To improve the performance of our system, the outputs of the subsystems are combined.

Human Recognition System
To judge whether the performance of our accent recognition system is good, we compared its results with the performance of human listeners. We built a website, uploaded the same test wave files that were used for automatic classification, and asked listeners to choose the most suitable accent for each file. We ran this test with 22 listeners, each of whom listened to approximately 15 samples.

Confusion matrix for the human experiment decisions: accuracy was 59%.

Summarized Results
This project shows that we can recognize Palestinian accents from short speech. The best performance we achieved is 80%.

Acknowledgments
First of all, we thank Allah; His kindness and mercy gave us the best support in all steps of our lives.
Thanks to our teachers and doctors, who gave us great assistance, guidance, and support at all times. Special thanks and respect to our supervisor, Dr. Abualsoud Hanani, for his great encouragement, patience, and advice at every step of this project. And we cannot forget all those whom we love and who were wonderful supporters: our great parents, our brothers and sisters, our relatives, and our friends.

If we have enough time, we intend to cover the Support Vector Machine in more depth:
* Binary classifier
* Goal of the SVM
* Cost of error
* Classification with more than two categories

Prepared by: Shahada Sammour, Fatima Sabbah
Supervised by: Dr. Abualsoud Hanani