Transcript of TPT17
TPT17 Multimedia Indexing and Retrieval
Sinusoidal Peak Estimation
Our task was to detect questions in an audio track based on prosody.
We came to understand pitch as a feature and its importance.
In our case, pitch is the only feature we looked at.
We successfully generated a pitch representation.
We did not finish the project on time.
Main feature: pitch
Window size: 30 ms, with an overlap of 10 ms
Vector size: 15 windows
Consecutive windows advance by 30 - 10 = 20 ms, so each training example corresponds to roughly (30-10) ms * 15 = 300 ms of data
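The windowing arithmetic above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `frame_signal` and `group_into_vectors` are hypothetical, and the sketch assumes the audio is already available as a plain list of samples.

```python
def frame_signal(samples, sample_rate, window_ms=30, overlap_ms=10):
    """Split samples into 30 ms windows that overlap by 10 ms (20 ms hop)."""
    window = int(sample_rate * window_ms / 1000)
    hop = int(sample_rate * (window_ms - overlap_ms) / 1000)
    frames = []
    start = 0
    # Keep only complete windows; a trailing partial window is dropped.
    while start + window <= len(samples):
        frames.append(samples[start:start + window])
        start += hop
    return frames


def group_into_vectors(values, vector_size=15):
    """Group per-window values (e.g. pitch) into non-overlapping vectors."""
    last_start = len(values) - vector_size
    return [values[i:i + vector_size]
            for i in range(0, last_start + 1, vector_size)]
```

With a 20 ms hop, 15 consecutive windows start within a 300 ms stretch of audio, which matches the figure above. Whether the vectors themselves should overlap is a design choice the slides do not state; this sketch assumes they do not.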
Questions and non-questions
* A supervised approach
Gaussian Mixture Model
The GMM presents itself as a possible solution, since there is no single precise parameter that reliably identifies a question.
properties beyond those described by the individual segments, which include
Pitch is an auditory sensation in which a listener assigns musical tones to relative positions on a musical scale based primarily on their perception of the frequency of vibration.
Pitch is closely related to frequency, but the two are not the same:
Pitch is the perception of a sound and can be perceived subjectively
Frequency is an objective, scientific attribute that can be measured
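Because frequency is the measurable quantity, pitch is usually approximated in practice by estimating the fundamental frequency of each window. The sketch below uses plain autocorrelation for this, which is an assumption made for illustration; the slides do not say which estimator the project used. The name `estimate_f0` and the 50 to 500 Hz search range are likewise hypothetical.

```python
def estimate_f0(frame, sample_rate, fmin=50.0, fmax=500.0):
    """Estimate fundamental frequency (Hz) of one window by autocorrelation.

    Returns 0.0 when no positive correlation peak is found (e.g. silence).
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        score = sum(frame[i] * frame[i + lag]
                    for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0
```

Applying this to every 30 ms window gives one value per window, i.e. a pitch contour over time.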
However, the number of components would be critical, since question intonation is not an easy-to-track property.
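One way to picture the supervised GMM approach is to fit one mixture on question vectors and one on non-question vectors, then label a new vector by whichever model gives it the higher likelihood. The sketch below is a simplified stand-in under stated assumptions: diagonal covariances, a fixed number of EM iterations, and hypothetical names (`fit_gmm`, `log_likelihood`, `is_question`). It is not the project's implementation.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))


def log_sum_exp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def fit_gmm(data, n_components, n_iter=50, seed=0, var_floor=1e-3):
    """Fit a diagonal-covariance GMM with EM; returns (weights, means, vars)."""
    rng = random.Random(seed)
    dim = len(data[0])
    means = [list(x) for x in rng.sample(data, n_components)]
    variances = [[1.0] * dim for _ in range(n_components)]
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each vector.
        resp = []
        for x in data:
            logs = [math.log(w) + log_gauss_diag(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            norm = log_sum_exp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means, and variances.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) + 1e-10  # avoid division by zero
            weights[k] = nk / len(data)
            means[k] = [sum(r[k] * x[d] for r, x in zip(resp, data)) / nk
                        for d in range(dim)]
            variances[k] = [
                max(var_floor,
                    sum(r[k] * (x[d] - means[k][d]) ** 2
                        for r, x in zip(resp, data)) / nk)
                for d in range(dim)]
    return weights, means, variances


def log_likelihood(x, model):
    weights, means, variances = model
    return log_sum_exp([math.log(w) + log_gauss_diag(x, m, v)
                        for w, m, v in zip(weights, means, variances)])


def is_question(x, question_model, other_model):
    """Label x as a question if the question model explains it better."""
    return log_likelihood(x, question_model) > log_likelihood(x, other_model)
```

Here `n_components` is the critical setting mentioned above: too few components cannot capture the variety of question contours, while too many tend to overfit a small training set.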