Content-Based Video Retrieval and Video Analysis
Let's Remember !!
A content-based retrieval system processes the information contained in image data and creates an abstraction of its content in terms of visual attributes.
1- Let's Remember!
2- Problem definition and solution
3- Input and expected output
5- Shot Detection (abrupt transition algorithm and demo)
6- Key Frame Extraction (entropy algorithm)
7- Module (1): Low Level
8- Module (2): Audio
9- Module (3): Face Recognition
10- Time plan and current state
Problem definition and solution
Most videos nowadays are long, a single video may cover many categories, and the title of the video may not be descriptive enough.
Input and expected output
Prof. Dr/El-Syaed Al-Horbaty
Dr/Mohamed Abd Al-Majeed
Ahmed Saeed Ibrahim
Aya Ahmed Serry
Ibrahim Mohamed Amer
Hajar Adel Ahmed
Mostafa Fouad Mahmoud
Zainab Mohamed Fouad
Categorize and annotate a video by analyzing its content.
A shot is a sequence of frames captured in a single, continuous recording from one camera.
Module (1):Low Level Features
Key Frame Extraction
A key frame is a frame that represents the salient content and information of the shot.
Module (3):Text Recognition
Module (2):Face Recognition
Category & Logos
Category & Persons
Category & Keywords
Category & Keywords
1) Get the gray-level histogram of each frame.
2) Get the histogram difference (SD) between each pair of consecutive frames using:
3) Get the threshold value (Tb) using:
4) For each pair of consecutive frames: if (SD > Tb), it is a new transition.
New abrupt transition
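The four steps above can be sketched in Python as follows. This is only a minimal illustration: the formulas for the histogram difference and the threshold are not given in this transcript, so the sketch assumes SD is the sum of absolute bin differences and Tb = mean(SD) + alpha * std(SD). The function names and the `alpha` parameter are hypothetical.

```python
import numpy as np

def gray_histogram(frame, bins=256):
    """Step 1: gray-level histogram of one frame (2-D array of 0..255 values)."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return hist

def histogram_differences(frames, bins=256):
    """Step 2: SD between consecutive frames.
    ASSUMPTION: SD is the sum of absolute bin-wise differences."""
    hists = np.array([gray_histogram(f, bins) for f in frames], dtype=np.int64)
    return np.abs(np.diff(hists, axis=0)).sum(axis=1)

def detect_abrupt_transitions(frames, alpha=3.0):
    """Steps 3 and 4: threshold the differences and report the cuts.
    ASSUMPTION: Tb = mean(SD) + alpha * std(SD); alpha is a tuning knob."""
    sd = histogram_differences(frames)
    tb = sd.mean() + alpha * sd.std()
    # A difference at position i separates frame i from frame i + 1,
    # so the new shot starts at frame i + 1.
    cuts = [i + 1 for i, d in enumerate(sd) if d > tb]
    return cuts, tb
```

Each returned index is the first frame of a new shot. The frames are expected to be grayscale already.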
1) Shot boundary based approach :
First /Middle /Last
Key Frame 1 :
Key Frame 2 :
Key Frame 3 :
3) Visual content based approach :
What is Entropy ?
Entropy is used to measure the amount of Information.
It describes how much randomness (or uncertainty) there is in a signal or an image.
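For a frame, the usual choice is the Shannon entropy of the gray-level distribution, H = -sum(p_i * log2(p_i)), where p_i is the fraction of pixels with gray level i. A minimal sketch, assuming an 8-bit grayscale frame (the function name is hypothetical):

```python
import numpy as np

def frame_entropy(frame, bins=256):
    """Shannon entropy (in bits) of the gray-level histogram of a frame."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 256))
    p = hist / hist.sum()      # probability of each gray level
    p = p[p > 0]               # skip empty bins, since 0 * log(0) is taken as 0
    return float(-(p * np.log2(p)).sum())
```

A frame with a single gray level has zero entropy, while a frame split evenly between two gray levels has an entropy of one bit.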
Then, how do we extract the key frames?
1) Sort the entropies.
2) Get the threshold value:
3) Calculate the difference between each two consecutive frames:
If the difference > a specific value, it is a key frame.
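One possible reading of these steps is sketched below. The threshold formula is not given in this transcript, so the sketch assumes the mean of the consecutive differences as a default, and it assumes the differences are taken between neighbours in the sorted entropy list. The first frame in sorted order is always kept so that every shot has at least one key frame. All names are hypothetical.

```python
import numpy as np

def select_key_frames(entropies, threshold=None):
    """Pick key-frame indices from the per-frame entropies of one shot."""
    entropies = np.asarray(entropies, dtype=float)
    if entropies.size == 0:
        return []
    order = np.argsort(entropies)            # step 1: sort the entropies
    diffs = np.diff(entropies[order])        # step 3: consecutive differences
    if threshold is None:
        # ASSUMPTION: step 2 threshold defaults to the mean difference.
        threshold = diffs.mean() if diffs.size else 0.0
    keys = [int(order[0])]                   # ASSUMPTION: always keep one frame
    for i, d in enumerate(diffs):
        if d > threshold:
            keys.append(int(order[i + 1]))   # a jump in entropy marks a key frame
    return sorted(keys)
```

The result is a list of frame indices inside the shot, in playback order.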
Why Entropy ??
The main advantage of this algorithm is that it segments shots into key frames with high accuracy using a semantically meaningful measure, entropy.
-Two Main Topics :
1)Feature Vector Extraction
1) Texture Features :
Gabor Wavelet :
a) Mean-squared energy
b) Mean Amplitude
Why Gabor Wavelet?
- It is scale and orientation invariant.
- It gives a distortion-tolerant feature space.
- It performs well.
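The two texture features can be sketched with NumPy alone. This is an illustration under several assumptions that are not in the slides: a complex Gabor kernel with an isotropic Gaussian envelope, a wavelength tied to the scale, circular (FFT-based) filtering, and a small bank of 2 scales by 4 orientations. All names and parameter values are hypothetical.

```python
import numpy as np

def gabor_kernel(ksize, sigma, theta, wavelength):
    """Complex Gabor kernel: Gaussian envelope times a complex sinusoid."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)   # coordinate along orientation
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.exp(2j * np.pi * xr / wavelength)
    return envelope * carrier

def gabor_features(image, scales=(2.0, 4.0), orientations=4, ksize=15):
    """Return [mean-squared energy, mean amplitude] for every filter in the bank."""
    image = np.asarray(image, dtype=float)
    spectrum = np.fft.fft2(image)
    feats = []
    for sigma in scales:
        for k in range(orientations):
            theta = k * np.pi / orientations
            kern = gabor_kernel(ksize, sigma, theta, wavelength=2.0 * sigma)
            # Circular convolution through the FFT (kernel zero-padded to image size).
            response = np.abs(np.fft.ifft2(spectrum * np.fft.fft2(kern, s=image.shape)))
            feats.append((response ** 2).mean())   # a) mean-squared energy
            feats.append(response.mean())          # b) mean amplitude
    return np.array(feats)
```

With the defaults this gives a 16-element texture vector per frame (8 filters, 2 values each).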
2) Color Features :
Global Feature Extraction
Z-score (most common).
(sensitive to outliers).
Med-MAD (insensitive to outliers / low efficiency).
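The two named normalizations can be sketched as follows. Z-score uses the mean and standard deviation, while Med-MAD uses the median and the median absolute deviation, which is why a single outlier disturbs it less. The function names are hypothetical, and returning zeros for a constant vector is a choice made here for the sketch.

```python
import numpy as np

def z_score(x):
    """Normalize to zero mean and unit standard deviation."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    return (x - x.mean()) / std if std > 0 else np.zeros_like(x)

def med_mad(x):
    """Normalize with the median and the median absolute deviation (MAD)."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    return (x - med) / mad if mad > 0 else np.zeros_like(x)
```

For a feature vector such as `[1, 2, 3, 4, 100]`, the outlier 100 shifts the mean and inflates the standard deviation used by `z_score`, while the median and MAD used by `med_mad` stay at 3 and 1.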
How to measure the similarity between two Matrices/Histograms:
The diffusion distance is robust to :
2) Lighting change
Modified Hausdorff (fast and accurate)
Quadratic form distance
Euclidean distance (high efficiency)
Earth mover's distance (low efficiency / accurate results)
Histogram intersection (high efficiency)
Used For Histograms
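Two of the high-efficiency measures listed above, histogram intersection and Euclidean distance, are simple enough to sketch directly. The sketch assumes both histograms are first normalized to sum to 1 so that frames of different sizes stay comparable. The function names are hypothetical.

```python
import numpy as np

def _normalize(h):
    """Scale a histogram so its bins sum to 1 (left unchanged if it is empty)."""
    h = np.asarray(h, dtype=float)
    total = h.sum()
    return h / total if total > 0 else h

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1 for identical shapes, 0 for no overlap."""
    return float(np.minimum(_normalize(h1), _normalize(h2)).sum())

def euclidean_distance(h1, h2):
    """Distance between the normalized histograms: 0 for identical shapes."""
    return float(np.linalg.norm(_normalize(h1) - _normalize(h2)))
```

Note that intersection is a similarity (larger means closer) while the Euclidean measure is a distance (smaller means closer), so rankings have to be sorted in opposite directions.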
We use MATLAB to detect face locations within each frame, then extract the detected faces for use in the face recognition module.
Algorithm used :
- Viola-Jones Algorithm
It is a generic object detection method that can be trained to detect a variety of object classes, but it is most commonly used for face detection.
The features are computed using rectangular filters by subtracting the sum of the image pixels in the shaded region from the sum of the pixels in the white region.
A, B, C, D: Rectangular filters for feature computation
The existence of a feature is determined by comparing the value computed from the rectangular filter with the training data.
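The rectangular-filter computation can be illustrated with an integral image, which lets the sum of any rectangle be read from four table lookups. The sketch below covers only a two-rectangle feature with the white region on the left and the shaded region on the right. That layout is an assumption, as are the function names, and the trained cascade itself is not shown.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with an extra zero row and column at the top-left."""
    ii = np.cumsum(np.cumsum(np.asarray(img, dtype=np.int64), axis=0), axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four lookups."""
    bottom, right = top + height, left + width
    return int(ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left])

def two_rect_feature(ii, top, left, height, width):
    """White (left half) sum minus shaded (right half) sum.
    ASSUMPTION: a horizontal two-rectangle layout with an even width."""
    half = width // 2
    white = rect_sum(ii, top, left, height, half)
    shaded = rect_sum(ii, top, left + half, height, half)
    return white - shaded
```

A large positive value means the left half of the window is brighter than the right half. A detector compares such values with thresholds learned during training.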
Results And Practical Work
After detecting faces in the input frame, we attempt to recognize them against our database.
Currently we use a computer vision library called EMGU to recognize faces. The underlying algorithm is Eigenfaces, which is based on PCA.
The problem with this method is that recognition results get worse as the number of subjects in the database grows, and also with different illumination and face orientations.
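The idea behind Eigenfaces can be sketched in NumPy: PCA finds the main directions of variation among the training faces, every face is projected onto them, and a new face is matched to the nearest projected training face. This is not the EMGU API, only an illustration of the technique. The function names are hypothetical, and faces are assumed to be flattened grayscale images of equal size.

```python
import numpy as np

def train_eigenfaces(faces, n_components):
    """faces: array of shape (n_samples, n_pixels). Returns the PCA model."""
    X = np.asarray(faces, dtype=float)
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions (the "eigenfaces").
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:n_components]
    weights = (X - mean) @ components.T      # training faces in eigenface space
    return mean, components, weights

def recognize(face, mean, components, weights, labels):
    """Nearest neighbour in eigenface space; returns (label, distance)."""
    w = (np.asarray(face, dtype=float) - mean) @ components.T
    dists = np.linalg.norm(weights - w, axis=1)
    best = int(np.argmin(dists))
    return labels[best], float(dists[best])
```

Because matching is a plain nearest-neighbour search on pixel-based projections, changes in lighting or pose move a face far from its training samples, which is consistent with the weaknesses noted above.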
Face Recognition Result
Speech recognition is the ability of a machine or program to identify words and phrases in spoken language and convert them to a machine-readable format.
-According to the most repeated keyword.
Audio extraction from the shot.
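Categorizing a shot by its most repeated keyword can be sketched as below, assuming the speech recognizer has already produced a text transcript. The category names, the keyword lists, and the function name are hypothetical examples, not the project's actual vocabulary.

```python
from collections import Counter
import re

def categorize_by_keywords(transcript, category_keywords):
    """Return (category, keyword) for the most repeated known keyword,
    or (None, None) if no known keyword occurs in the transcript."""
    keyword_to_category = {
        kw: cat for cat, kws in category_keywords.items() for kw in kws
    }
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(w for w in words if w in keyword_to_category)
    if not counts:
        return None, None
    keyword, _ = counts.most_common(1)[0]
    return keyword_to_category[keyword], keyword

# Hypothetical keyword lists, for illustration only.
categories = {"sports": {"goal", "match"}, "news": {"election"}}
text = "The goal was great. Another goal before the match ended, then election news."
result = categorize_by_keywords(text, categories)
```

Here "goal" is the most repeated known keyword, so the shot would be placed in the "sports" category.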