Large-Scale Multimedia Exploration with Adaptive Similarity Measures

presentation at Klagenfurt University
by

Christian Beecks

on 12 April 2013

Transcript of Large-Scale Multimedia Exploration with Adaptive Similarity Measures

Large-Scale Multimedia Exploration with Adaptive Similarity Measures
Christian Beecks
Data Management and Data Exploration Group
RWTH Aachen University
Germany

10th January 2012

Idea:
generate a large visual vocabulary comprising visual words
represent each image by a high-dimensional vector of visual word frequencies
compare these vectors by cosine similarity
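
The three steps above can be sketched in a few lines. This is a minimal illustration, not the implementation used in the talk: the tiny 2-D vocabulary and descriptors are made-up toy data (real systems use e.g. SIFT descriptors and a vocabulary learned by k-means), and the function names `nearest_word`, `bovw_vector` and `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter

def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda k: math.dist(descriptor, vocabulary[k]))

def bovw_vector(descriptors, vocabulary):
    """Represent an image as a vector of visual word frequencies."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    return [counts.get(k, 0) for k in range(len(vocabulary))]

def cosine_similarity(u, v):
    """Cosine of the angle between two frequency vectors (0.0 if one is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Toy data (assumption): a 3-word vocabulary and two images with 3 descriptors each.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
image_a = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.9)]
image_b = [(0.0, 0.2), (1.1, 1.0), (4.8, 5.1)]

vec_a = bovw_vector(image_a, vocabulary)
vec_b = bovw_vector(image_b, vocabulary)
similarity = cosine_similarity(vec_a, vec_b)
```

Every image is described with the same global vocabulary, so all vectors share one dimensionality and can be compared directly.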

Many extensions have been proposed:
Hamming Embedding
Asymmetric Hamming Embedding
Compressed Fisher vectors
Vector of locally aggregated descriptors

Bag-of-visual-words Model

Search for similar images based on their contents

Generation of image representation models:
detection of interesting points/regions
extraction of local feature descriptors
compact representation of the descriptors

Prominent classes of image representation models:
vector-based (bag-of-visual-words model)
set-based (feature signature similarity model)

Content-based Image Retrieval

Idea:
use object-specific visual vocabularies
represent each image by an individual bag of clustered feature descriptors
compare these bags (feature signatures) by adaptive distances

Feature Signature Similarity Model

[Figure labels: dense sampling; SIFT descriptors; feature histogram. Steps shown: interesting point detection (here: dense sampling); feature descriptor extraction; generation of a large visual vocabulary by k-means clustering; representation of each image by using the visual vocabulary]

Properties of the feature signature similarity model:
no need for a static visual vocabulary!
each multimedia object generates its own local vocabulary
local adaptation to quality requirements/restrictions
achieves a good balance between expressiveness and efficiency

[Figure labels: interesting point detection (here: dense sampling); feature descriptor extraction; clustering]

H. Jegou, M. Douze, C. Schmid, P. Pérez: Aggregating local descriptors into a compact image representation. CVPR 2010: 3304-3311
F. Perronnin, Y. Liu, J. Sánchez, H. Poirier: Large-scale image retrieval with compressed Fisher vectors. CVPR 2010: 3384-3391
H. Jegou, M. Douze, C. Schmid: Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search. ECCV (1) 2008: 304-317
M. Jain, H. Jegou, P. Gros: Asymmetric hamming embedding: taking the best of our bits for large scale image search. ACM Multimedia 2011: 1441-1444
J. Sivic, A. Zisserman: Video Google: A Text Retrieval Approach to Object Matching in Videos. ICCV 2003: 1470-1477
C. Beecks, M. S. Uysal, T. Seidl: A comparative study of similarity measures for content-based multimedia retrieval. ICME 2010: 1552-1557
The image is taken from the MIR Flickr database: M. J. Huiskes, M. S. Lew: The MIR flickr retrieval evaluation. Multimedia Information Retrieval 2008: 39-43

Geographic and Personal Information

Aachen (Aix-la-Chapelle)
Germany's most westerly city
population: 258,050
close to the Dutch and Belgian border (approx. 6km)

some impressions:
(images are taken from Google image search and http://de.wikipedia.org/wiki/Aachen; map source: Google Maps)

RWTH Aachen University
established 10 October 1870
currently more than 32,000 students (of which 6,213 are new enrolments)
465 professors
4,274 academic staff (research assistants, etc.)
2,364 non-academic staff
(figures are taken from Department 6.0 – Planning, Development and Controlling Office, Division 6.4, RWTH Aachen University, May 2011)

Department of Computer Science
More than 30 research groups:

Data Management and Data Exploration Group
Some facts:
head of the group: Prof. Dr. Thomas Seidl
established 2002
currently around 13 researchers

Research areas:
Data Analysis and Knowledge Extraction
Exploration of Multimedia Databases
Fast Access to Spatial-Temporal Data

Who Am I?
Dipl.-Inform. Christian Beecks
Study: Computer Science at RWTH Aachen University from 2001-2007 (Thesis: Relevance Feedback for EMD-based Similarity Search)
Currently: PhD student since 2007
http://dme.rwth-aachen.de/en/team/beecks
http://dme.rwth-aachen.de/

Research interests include:
Efficient Content-Based Multimedia Retrieval and Exploration
Adaptive Similarity Measures
Signature Quadratic Form Distance

A similarity model
formalizes the notion of similarity
consists of a feature representation
consists of a distance/scoring/similarity function as (dis)similarity measure

Possible application areas
retrieval: image, video, audio, music, text search
content analysis: copy, duplicate, near-duplicate detection
mining: classification, clustering, outlier detection
exploration: browsing and navigating through large databases

(Distance-based) Similarity Model

How similar are two images?

Earth Mover's Distance (transformation-based)
Perceptually Modified Hausdorff Distance (matching-based)
Signature Quadratic Form Distance (correlation-based)

Some Distances for Feature Signatures

C. Beecks, M. S. Uysal, T. Seidl: Signature Quadratic Form Distance. CIVR 2010: 438-445
Y. Rubner, C. Tomasi, L. J. Guibas: The Earth Mover's Distance as a Metric for Image Retrieval. International Journal of Computer Vision 40(2): 99-121 (2000)
B. G. Park, K. M. Lee, S. U. Lee: Color-Based Image Retrieval Using Perceptually Modified Hausdorff Distance. EURASIP J. Image and Video Processing (2008)

Feature Signature

A feature signature adjusts to a multimedia object by
representing the object by a set of (local) feature descriptors
clustering these feature descriptors
storing the cluster centroids
storing the weights
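
The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: it uses a plain k-means with a deterministic "first k points" initialization, takes the relative cluster size as the weight, and runs on made-up 2-D descriptors. The names `kmeans` and `feature_signature` are hypothetical.

```python
import math

def closest(point, centroids):
    """Index of the centroid nearest to the point."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))

def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as initial centroids (assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        assignment = [closest(p, centroids) for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster runs empty
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    # Final assignment so that it matches the returned centroids.
    assignment = [closest(p, centroids) for p in points]
    return centroids, assignment

def feature_signature(descriptors, k):
    """Return a list of (centroid, weight) pairs; weight = relative cluster size."""
    centroids, assignment = kmeans(descriptors, k)
    n = len(descriptors)
    signature = []
    for j, centroid in enumerate(centroids):
        size = assignment.count(j)
        if size > 0:
            signature.append((centroid, size / n))
    return signature

# Toy local descriptors of one object (assumption: 2-D instead of real features).
descriptors = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0), (1.0, 0.0)]
signature = feature_signature(descriptors, k=2)
```

Because clustering runs per object, two objects may end up with signatures of different length and with unrelated centroids, which is why a distance that can compare such signatures is needed.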

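Given an object $o$, its feature signature is defined as (notation as in Beecks, Uysal, Seidl, CIVR 2010):

$S^o = \{ (c_i^o, w_i^o) \mid i = 1, \dots, n \}$ with centroids $c_i^o$ and weights $w_i^o$

Apply the cross-dimension concept of the Quadratic Form Distance to compare feature signatures of different size and structure

Given two feature signatures $S^q = \{ (c_i^q, w_i^q) \}_{i=1}^{n}$ and $S^o = \{ (c_i^o, w_i^o) \}_{i=1}^{m}$,
the Signature Quadratic Form Distance is defined as follows:

$\mathrm{SQFD}_{f_s}(S^q, S^o) = \sqrt{ (w_q \mid -w_o) \cdot A_{f_s} \cdot (w_q \mid -w_o)^T }$

where $(w_q \mid -w_o)$ is the concatenation of the weights of $S^q$ and the negated weights of $S^o$, and $A_{f_s}$ is the similarity matrix obtained by applying the similarity function $f_s$ to the centroids.
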
Properties:
concatenation of weights
similarity matrix is determined dynamically by using a similarity function comparing centroids
each entry models similarity between two centroids

Signature Quadratic Form Distance

C. Faloutsos, R. Barber, M. Flickner, J. Hafner, W. Niblack, D. Petkovic, W. Equitz: Efficient and Effective Querying by Image Content. Journal of Intelligent Information Systems 1994.
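
The properties listed above (concatenated weights, a similarity matrix built on the fly from the centroids) translate into a short sketch. It assumes the definition from the cited CIVR 2010 paper, i.e. the square root of the quadratic form of the concatenated weight vector (query weights, negated object weights) with the centroid similarity matrix. The Gaussian similarity function and its parameter `alpha` are one possible choice made for this illustration, and the signatures are toy data.

```python
import math

def gaussian_similarity(c1, c2, alpha=1.0):
    """Similarity of two centroids; one possible kernel (assumption)."""
    return math.exp(-alpha * math.dist(c1, c2) ** 2)

def sqfd(sig_q, sig_o, similarity=gaussian_similarity):
    """Signature Quadratic Form Distance between two (centroid, weight) lists."""
    # Concatenate the centroids and the weights; object weights are negated.
    centroids = [c for c, _ in sig_q] + [c for c, _ in sig_o]
    weights = [w for _, w in sig_q] + [-w for _, w in sig_o]
    total = 0.0
    for ci, wi in zip(centroids, weights):
        for cj, wj in zip(centroids, weights):
            # The similarity matrix entry is computed dynamically per centroid pair.
            total += wi * wj * similarity(ci, cj)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(total, 0.0))

# Toy signatures of different size (assumption: 2-D centroids).
sig_a = [((0.0, 0.0), 0.5), ((1.0, 0.0), 0.5)]
sig_b = [((0.0, 0.0), 1.0)]

d_ab = sqfd(sig_a, sig_b)
d_ba = sqfd(sig_b, sig_a)
d_aa = sqfd(sig_a, sig_a)
```

The double loop makes the cost quadratic in the total number of centroids, which is one reason the signature size (for example 10 components) matters for query time.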





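Example
by using [similarity function not preserved in transcript] we obtain the similarity matrix: [matrix not preserved in transcript]
thus the Signature Quadratic Form Distance becomes: [value not preserved in transcript]

Summary and Conclusion (Part 1)

Image similarity can be modelled by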
bag-of-visual-words model (large global visual vocabulary)
feature signature model (object-specific visual vocabularies)

When using the feature signature model, the Signature Quadratic Form Distance has shown the highest retrieval performance and stability

We have also shown that
the SQFD can be applied to Gaussian mixture models
the SQFD can be indexed by metric and ptolemaic indexing structures

Content-based Multimedia Retrieval
directed search for particular objects and topics
assumption: know what to search for
retrieval is well supported by indexing methods

Retrieval vs. Exploration

Content-based Multimedia Exploration
undirected search for objects and topics in known or unknown databases
assumption: no clear or only vague query formulation
not properly supported by indexing methods

Similarity-preserving visualization that projects similarity relationships into a low-dimensional space

Various interaction functionalities:
zoom in/out
slide and rotate
add/remove objects
modify object position
...

Processing exploration queries
given a similarity-based layout
and a (distance-based) similarity measure
find the best fitting database object for the query

measure the similarity deviation
return object with minimal deviation

[Figure label: what's behind there?]

Filter/Refinement Approach
choose nearest objects as filter objects
generate filter ranking with [formula not preserved in transcript]
refine candidate objects from filter ranking with [formula not preserved in transcript]
complete as it holds that [formula not preserved in transcript]
works for any kind of similarity-based layout and (distance-based) similarity model

[Figure label: what's behind there?]

Indexing Approach
estimate the similarity deviation by



use this estimation inside a hierarchical index structure
results are also complete
works for any kind of similarity-based layout and (distance-based) similarity model

[Figure labels: what's behind there?; exploration queries]

Similarity-based Layouts

C. Beecks, S. Wiedenfeld, T. Seidl: Improving the Efficiency of Content-Based Multimedia Exploration. ICPR 2010: 3163-3166
C. Beecks, P. Driessen, T. Seidl: Index support for content-based multimedia exploration. ACM Multimedia 2010: 999-1002
C. Beecks, A. M. Ivanescu, S. Kirchhoff, T. Seidl: Modeling multimedia contents through probabilistic feature signatures. ACM Multimedia 2011: 1433-1436
C. Beecks, A. M. Ivanescu, S. Kirchhoff, T. Seidl: Modeling Image Similarity by Gaussian Mixture Models and the Signature Quadratic Form Distance. IEEE ICCV 2011.
J. Lokoc, M. L. Hetland, T. Skopal, C. Beecks: Ptolemaic indexing of the signature quadratic form distance. SISAP 2011: 9-16
C. Beecks, J. Lokoc, T. Seidl, T. Skopal: Indexing the signature quadratic form distance for efficient content-based multimedia retrieval. ICMR 2011: 24
Images are taken from the Corel Wang database: J. Z. Wang, J. Li, G. Wiederhold: SIMPLIcity: Semantics-Sensitive Integrated Matching for Picture LIbraries. IEEE TPAMI 23(9): 947-963 (2001)
Images are taken from the MIR Flickr database: M. J. Huiskes, M. S. Lew: The MIR flickr retrieval evaluation. Multimedia Information Retrieval 2008: 39-43

Properties

Both approaches work well in theory; however, they are not applicable to large-scale multimedia databases:
either it takes too long to build the index (approx. 1 week to index 1M objects with the M-tree)
or it takes too long to process exploration queries (a few minutes for querying 1M objects with the F/R approach)


We are investigating a novel pivot-based approach:
select a set of pivot objects from the database
represent each object by a vector of distances to these pivot objects (so-called pivot table)
process the pivot table as a filter step at query time

Different Types of Similarity-based Layouts
[Figure panels: grid-based; list-based; unconstrained]

Preliminary Results
Comparison of query response times (thanks to Marco Faber!):
1M images
similarity model: SQFD + feature signatures (10 components)
black points depict objects in the layout
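
The transcript does not detail how the pivot table is used for exploration queries, so the following is only a sketch of the general pivot-filtering idea, shown for a plain nearest-neighbor query. It assumes the distance is a metric, so that the triangle inequality yields the lower bound max over pivots p of |d(q, p) - d(o, p)| <= d(q, o). Euclidean distance on toy 2-D points stands in for the SQFD on feature signatures, and all function names are hypothetical.

```python
import math

def build_pivot_table(objects, pivots, distance):
    """One row per object: its distances to all pivot objects."""
    return [[distance(o, p) for p in pivots] for o in objects]

def lower_bound(query_row, object_row):
    """Triangle-inequality lower bound on the true query-object distance."""
    return max(abs(dq - do) for dq, do in zip(query_row, object_row))

def nearest_neighbor(query, objects, pivots, table, distance):
    """Filter with the pivot table, refine with the exact distance."""
    query_row = [distance(query, p) for p in pivots]
    bounds = [lower_bound(query_row, row) for row in table]
    order = sorted(range(len(objects)), key=lambda i: bounds[i])
    best_index, best_distance, evaluations = None, math.inf, 0
    for i in order:
        if bounds[i] >= best_distance:
            break  # every remaining object has an even larger lower bound
        d = distance(query, objects[i])  # expensive exact distance
        evaluations += 1
        if d < best_distance:
            best_index, best_distance = i, d
    return best_index, best_distance, evaluations

# Toy database (assumption): 2-D points, two of them used as pivots.
objects = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (9.0, 9.0), (10.0, 10.0)]
pivots = [objects[0], objects[4]]
table = build_pivot_table(objects, pivots, math.dist)

best_index, best_distance, evaluations = nearest_neighbor(
    (9.0, 8.0), objects, pivots, table, math.dist)
```

The table is built once, so at query time only the distances from the query to the pivots plus a few exact distances have to be computed; in this toy run the exact distance is evaluated for a single object.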







Pivot-based approach is significantly faster than the other approaches

Summary

Differences between Retrieval and Exploration
directed vs. undirected search

Similarity-based Layouts exist in different variants:
unconstrained
list-based
grid-based

Two fundamental approaches to process exploration queries:
filter-and-refinement
index-based

A novel pivot-based approach (1M < 1sec)

Adaptive Distance-based Similarity Measures
Approaches to Content-based Multimedia Exploration
intra-similarities
inter-similarities
Signature Quadratic Form Distance depends on both
required: a (distance-based) similarity model
see white board illustration?
F/R approach
Indexing approach
pivot-based approach

Thanks for your attention!