MediaEval 2012
INRIA/IRISA at Placing Task of MediaEval 2012
Michele Trevisiol, on 25 May 2013

Transcript of MediaEval 2012

Photo credit: NASA / Goddard Space Flight Center / Reto Stöckli

How INRIA/IRISA identifies the geographic location of a video

Michele Trevisiol (UPF, Spain)
Jonathan Delhumeau (INRIA, France)
Hervé Jégou (INRIA, France)
Guillaume Gravier (IRISA, France)

Our approaches: overview, tags filtering, tags pre-processing, conclusions / future work

MediaEval 2012
Placing Task. About this talk: a short description of the tags pre-processing, then the main idea and a discussion of one approach.

Data used: only the provided data, with no external information such as gazetteers, databases, or wikis.

Submitted runs:
- run1: tag-based [Basic Method]
- run2: tag-based [Weighted Method]
- run4: content-based

Tags pre-processing: tags are filtered in cascade, removing:
- numeric tags (e.g. date, year)
- numeric characters from alphanumeric tags
- stop-words (dictionary)
- common words (travel, geotag, birthday, etc.)
- camera/device names (iPhone, camera, Canon, etc.)
- machine tags

What if there are no tags? We try a few fallbacks, in order:
1) the user's common location, based on the user's upload history
2) the common location of the user's social connections (the user's hometown is used only for the method in run1)
3) a content-based approach
4) a prior location: a fixed location computed a priori among all the coordinates in the training set

Weighted Method (run2): we quantize the coordinates on a square grid of 0.1°.
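The fallback cascade above can be sketched as a prioritized lookup. This is a minimal sketch, assuming a dictionary-based data model; the field names (`user_common_location`, etc.) are hypothetical, not from the talk:

```python
def fallback_location(video, prior_location):
    """Pick a location for a video with no usable tags, trying each
    fallback in the order described above (hypothetical field names)."""
    # 1) user's most common location, from their upload history
    if video.get("user_common_location"):
        return video["user_common_location"]
    # 2) most common location among the user's social connections
    if video.get("social_common_location"):
        return video["social_common_location"]
    # (run1 only) the user's declared hometown
    if video.get("use_hometown") and video.get("user_hometown"):
        return video["user_hometown"]
    # 3) content-based estimate, if available
    if video.get("content_based_location"):
        return video["content_based_location"]
    # 4) prior: fixed location computed a priori over all training coordinates
    return prior_location
```

Each step only fires when the previous ones produced nothing, mirroring the cascade order on the slide.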
For example, (43.71268, 10.4148) belongs to cell (43.7, 10.4).

Step 1: find the best area(s).
- Beforehand, on the training set, we compute a weighted co-occurrence matrix (tags × areas), associating each tag with the areas it appears in.
- Given a test video with tags, its query vector is multiplied by the weighted tag-area matrix to find the most probable area(s), i.e. the area(s) of interest the video is likely to belong to.

Step 2: find the best coordinates.
- Beforehand, for each selected area, we consider all the coordinates of the training videos/images located in that area and compute a weighted co-occurrence matrix (tags × coordinates), associating each tag with the coordinates it belongs to.
- The query vector is multiplied by the weighted tag-coordinates matrix to find the most probable coordinates, i.e. those with the most tags related to the test video. We return the (latitude, longitude); if more than one candidate remains, we apply the "medoid".

Weight re-computation (for both matrices):
- BM25 feature weighting
- smoothing with signed square root (SQRT)
- L2 normalization
- redundancy reduction (whitening)

Tags weighting: identify which tags are more "geo-descriptive".
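A minimal sketch of the two-step machinery, using raw co-occurrence counts in place of the talk's BM25 weighting and omitting whitening; only the grid quantization, the signed-SQRT/L2 smoothing, the matrix-style scoring, and the medoid are illustrated, and all function names are made up for illustration:

```python
import math
from collections import defaultdict

def quantize(lat, lon, step=0.1):
    """Snap coordinates onto a square grid of `step` degrees (0.1 in the talk)."""
    return (round(math.floor(lat / step) * step, 6),
            round(math.floor(lon / step) * step, 6))

def build_tag_area_matrix(training):
    """Co-occurrence counts matrix[tag][area] = number of training items
    carrying the tag in that grid cell (the talk re-weights this with BM25)."""
    m = defaultdict(lambda: defaultdict(float))
    for tags, (lat, lon) in training:
        area = quantize(lat, lon)
        for t in tags:
            m[t][area] += 1.0
    return m

def signed_sqrt_l2(vec):
    """Smoothing with signed square root, then L2 normalization."""
    smoothed = {k: math.copysign(math.sqrt(abs(v)), v) for k, v in vec.items()}
    norm = math.sqrt(sum(v * v for v in smoothed.values())) or 1.0
    return {k: v / norm for k, v in smoothed.items()}

def best_areas(query_tags, matrix, k=1):
    """Score areas by multiplying the (binary) query vector with the
    smoothed tag-area matrix; return the top-k areas."""
    scores = defaultdict(float)
    for t in query_tags:
        for area, w in signed_sqrt_l2(matrix.get(t, {})).items():
            scores[area] += w
    return sorted(scores, key=scores.get, reverse=True)[:k]

def medoid(points):
    """The point minimizing the summed distance to all the others,
    used when several candidate coordinates remain."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))
```

Step 2 would repeat the same build/score pattern with a tags-coordinates matrix restricted to the selected area(s).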
For each tag we record [tag | frequency | avg_distance], sorted by frequency:
- frequency: the number of times the tag appears in the whole training set
- avg_distance: the average distance among all the coordinates the tag is associated with

Weighting scheme: we manually define the following thresholds and, among all the tags, keep only those that respect the constraints (examples of the top-weighted tags were shown on the slide):
- frequency > 200
- 10 < avgDistance < 50

Future work:
- generalize the proposed tags-weighting scheme (defining some specific criteria)
- apply a clustering method to group the coordinates into areas for the first step
- integrate external information (gazetteers, WordNet, the Google Maps API, etc.)
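The frequency / average-distance statistics and the threshold filter above could look like the following sketch; plain Euclidean distance in coordinate space stands in for whatever geographic distance the authors used, so the threshold units are an assumption:

```python
import itertools
import math
from collections import defaultdict

def tag_statistics(training):
    """For each tag: (frequency over the whole training set, average pairwise
    distance among the coordinates it appears at). Euclidean distance in
    degrees is a stand-in for the talk's unspecified geographic distance."""
    coords = defaultdict(list)
    for tags, point in training:
        for t in tags:
            coords[t].append(point)
    stats = {}
    for t, pts in coords.items():
        pairs = list(itertools.combinations(pts, 2))
        avg = sum(math.dist(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
        stats[t] = (len(pts), avg)
    return stats

def geo_descriptive(stats, min_freq=200, lo=10, hi=50):
    """Keep tags respecting the manually defined thresholds from the talk:
    frequency > 200 and 10 < avgDistance < 50."""
    return {t for t, (freq, avg) in stats.items()
            if freq > min_freq and lo < avg < hi}
```

A frequent tag with a small-but-nonzero spread (e.g. a city name) passes; globally scattered tags like "travel" and one-off tags are dropped.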
Our results: cumulative correctly detected locations, i.e. the rate of videos found within a radius of x km of the ground truth.
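The evaluation metric, the rate of videos located within x km of the ground truth, can be computed with a standard haversine great-circle distance (a sketch; the benchmark's exact distance function is not stated in the slides):

```python
import math

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))  # mean Earth radius ~6371 km

def accuracy_within(predictions, ground_truth, radius_km):
    """Rate of videos whose predicted location falls within radius_km of the
    true location; the cumulative curve plots this against the radius x."""
    hits = sum(1 for p, g in zip(predictions, ground_truth)
               if haversine_km(p, g) <= radius_km)
    return hits / len(predictions)
```

Evaluating this at increasing radii (1, 10, 100, ... km) yields the cumulative curve shown in the results.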