Large Scale Image Retrieval

Report
Bag of Features Approach: recent work using geometric information
Problem
• Search for object occurrences in a very large image collection
Two sub-problems
• Object Category Recognition and Specific
Object Recognition
Motivation
• Look for product information
• Look for similar products
Related work on large-scale image search
• Most systems build upon the BoF framework [Sivic &
Zisserman 03]
– Large (hierarchical) vocabularies [Nistér & Stewénius 06]
– Improved descriptor representation [Jégou et al. 08, Philbin et al. 08]
– Geometry used in the index [Jégou et al. 08, Perďoch et al. 09]
– Query expansion [Chum et al. 07]
–…
• Efficiency improved by:
– Min-hash and geometric min-hash [Chum et al. 07-09]
– Compressing the BoF representation [Jégou et al. 09]
Local Features - SIFT
Creating a visual vocabulary
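A minimal sketch of how a visual vocabulary could be built, assuming plain k-means over local descriptors (the deck does not say which clustering variant was used). The function names `kmeans` and `nearest`, the toy 2-D "descriptors", and the deterministic initialisation are illustrative assumptions; real SIFT descriptors are 128-dimensional and vocabularies are far larger.

```python
def nearest(centers, point):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((c - p) ** 2 for c, p in zip(centers[i], point)))

def kmeans(points, k, iterations=10):
    """Lloyd's algorithm; the first k points seed the centers (an assumption)."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centers, p)].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the old center if a cluster becomes empty
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers

# Toy descriptors forming two obvious clusters.
descriptors = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
vocabulary = kmeans(descriptors, k=2)

# Quantization: each descriptor is replaced by the id of its nearest visual word.
words = [nearest(vocabulary, d) for d in descriptors]
print(words)  # [0, 1, 0, 1, 0, 1]
```

Each cluster center plays the role of one visual word; an image is then described by the list of word ids of its features.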
Inverted Index
Index construction
Searching
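The two steps named on this slide can be sketched as follows. This is a simplified illustration, assuming each image is already reduced to a list of visual word ids and that ranking uses an unnormalized tf-idf dot product; the names `build_inverted_index` and `search` and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def build_inverted_index(image_words):
    """image_words: dict image_id -> list of visual word ids (one per feature)."""
    index = defaultdict(dict)  # word id -> {image id: term frequency}
    for image_id, words in image_words.items():
        for word, tf in Counter(words).items():
            index[word][image_id] = tf
    return index

def search(index, num_images, query_words):
    """Score only the images that share at least one visual word with the query."""
    scores = defaultdict(float)
    for word, q_tf in Counter(query_words).items():
        postings = index.get(word, {})
        if not postings:
            continue
        idf = math.log(num_images / len(postings))
        for image_id, tf in postings.items():
            scores[image_id] += (q_tf * idf) * (tf * idf)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

database = {
    "img_a": [3, 3, 7, 9],
    "img_b": [1, 7, 7, 8],
    "img_c": [2, 4, 5, 6],
}
index = build_inverted_index(database)
ranking = search(index, len(database), [3, 9, 7])
print([image_id for image_id, _ in ranking])  # ['img_a', 'img_b']
```

Images that share no visual word with the query (here `img_c`) are never touched, which is what makes the inverted index efficient for large collections.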
Use geometry
• Possible directions:
– Change/optimize the spatial verification stage
– Add new geometric information to the index
• Ordered BOF
• Bundled features
• Visual phrases
– Change the searching algorithm
Survey for today
• Spatial Bag-of-Features [Cao et al., CVPR 2010]
• Image Retrieval with Geometry-Preserving Visual Phrases [Zhang, Jia & Chen, CVPR 2011]
• Smooth Object Retrieval using a Bag of Boundaries [Arandjelović & Zisserman, ICCV 2011]
Spatial BOF
• Basic idea:
Spatial BOF
• Constructing linear and circular ordered bags-of-features:
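A rough sketch of the linear case, assuming features are projected onto a line at angle `theta`, the line is cut into equal bins spanning the image, and one BOF histogram is kept per bin. The function name, its parameters, and the binning over the image extent are assumptions made for illustration, not details taken from the paper.

```python
import math

def linear_ordered_bof(features, image_size, vocab_size, num_bins, theta=0.0):
    """features: list of (x, y, word_id). Returns the concatenated per-bin histograms."""
    ux, uy = math.cos(theta), math.sin(theta)
    width, height = image_size
    # Projection range of the whole image, taken from its four corners.
    corner_proj = [cx * ux + cy * uy
                   for cx, cy in [(0, 0), (width, 0), (0, height), (width, height)]]
    lo, hi = min(corner_proj), max(corner_proj)
    histogram = [0] * (num_bins * vocab_size)
    for x, y, word in features:
        t = (x * ux + y * uy - lo) / (hi - lo)      # position along the line, in [0, 1]
        b = min(int(t * num_bins), num_bins - 1)    # clamp t == 1 into the last bin
        histogram[b * vocab_size + word] += 1
    return histogram

features = [(0.0, 0.0, 0), (1.0, 5.0, 1), (9.0, 2.0, 1), (10.0, 4.0, 2)]
hist = linear_ordered_bof(features, image_size=(10, 6), vocab_size=3, num_bins=2)
print(hist)  # [1, 1, 0, 0, 1, 1]
```

The output is a plain histogram of length `num_bins * vocab_size`, so it can be indexed in the same way as a standard BOF vector. A circular variant would bin by angle around a center instead of by position along a line.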
Spatial BOF
• Translation invariance:
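One possible way to make an ordered histogram insensitive to translation is to shift it circularly so that it always starts at the most populated bin. The sketch below illustrates that idea only; the slide does not spell out the paper's exact procedure, so the function `calibrate` and its tie-breaking rule are assumptions.

```python
def calibrate(histogram, vocab_size):
    """Circularly shift the bins so the bin with the most features comes first."""
    num_bins = len(histogram) // vocab_size
    bins = [histogram[i * vocab_size:(i + 1) * vocab_size] for i in range(num_bins)]
    # Ties are broken in favour of the earliest bin (an arbitrary choice).
    start = max(range(num_bins), key=lambda i: sum(bins[i]))
    rotated = bins[start:] + bins[:start]
    return [value for b in rotated for value in b]

original = [0, 0, 3, 1, 1, 0]   # three bins, vocabulary of two words
shifted = [1, 0, 0, 0, 3, 1]    # same content, moved by one bin (with wrap-around)
print(calibrate(original, 2))   # [3, 1, 1, 0, 0, 0]
print(calibrate(shifted, 2))    # [3, 1, 1, 0, 0, 0]
```

Both inputs map to the same calibrated histogram, so a shift of the object along the projection line no longer changes the representation.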
Spatial BOF
• Pros:
– Outperforms BOF + RANSAC on a large-scale dataset*
– Same format as standard BOF
• Cons:
– Dataset dependent, because it requires training
• No results are presented for a large-scale dataset with transfer learning from another dataset
• Future work
– Test it with cross-dataset training on a large dataset. Otherwise, it is not worth pursuing further.
Geometry-Preserving Visual Phrases
• Basic idea:
Geometry-Preserving Visual Phrases
• Representation
– Quantize the image into a 10x10 grid
– Histogram of GVPs of length k
– GVP dictionary size is “N choose k” for N visual words
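To make the representation concrete, here is a small sketch assuming translation-only phrases: matching visual words vote for their quantized offset, and an offset with n votes contributes "n choose k" shared phrases of length k. The function `gvp_cooccurrences` and the toy data are illustrative assumptions, and the sketch ignores the case of several identical words falling in the same cell.

```python
from collections import Counter
from math import comb

def gvp_cooccurrences(query, database, k):
    """query, database: lists of (word_id, cell_x, cell_y) on the quantized grid.
    Returns the number of length-k phrases shared under a pure translation."""
    votes = Counter()
    for q_word, qx, qy in query:
        for d_word, dx, dy in database:
            if q_word == d_word:
                votes[(dx - qx, dy - qy)] += 1   # vote in offset space
    return sum(comb(n, k) for n in votes.values())

query = [(5, 1, 1), (8, 2, 1), (3, 1, 3)]
database = [(5, 4, 2), (8, 5, 2), (3, 4, 4), (8, 0, 0)]

print(gvp_cooccurrences(query, database, k=2))  # 3
print(gvp_cooccurrences(query, database, k=3))  # 1

# Dictionary size for N = 1000 visual words and phrases of length k = 2.
print(comb(1000, 2))  # 499500
```

Three words agree on the offset (3, 1), giving three shared pairs and one shared triple; the stray match at offset (-2, -1) contributes nothing. The last line shows why the phrase dictionary cannot be stored explicitly for realistic vocabulary sizes.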
Geometry-Preserving Visual Phrases
• Pros:
– Outperforms BOV + RANSAC
• Cons:
– Only translation invariant, because of memory constraints
• Future work
BOF for smooth objects
Idea: [Figure panels: query object, segment, gradient (the information used for retrieval)]
BOF for smooth objects
Results:
BOF for smooth objects
Segmentation phase
• Over-segmentation into super-pixels
• Classification of super-pixels:
• 3208-dimensional feature vector (median of the gradient magnitude, 4 bits, color histogram, BOF)
• SVM
• Post-processing
BOF for smooth objects
Boundary description phase:
• Sample points on the boundary
• Compute HoG at each point at 3 scales: a 340-dimensional, L2-normalized vector
* The descriptor is not rotation invariant
BOF for smooth objects
Retrieval procedure:
• Boundary descriptors are quantized (k = 10k)
• Standard BOF scheme*
• Spatial verification of the top 200 results with a loose affine homography (errors up to 100 pixels)
* No spatial information is recorded in the histogram
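The verification step can be sketched as follows, assuming an affine model fitted to three point matches and a loose inlier tolerance of 100 pixels as quoted above. For brevity the sketch tries every triple instead of sampling randomly as RANSAC would; all function names are hypothetical, and in the pipeline described here it would be run only on the top 200 BOF results.

```python
from itertools import combinations

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(rows, rhs):
    """Solve a 3x3 linear system by Cramer's rule; None if (near) singular."""
    d = det3(rows)
    if abs(d) < 1e-9:
        return None
    solution = []
    for col in range(3):
        m = [list(r) for r in rows]
        for i in range(3):
            m[i][col] = rhs[i]
        solution.append(det3(m) / d)
    return solution

def fit_affine(sample):
    """sample: three (src, dst) point pairs. Returns (row_u, row_v) or None."""
    rows = [[src[0], src[1], 1.0] for src, _ in sample]
    row_u = solve3(rows, [dst[0] for _, dst in sample])
    row_v = solve3(rows, [dst[1] for _, dst in sample])
    return None if row_u is None else (row_u, row_v)

def count_inliers(model, matches, tol):
    row_u, row_v = model
    inliers = 0
    for (x, y), (u, v) in matches:
        pu = row_u[0] * x + row_u[1] * y + row_u[2]
        pv = row_v[0] * x + row_v[1] * y + row_v[2]
        if ((pu - u) ** 2 + (pv - v) ** 2) ** 0.5 <= tol:
            inliers += 1
    return inliers

def verify(matches, tol=100.0):
    """Best inlier count over all three-match samples (exhaustive, small inputs only)."""
    best = 0
    for sample in combinations(matches, 3):
        model = fit_affine(sample)
        if model is not None:
            best = max(best, count_inliers(model, matches, tol))
    return best

# Four matches consistent with a translation by (500, 300).
consistent = [((0, 0), (500, 300)), ((1000, 0), (1500, 300)),
              ((0, 1000), (500, 1300)), ((1000, 1000), (1500, 1300))]
print(verify(consistent))  # 4
```

Candidate images would then be re-ranked by their inlier count. A tolerance as loose as 100 pixels trades geometric precision for robustness to imprecise boundary sample locations.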
BOF for smooth objects
• Pros:
– Solves the smooth object retrieval problem
– Fast
• Cons:
– Dataset dependent, because it requires training
– Limited to objects made of “solid” materials: the segmentation has to capture the object’s boundary
• Future work
– Eliminate the training step
Summary
• There is active research in CBIR on exploiting geometric information
• Each method has its limitations
• There is still no widely accepted solution
– Nothing as established as spatial verification with RANSAC
