Localization in indoor environments by querying omnidirectional visual maps using perspective images
Miguel Lourenço, V. Pedro and João P. Barreto
ICRA 2012
Standard Image-based Indoor Localization (1/3)
• How can a robot equipped with a standard camera perform indoor localization?
Query image
• Establishing correspondences between the query image
and a database of geo-referenced images [Cummins08, Chen11,
Hansen01]
Miguel Lourenço– Institute for Systems and Robotics, Faculty of Science and Technology, University of Coimbra - 16 May 2012 - Slide 2
Standard Image-based Indoor Localization (2/3)
SIFT descriptors
Quantization + tf-idf weighting
Image database + inverted file
Querying
• Building a detailed database of an environment is troublesome in terms of:
  ◦ Number of images
  ◦ Storage
  ◦ Time
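The querying pipeline on this slide (quantized descriptors, tf-idf weighting, inverted file) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the SIFT descriptors have already been quantized into integer visual-word ids, and the names `build_index` and `query` are hypothetical.

```python
import math
from collections import Counter, defaultdict

def build_index(db_words):
    """db_words: dict image_id -> list of visual-word ids (already quantized)."""
    n_images = len(db_words)
    tf = {img: Counter(words) for img, words in db_words.items()}
    # Document frequency: number of database images containing each word.
    df = Counter(w for counts in tf.values() for w in counts)
    idf = {w: math.log(n_images / df[w]) for w in df}
    inverted = defaultdict(list)  # word -> [(image_id, tf-idf weight)]
    norms = {}
    for img, counts in tf.items():
        total = sum(counts.values())
        weights = {w: (c / total) * idf[w] for w, c in counts.items()}
        norms[img] = math.sqrt(sum(v * v for v in weights.values())) or 1.0
        for w, v in weights.items():
            inverted[w].append((img, v))
    return inverted, idf, norms

def query(words, inverted, idf, norms):
    """Rank database images by cosine similarity of tf-idf vectors."""
    counts = Counter(w for w in words if w in idf)
    total = sum(counts.values()) or 1
    q = {w: (c / total) * idf[w] for w, c in counts.items()}
    q_norm = math.sqrt(sum(v * v for v in q.values())) or 1.0
    scores = defaultdict(float)
    # Only images sharing at least one word with the query are visited.
    for w, qv in q.items():
        for img, dv in inverted[w]:
            scores[img] += qv * dv
    return sorted(((s / (q_norm * norms[img]), img) for img, s in scores.items()),
                  reverse=True)
```

The inverted file keeps query time roughly proportional to the number of shared words, but the index still grows with the number of database images, which is the cost listed above.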
Problem Statement
Query image
• Complete coverage of the environment can be achieved with a para-catadioptric camera
• Distortion increases the appearance difference between the images
• Our contribution: a new model-based SIFT method for matching in hybrid imaging systems
Presentation Outline
• Matching in Hybrid Imaging Systems
  ◦ Comparison and drawbacks of the standard approaches
  ◦ Improvements to the SIFT detector and descriptor
• Image-based indoor localization (IL) using Hybrid Imaging Systems
  ◦ Comparison of several database description schemes
  ◦ Comparison of two search approaches: BoV vs. GVP
Matching in Hybrid Imaging Systems
• Using SIFT [Lowe04] on both paracatadioptric and perspective images
provides poor matching results [Puig08]
10% inliers
• A straightforward solution is to apply SIFT in a virtual camera perspective (VCP) [Schönbein11]
• Standard approaches render either a polar [Puig08] or a cylindrical [Krishnan08] panorama
Polar
Cylinder
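To make the explicit rectification step concrete, here is a minimal sketch of rendering a polar panorama by resampling the omnidirectional image along rays from the image center. It is an illustration under simple assumptions (known center, bilinear interpolation, a hypothetical `unwrap_polar` helper), not the rendering used in the cited works.

```python
import numpy as np

def unwrap_polar(img, center, r_min, r_max, n_theta=360, n_r=None):
    """Render a polar panorama (rows = radius, columns = angle)."""
    cx, cy = center
    n_r = n_r or int(r_max - r_min)
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    radius = np.linspace(r_min, r_max, n_r)
    rr, tt = np.meshgrid(radius, theta, indexing="ij")
    # Source coordinates in the omnidirectional image for every panorama pixel.
    x = cx + rr * np.cos(tt)
    y = cy + rr * np.sin(tt)
    # Bilinear interpolation: the image signal has to be reconstructed here.
    x0 = np.clip(np.floor(x).astype(int), 0, img.shape[1] - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, img.shape[0] - 2)
    ax = np.clip(x - x0, 0.0, 1.0)
    ay = np.clip(y - y0, 0.0, 1.0)
    top = (1 - ax) * img[y0, x0] + ax * img[y0, x0 + 1]
    bottom = (1 - ax) * img[y0 + 1, x0] + ax * img[y0 + 1, x0 + 1]
    return (1 - ay) * top + ay * bottom
```

Every output pixel is an interpolated value, so the panorama is a resampled version of the original signal rather than the signal itself.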
Implicit cylindrical rectification - cylSIFT
• Rendering synthetic views requires reconstructing the image signal
  ◦ Interpolation artifacts severely affect SIFT performance [Lourenco12]
• Based on our previous work [Lourenco12], we propose to perform the cylindrical rectification implicitly inside the SIFT framework (cylSIFT)
• How does SIFT work?
  ◦ Salient image points are detected in a scale-space framework
  ◦ The SIFT descriptor is computed from local image gradients
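The detection step can be illustrated with a simplified difference-of-Gaussians (DoG) search over a single octave. This is a sketch of the general idea in [Lowe04] only: it omits octaves, sub-pixel refinement, edge rejection and the descriptor, and the parameter values are assumptions.

```python
import numpy as np

def gaussian_kernel(sigma):
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian blur: filter the rows, then the columns."""
    k = gaussian_kernel(sigma)
    padded = np.pad(img, len(k) // 2, mode="reflect")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def dog_keypoints(img, sigma0=1.6, n_scales=5, k=2 ** 0.5, threshold=0.01):
    """Return (x, y, sigma) for local extrema of the DoG stack."""
    sigmas = [sigma0 * k**i for i in range(n_scales)]
    gauss = np.stack([blur(img, s) for s in sigmas])
    dog = gauss[1:] - gauss[:-1]
    keypoints = []
    for s in range(1, dog.shape[0] - 1):
        for y in range(1, img.shape[0] - 1):
            for x in range(1, img.shape[1] - 1):
                v = dog[s, y, x]
                # Compare against the 3x3x3 neighborhood in space and scale.
                cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                if abs(v) > threshold and (v == cube.max() or v == cube.min()):
                    keypoints.append((x, y, sigmas[s]))
    return keypoints
```

Both stages depend on Gaussian smoothing and on local gradients, which are the two places where image distortion interferes with SIFT.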
Implicit cylindrical rectification – cylSIFT detector
• Rendering the cylinder before applying SIFT adds computation time and interpolation artifacts
  ◦ Rectification alone takes > 2 s (Matlab)
• We avoid the reconstruction artifacts by using an adaptive Gaussian filter [Lourenco12]
Standard vs Adaptive Gaussian smoothing
• Inherent properties of the standard Gaussian filter
  ◦ Space-invariant filtering
  ◦ The convolution can be decoupled into X and Y directions
• Simplification of the adaptive filter
• Advantages of the simplified adaptive filter
  ◦ Isotropic filter that can be decoupled for each image radius
  ◦ A filter bank can be computed offline and loaded into memory
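The filter-bank idea can be sketched as below: one separable 1-D Gaussian kernel is precomputed per radius bin, and each pixel takes its smoothed value from the kernel of its own bin. The scale law in `sigma_of_radius` is a made-up placeholder (the slides do not give the actual expression), and the bin width `step` and all function names are assumptions for illustration.

```python
import numpy as np

def gaussian_kernel(sigma):
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def separable_blur(img, kernel):
    """Isotropic Gaussian applied as two 1-D convolutions."""
    padded = np.pad(img, len(kernel) // 2, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)

def sigma_of_radius(r, sigma=1.6, alpha=1e-4):
    # HYPOTHETICAL scale law, for illustration only (not from the slides).
    return sigma * (1.0 + alpha * r**2)

def build_filter_bank(max_radius, step=8):
    """Offline step: one 1-D kernel per radius bin, keyed by bin index."""
    return {b: gaussian_kernel(sigma_of_radius(b * step + step / 2))
            for b in range(max_radius // step + 1)}

def adaptive_smooth(img, center, bank, step=8):
    """Online step: each pixel is smoothed with the kernel of its radius bin."""
    yy, xx = np.indices(img.shape)
    bins = (np.hypot(xx - center[0], yy - center[1]) // step).astype(int)
    out = np.zeros(img.shape, dtype=float)
    for b in np.unique(bins):
        mask = bins == b
        out[mask] = separable_blur(img, bank[b])[mask]
    return out
```

This version blurs the whole image once per bin for simplicity; a practical implementation would restrict each pass to the pixels of the corresponding ring.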
Implicit cylindrical rectification – cylSIFT description
• Non-linear distortion modifies the local structures in the image and, consequently, the gradients
• Changes in the local image gradients degrade SIFT descriptor performance
• Proposed solution: compute gradients in the omnidirectional image and implicitly correct them using the Jacobian matrix of the cylindrical mapping function
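The correction follows from the chain rule: if the rectified image is the original image composed with a mapping f, its gradient is the transposed Jacobian of f applied to the gradient measured in the original image. The sketch below uses a plain polar mapping as a stand-in, because the actual cylindrical mapping of the para-catadioptric model is not given in the slides; the function names are hypothetical.

```python
import numpy as np

def polar_to_image(theta, r, center=(0.0, 0.0)):
    """Stand-in mapping from panorama coordinates (theta, r) to image (x, y)."""
    return center[0] + r * np.cos(theta), center[1] + r * np.sin(theta)

def jacobian(theta, r):
    """Jacobian d(x, y) / d(theta, r) of the stand-in mapping."""
    return np.array([[-r * np.sin(theta), np.cos(theta)],
                     [r * np.cos(theta), np.sin(theta)]])

def corrected_gradient(grad_xy, theta, r):
    """Chain rule: gradient w.r.t. (theta, r) = J^T @ gradient w.r.t. (x, y)."""
    return jacobian(theta, r).T @ np.asarray(grad_xy, dtype=float)
```

The gradient is measured once on the original pixels and then transformed analytically, so no rectified image ever has to be rendered.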
Detection and Matching evaluation
[Figure: detection and matching results on image sets A, B, C and D]
• cylSIFT completely avoids interpolation of the image signal
  ◦ Better repeatability and matching with less computational burden
• cylSIFT has similar performance to the VCP approach
  ◦ The VCP requires a priori knowledge of which view to render in order to minimize viewpoint changes between the query and the VCP image
Standard Image-based Indoor Localization (3/3)
• How can a robot use a standard camera for performing
indoor localization?
Query image
• Comparison of two image search schemes
  ◦ Standard Bags of Visual Words (BoV)
  ◦ Geometry-preserving Visual Phrases (GVP) [Zhang11]
BoV: Standard Bags of Visual Words
[Figure: word histogram whose length equals the dictionary size]
• Each image is represented as a histogram of visual words
• Drawback: the spatial relations between words are discarded
  ◦ Spatial layout can help disambiguate situations of perceptual aliasing
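A small example shows why the histogram representation is blind to layout. The two feature lists are made-up data: they contain the same visual words at different positions, yet their BoV histograms are identical.

```python
import numpy as np

def bov_histogram(word_ids, dictionary_size):
    """Normalized histogram of visual words; its length is the dictionary size."""
    hist = np.bincount(word_ids, minlength=dictionary_size).astype(float)
    return hist / max(hist.sum(), 1.0)

# Two hypothetical images as lists of (word_id, (x, y)).
image_a = [(0, (10, 10)), (1, (50, 10)), (2, (10, 50))]
image_b = [(0, (50, 50)), (1, (10, 50)), (2, (50, 10))]

# Positions are dropped as soon as the histogram is built.
hist_a = bov_histogram([w for w, _ in image_a], dictionary_size=4)
hist_b = bov_histogram([w for w, _ in image_b], dictionary_size=4)
```

Any similarity measure computed on `hist_a` and `hist_b` treats the two images as the same place, which is how perceptual aliasing causes wrong retrievals.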
GVP: Encoding weak geometric constraints [Zhang11]
[Figure: visual words A and B in images I and I′ vote in the offset space, with axes Δx = x′ − x and Δy = y′ − y; A and B fall in the same offset cell]
• A group of visual words in a certain spatial layout forms a visual phrase
• GVP incorporates geometric constraints at the search step
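The offset-space voting can be sketched as follows, loosely following the idea in [Zhang11]: matched words vote for their quantized translation between the two images, and n votes in the same offset cell contribute "n choose k" phrases of length k. The cell size, the function name `gvp_score` and the unweighted count are assumptions of this sketch; the original method also applies weighting and normalization.

```python
from collections import Counter
from math import comb

def gvp_score(query_feats, db_feats, cell=32, k=2):
    """Features are (word_id, x, y). Returns (phrase count, votes per offset cell)."""
    db_by_word = {}
    for w, x, y in db_feats:
        db_by_word.setdefault(w, []).append((x, y))
    votes = Counter()
    for w, x, y in query_feats:
        for xd, yd in db_by_word.get(w, []):
            # Quantize positions to grid cells, then take the offset between cells.
            dx = int(xd // cell) - int(x // cell)
            dy = int(yd // cell) - int(y // cell)
            votes[(dx, dy)] += 1
    # n matches sharing one offset form C(n, k) co-occurring phrases of length k.
    return sum(comb(n, k) for n in votes.values()), votes
```

Words that match but land in different offset cells add nothing to the score, so an image with the right words in the wrong layout is ranked lower than under BoV.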
Indoor Location Recognition - Experimental Setup
• Our database covers two teaching buildings on our campus
  ◦ 118 para-catadioptric images
  ◦ 451 perspective images are used to query the database
• The environment suffers from high perceptual aliasing
Query image
?
Retrieval Results
[Figure panels: BoV, top 1; re-ranking, top 5; GVP, top 1]
• cylSIFT takes full advantage of its matching capabilities
  ◦ Avoiding interpolation yields more distinctive descriptors
• 10% improvement in localization success compared with a state-of-the-art approach (Cylinder + GVP) [Chen11]
• Indoor recognition systems benefit from the use of GVP
  ◦ Robustness against perceptual aliasing (spatial layout matters)
Take home messages
• Interpolation artifacts affect image retrieval
  ◦ cylSIFT offers better retrieval performance in hybrid imaging systems at a marginal computational cost compared with the standard SIFT algorithm
• cylSIFT can be useful for applications other than localization
  ◦ Hybrid fundamental matrix estimation [Puig08]
• First work that uses a hybrid imaging system for image retrieval without the need for rectification
Thanks for coming
Questions?
Code and dataset releases:
http://arthronav.isr.uc.pt/~mlourenco/OmniSearch/
