### Stereo

```776 Computer
Vision
Jan-Michael Frahm, Enrique Dunn
Spring 2012
From Previous Lecture
•
•
•
•
Homographies
Fundamental matrix
Normalized 8-point Algorithm
Essential Matrix
Plane Homography for Calibrated Cameras
• In the calibrated case
o Two cameras P=K[I |0] and P’ = K’[R | t]
o A plane π=(nT,d) T
•
The homography is given by x’=Hx
H = K’(R – tnT/d)K-1
• For the plane at infinity
H = K’RK-1
The Fundamental Matrix F
I1  Fm0
m I 0
T
1 1
m1T Fm0  0
F = [e]xH = Fundamental Matrix
P0
L
m0
M
M
Epipole
l1
e1
m1
e1T F  0
P1
Hm0
The eight-point algorithm
x = (u, v, 1)T, x’ = (u’, v’, 1)T
Minimize:
N
T
2

(
x
F
x
)
 i i
i 1
under the constraint
F33 = 1
Epipolar constraint: Calibrated case
X
x’
x
x  [t  ( R x)]  0
xT E x  0 with E  [t ]R
Essential Matrix
(Longuet-Higgins, 1981)
The vectors x, t, and Rx’ are coplanar
slide: S. Lazebnik
Epipolar constraint: Calibrated case
X
x’
x
x  [t  ( R x)]  0
xT E x  0 with E  [t ]R
Cubic constraint
det(E )  0, EE T E 
1
trace(EE T )E  0
2
The vectors x, t, and Rx’ are coplanar
Essential Matrix
E  K FK '
T
slide: S. Lazebnik
Today: Binocular stereo
• Given a calibrated binocular stereo pair, fuse it to
produce a depth image
Where does the depth information come from?
Binocular stereo
• Given a calibrated binocular stereo pair, fuse it to
produce a depth image
o Humans can do it
Stereograms: Invented by Sir Charles Wheatstone, 1838
Binocular stereo
• Given a calibrated binocular stereo pair, fuse it to
produce a depth image
o Humans can do it
Autostereograms: www.magiceye.com
Binocular stereo
• Given a calibrated binocular stereo pair, fuse it to
produce a depth image
o Humans can do it
Autostereograms: www.magiceye.com
Real-time stereo
Nomad robot searches for meteorites in Antartica
http://www.frc.ri.cmu.edu/projects/meteorobot/index.html
• Used for robot navigation (and other tasks)
o Software-based real-time stereo techniques
slide: R. Szeliski
Stereo image pair
slide: R. Szeliski
Anaglyphs
http://www.rainbowsymph
ony.com/freestuff.html
(Wikipedia for images)
Public Library, Stereoscopic Looking Room, Chicago, by Phillips, 1923
slide: R. Szeliski
Stereo: epipolar geometry
• Match features along epipolar lines
epipolar
line
epipolar plane
viewing ray
slide: R. Szeliski
Simplest Case: Parallel images
• Image planes of cameras
are parallel to each other
and to the baseline
• Camera centers are at
same height
• Focal lengths are the
same
slide: S. Lazebnik
Simplest Case: Parallel images
• Image planes of cameras
are parallel to each other
and to the baseline
• Camera centers are at
same height
• Focal lengths are the
same
• Then, epipolar lines fall
along the horizontal scan
lines of the images
slide: S. Lazebnik
Essential matrix for parallel images
Epipolar constraint:
x E x  0, E  [t ]R
T
R=I
t = (T, 0, 0)
x
x’
t
 0

[a ]  a z
 a y

0 0
E  [t ]R  0 0
0 T
 az
0
ax
ay 

 ax 
0 
0 
 T 
0 
Essential matrix for parallel images
Epipolar constraint:
x E x  0, E  [t ]R
T
R=I
t = (T, 0, 0)
x
x’
t
0 0
u v 10 0
0 T
0  u  
 

 T   v   0
0  1 
0 0
E  [t ]R  0 0
0 T
 0 


u v 1  T   0
 Tv 


The y-coordinates of corresponding points are the same!
0 
 T 
0 
Tv  Tv
Depth from disparity
X
z
x
x’
f
O
f
Baseline
B
O’
B f
disparity  x  x 
z
Disparity is inversely proportional to depth!
Depth Sampling
Depth sampling for integer pixel disparity
Quadratic precision loss with depth!
Depth Sampling
Depth sampling for wider baseline
Depth Sampling
Depth sampling is in O(resolution6)
Stereo: epipolar geometry
• for two images (or images with collinear camera
centers), can find epipolar lines
• epipolar lines are the projection of the pencil of
planes passing through the centers
• Rectification: warping the input images
(perspective transformation) so that epipolar lines
are horizontal
slide: R. Szeliski
Rectification
• Project each image onto same plane, which is
parallel to the epipole
• Resample lines (and shear/stretch) to place lines in
correspondence, and minimize distortion
• [Loop and Zhang, CVPR’99]
slide: R. Szeliski
Rectification
slide: R. Szeliski
Rectification
GOOD!
slide: R. Szeliski
Problem: Rectification for forward moving cameras
• Required image can become very large (infinitely
large) when the epipole is in the image
• Alternative rectifications are available using
epipolar lines directly in the images
o Pollefeys et al. 1999, “A simple and efficient method for general motion”,
ICCV
Your basic stereo algorithm
For each epipolar line
For each pixel in the left image
• compare with every pixel on same epipolar line in right image
• pick pixel with minimum match cost
Improvement: match windows
•
This should look familar...
slide: R. Szeliski
Finding correspondences
• apply feature matching criterion (e.g.,
correlation or Lucas-Kanade) at all pixels
simultaneously
• search only over epipolar lines (many fewer
candidate positions)
slide: R. Szeliski
Correspondence search
Left
Right
scanline
Matching cost
disparity
• Slide a window along the right scanline and
compare contents of that window with the
reference window in the left image
• Matching cost: SSD or normalized correlation
slide: S. Lazebnik
Correspondence search
Left
Right
scanline
SSD
slide: S. Lazebnik
Correspondence search
Left
Right
scanline
Norm. corr
slide: S. Lazebnik
Neighborhood size
• Smaller neighborhood: more details
• Larger neighborhood: fewer isolated mistakes
•
w=3
w = 20
slide: R. Szeliski
Matching criteria
•
•
•
•
•
•
•
•
Raw pixel values (correlation)
Band-pass filtered images [Jones & Malik 92]
“Corner” like features [Zhang, …]
Edges [many people…]
Gradients [Seitz 89; Scharstein 94]
Rank statistics [Zabih & Woodfill 94]
Intervals [Birchfield and Tomasi 96]
Overview of matching metrics and their
performance:
o H. Hirschmüller and D. Scharstein, “Evaluation of Stereo Matching Costs on
Images with Radiometric Differences”, PAMI 2008
slide: R. Szeliski
• Boundary Preserving
• More Costly
Failures of correspondence search
Textureless surfaces
Occlusions, repetition
Non-Lambertian surfaces, specularities
slide: S. Lazebnik
Stereo: certainty modeling
• Compute certainty map from correlations
•
input
depth map
certainty map
slide: R. Szeliski
Results with window search
Data
Window-based matching
Ground truth
slide: S. Lazebnik
Better methods exist...
Graph cuts
Ground truth
Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy
Minimization via Graph Cuts, PAMI 2001
For the latest and greatest: http://www.middlebury.edu/stereo/
slide: S. Lazebnik
How can we improve window-based matching?
• The similarity constraint is local (each reference
window is matched independently)
• Need to enforce non-local correspondence
constraints
slide: S. Lazebnik
Non-local constraints
• Uniqueness
o For any point in one image, there should be at most one matching
point in the other image
slide: S. Lazebnik
Non-local constraints
• Uniqueness
o For any point in one image, there should be at most one matching point
in the other image
• Ordering
o Corresponding points should be in the same order in both views
slide: S. Lazebnik
Non-local constraints
• Uniqueness
o For any point in one image, there should be at most one matching point
in the other image
• Ordering
o Corresponding points should be in the same order in both views
Ordering constraint doesn’t hold
slide: S. Lazebnik
Non-local constraints
• Uniqueness
o For any point in one image, there should be at most one matching point
in the other image
• Ordering
o Corresponding points should be in the same order in both views
• Smoothness
o We expect disparity values to change slowly (for the most part)
slide: S. Lazebnik
Scanline stereo
• Try to coherently match pixels on the entire scanline
• Different scanlines are still optimized independently
Left image
Right image
slide: S. Lazebnik
“Shortest paths” for scan-line stereo
Left image
I
Right image
I
S left
Right
occlusion
Coccl
Left
occlusion
q
t
s
p
Sright
Coccl
Ccorr
Can be implemented with dynamic programming
Ohta & Kanade ’85, Cox et al. ‘96
Slide credit: Y. Boykov
Coherent stereo on 2D grid
• Scanline stereo generates streaking artifacts
• Can’t use dynamic programming to find spatially
coherent disparities/ correspondences on a 2D
grid
slide: S. Lazebnik
Stereo matching as energy minimization
I2
I1
W1(i )
D
W2(i+D(i ))
E ( D)   W1 (i )  W2 (i  D(i ))   
2
i
D(i )
  D(i)  D( j )
neighbors i , j
data term
smoothness term
• Energy functions of this form can be minimized
using graph cuts
Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization
via Graph Cuts, PAMI 2001
slide: S. Lazebnik
Active stereo with structured light
• Project “structured” light patterns onto the object
o Simplifies the correspondence problem
o Allows us to use only one camera
camera
projector
L. Zhang, B. Curless, and S. M. Seitz. Rapid Shape Acquisition Using Color Structured
Light and Multi-pass Dynamic Programming. 3DPVT 2002
slide: S. Lazebnik
Active stereo with structured light
L. Zhang, B. Curless, and S. M. Seitz. Rapid Shape Acquisition Using Color
Structured Light and Multi-pass Dynamic Programming. 3DPVT 2002
slide: S. Lazebnik
Active stereo with structured light
http://en.wikipedia.org/wiki/Structured-light_3D_scanner
slide: S. Lazebnik
Kinect: Structured infrared light
http://bbzippo.wordpress.com/2010/11/28/kinect-in-infrared/
slide: S. Lazebnik
Laser scanning
Digital Michelangelo Project
Levoy et al.
http://graphics.stanford.edu/projects/mich/
• Optical triangulation
o Project a single stripe of laser light
o Scan it across the surface of the object
o This is a very precise version of structured light scanning
Source: S. Seitz
Laser scanned models
The Digital Michelangelo Project, Levoy et al.
Source: S. Seitz
Laser scanned models
The Digital Michelangelo Project, Levoy et al.
Source: S. Seitz
Laser scanned models
The Digital Michelangelo Project, Levoy et al.
Source: S. Seitz
Laser scanned models
The Digital Michelangelo Project, Levoy et al.
Source: S. Seitz
Laser scanned models
1.0 mm resolution (56 million triangles)
The Digital Michelangelo Project, Levoy et al.
Source: S. Seitz
Aligning range images
• A single range scan is not sufficient to describe a
complex surface
• Need techniques to register multiple range images
B. Curless and M. Levoy, A Volumetric Method for Building Complex Models from Range
Images, SIGGRAPH 1996
Aligning range images
• A single range scan is not sufficient to describe a
complex surface
• Need techniques to register multiple range images
• … which brings us to multi-view stereo
```