### Stereo

```776 Computer
Vision
Jan-Michael Frahm, Enrique Dunn
Spring 2012
From Previous Lecture
•
•
•
•
Homographies
Fundamental matrix
Normalized 8-point Algorithm
Essential Matrix
Plane Homography for Calibrated Cameras
• In the calibrated case
o Two cameras P=K[I |0] and P’ = K’[R | t]
o A plane π=(nT,d) T
•
The homography is given by x’=Hx
H = K’(R – tnT/d)K-1
• For the plane at infinity
H = K’RK-1
The Fundamental Matrix F
I1  Fm0
m I 0
T
1 1
m1T Fm0  0
F = [e]xH = Fundamental Matrix
P0
L
m0
M
M
Epipole
l1
e1
m1
e1T F  0
P1
Hm0
The eight-point algorithm
x = (u, v, 1)T, x’ = (u’, v’, 1)T
Minimize:
N
T
2

(
x
F
x
)
 i i
i 1
under the constraint
F33 = 1
Epipolar constraint: Calibrated case
X
x’
x
x  [t  ( R x)]  0
xT E x  0 with E  [t ]R
Essential Matrix
(Longuet-Higgins, 1981)
The vectors x, t, and Rx’ are coplanar
slide: S. Lazebnik
Epipolar constraint: Calibrated case
X
x’
x
x  [t  ( R x)]  0
xT E x  0 with E  [t ]R
Cubic constraint
det(E )  0, EE T E 
1
trace(EE T )E  0
2
The vectors x, t, and Rx’ are coplanar
Essential Matrix
E  K FK '
T
slide: S. Lazebnik
Today: Binocular stereo
• Given a calibrated binocular stereo pair, fuse it to
produce a depth image
Where does the depth information come from?
Binocular stereo
• Given a calibrated binocular stereo pair, fuse it to
produce a depth image
o Humans can do it
Stereograms: Invented by Sir Charles Wheatstone, 1838
Binocular stereo
• Given a calibrated binocular stereo pair, fuse it to
produce a depth image
o Humans can do it
Autostereograms: www.magiceye.com
Binocular stereo
• Given a calibrated binocular stereo pair, fuse it to
produce a depth image
o Humans can do it
Autostereograms: www.magiceye.com
Real-time stereo
Nomad robot searches for meteorites in Antartica
http://www.frc.ri.cmu.edu/projects/meteorobot/index.html
o Software-based real-time stereo techniques
slide: R. Szeliski
Stereo image pair
slide: R. Szeliski
Anaglyphs
http://www.rainbowsymph
ony.com/freestuff.html
(Wikipedia for images)
Public Library, Stereoscopic Looking Room, Chicago, by Phillips, 1923
slide: R. Szeliski
Stereo: epipolar geometry
• Match features along epipolar lines
epipolar
line
epipolar plane
viewing ray
slide: R. Szeliski
Simplest Case: Parallel images
• Image planes of cameras
are parallel to each other
and to the baseline
• Camera centers are at
same height
• Focal lengths are the
same
slide: S. Lazebnik
Simplest Case: Parallel images
• Image planes of cameras
are parallel to each other
and to the baseline
• Camera centers are at
same height
• Focal lengths are the
same
• Then, epipolar lines fall
along the horizontal scan
lines of the images
slide: S. Lazebnik
Essential matrix for parallel images
Epipolar constraint:
x E x  0, E  [t ]R
T
R=I
t = (T, 0, 0)
x
x’
t
 0

[a ]  a z
 a y

0 0
E  [t ]R  0 0
0 T
 az
0
ax
ay 

 ax 
0 
0 
 T 
0 
Essential matrix for parallel images
Epipolar constraint:
x E x  0, E  [t ]R
T
R=I
t = (T, 0, 0)
x
x’
t
0 0
u v 10 0
0 T
0  u  
 

 T   v   0
0  1 
0 0
E  [t ]R  0 0
0 T
 0 


u v 1  T   0
 Tv 


The y-coordinates of corresponding points are the same!
0 
 T 
0 
Tv  Tv
Depth from disparity
X
z
x
x’
f
O
f
Baseline
B
O’
B f
disparity  x  x 
z
Disparity is inversely proportional to depth!
Depth Sampling
Depth sampling for integer pixel disparity
Depth Sampling
Depth sampling for wider baseline
Depth Sampling
Depth sampling is in O(resolution6)
Stereo: epipolar geometry
• for two images (or images with collinear camera
centers), can find epipolar lines
• epipolar lines are the projection of the pencil of
planes passing through the centers
• Rectification: warping the input images
(perspective transformation) so that epipolar lines
are horizontal
slide: R. Szeliski
Rectification
• Project each image onto same plane, which is
parallel to the epipole
• Resample lines (and shear/stretch) to place lines in
correspondence, and minimize distortion
• [Loop and Zhang, CVPR’99]
slide: R. Szeliski
Rectification
slide: R. Szeliski
Rectification
GOOD!
slide: R. Szeliski
Problem: Rectification for forward moving cameras
• Required image can become very large (infinitely
large) when the epipole is in the image
• Alternative rectifications are available using
epipolar lines directly in the images
o Pollefeys et al. 1999, “A simple and efficient method for general motion”,
ICCV
For each epipolar line
For each pixel in the left image
• compare with every pixel on same epipolar line in right image
• pick pixel with minimum match cost
Improvement: match windows
•
This should look familar...
slide: R. Szeliski
Finding correspondences
• apply feature matching criterion (e.g.,
correlation or Lucas-Kanade) at all pixels
simultaneously
• search only over epipolar lines (many fewer
candidate positions)
slide: R. Szeliski
Correspondence search
Left
Right
scanline
Matching cost
disparity
• Slide a window along the right scanline and
compare contents of that window with the
reference window in the left image
• Matching cost: SSD or normalized correlation
slide: S. Lazebnik
Correspondence search
Left
Right
scanline
SSD
slide: S. Lazebnik
Correspondence search
Left
Right
scanline
Norm. corr
slide: S. Lazebnik
Neighborhood size
• Smaller neighborhood: more details
• Larger neighborhood: fewer isolated mistakes
•
w=3
w = 20
slide: R. Szeliski
Matching criteria
•
•
•
•
•
•
•
•
Raw pixel values (correlation)
Band-pass filtered images [Jones & Malik 92]
“Corner” like features [Zhang, …]
Edges [many people…]
Rank statistics [Zabih & Woodfill 94]
Intervals [Birchfield and Tomasi 96]
Overview of matching metrics and their
performance:
o H. Hirschmüller and D. Scharstein, “Evaluation of Stereo Matching Costs on
Images with Radiometric Differences”, PAMI 2008
slide: R. Szeliski
• Boundary Preserving
• More Costly
Failures of correspondence search
Textureless surfaces
Occlusions, repetition
Non-Lambertian surfaces, specularities
slide: S. Lazebnik
Stereo: certainty modeling
• Compute certainty map from correlations
•
input
depth map
certainty map
slide: R. Szeliski
Results with window search
Data
Window-based matching
Ground truth
slide: S. Lazebnik
Better methods exist...
Graph cuts
Ground truth
Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy
Minimization via Graph Cuts, PAMI 2001
For the latest and greatest: http://www.middlebury.edu/stereo/
slide: S. Lazebnik
How can we improve window-based matching?
• The similarity constraint is local (each reference
window is matched independently)
• Need to enforce non-local correspondence
constraints
slide: S. Lazebnik
Non-local constraints
• Uniqueness
o For any point in one image, there should be at most one matching
point in the other image
slide: S. Lazebnik
Non-local constraints
• Uniqueness
o For any point in one image, there should be at most one matching point
in the other image
• Ordering
o Corresponding points should be in the same order in both views
slide: S. Lazebnik
Non-local constraints
• Uniqueness
o For any point in one image, there should be at most one matching point
in the other image
• Ordering
o Corresponding points should be in the same order in both views
Ordering constraint doesn’t hold
slide: S. Lazebnik
Non-local constraints
• Uniqueness
o For any point in one image, there should be at most one matching point
in the other image
• Ordering
o Corresponding points should be in the same order in both views
• Smoothness
o We expect disparity values to change slowly (for the most part)
slide: S. Lazebnik
Scanline stereo
• Try to coherently match pixels on the entire scanline
• Different scanlines are still optimized independently
Left image
Right image
slide: S. Lazebnik
“Shortest paths” for scan-line stereo
Left image
I
Right image
I
S left
Right
occlusion
Coccl
Left
occlusion
q
t
s
p
Sright
Coccl
Ccorr
Can be implemented with dynamic programming
Ohta & Kanade ’85, Cox et al. ‘96
Slide credit: Y. Boykov
Coherent stereo on 2D grid
• Scanline stereo generates streaking artifacts
• Can’t use dynamic programming to find spatially
coherent disparities/ correspondences on a 2D
grid
slide: S. Lazebnik
Stereo matching as energy minimization
I2
I1
W1(i )
D
W2(i+D(i ))
E ( D)   W1 (i )  W2 (i  D(i ))   
2
i
D(i )
  D(i)  D( j )
neighbors i , j
data term
smoothness term
• Energy functions of this form can be minimized
using graph cuts
Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization
via Graph Cuts, PAMI 2001
slide: S. Lazebnik
Active stereo with structured light
• Project “structured” light patterns onto the object
o Simplifies the correspondence problem
o Allows us to use only one camera
camera
projector
L. Zhang, B. Curless, and S. M. Seitz. Rapid Shape Acquisition Using Color Structured
Light and Multi-pass Dynamic Programming. 3DPVT 2002
slide: S. Lazebnik
Active stereo with structured light
L. Zhang, B. Curless, and S. M. Seitz. Rapid Shape Acquisition Using Color
Structured Light and Multi-pass Dynamic Programming. 3DPVT 2002
slide: S. Lazebnik
Active stereo with structured light
http://en.wikipedia.org/wiki/Structured-light_3D_scanner
slide: S. Lazebnik
Kinect: Structured infrared light
http://bbzippo.wordpress.com/2010/11/28/kinect-in-infrared/
slide: S. Lazebnik
Laser scanning
Digital Michelangelo Project
Levoy et al.
http://graphics.stanford.edu/projects/mich/
• Optical triangulation
o Project a single stripe of laser light
o Scan it across the surface of the object
o This is a very precise version of structured light scanning
Source: S. Seitz
Laser scanned models
The Digital Michelangelo Project, Levoy et al.
Source: S. Seitz
Laser scanned models
The Digital Michelangelo Project, Levoy et al.
Source: S. Seitz
Laser scanned models
The Digital Michelangelo Project, Levoy et al.
Source: S. Seitz
Laser scanned models
The Digital Michelangelo Project, Levoy et al.
Source: S. Seitz
Laser scanned models
1.0 mm resolution (56 million triangles)
The Digital Michelangelo Project, Levoy et al.
Source: S. Seitz
Aligning range images
• A single range scan is not sufficient to describe a
complex surface
• Need techniques to register multiple range images
B. Curless and M. Levoy, A Volumetric Method for Building Complex Models from Range
Images, SIGGRAPH 1996
Aligning range images
• A single range scan is not sufficient to describe a
complex surface
• Need techniques to register multiple range images
• … which brings us to multi-view stereo
```