Annotating RGBD Images of
Indoor Scenes
Yu-Shiang Wong and Hung-Kuo Chu
National Tsing Hua University
CGV LAB
SA2014.SIGGRAPH.ORG
SPONSORED BY
Outline
Motivation
Related Works
Annotation Procedure
User Study
Motivation
Scene understanding is a popular topic.
RGBD datasets with high-quality semantic
annotations are valuable for:
Learning
Evaluations
Two fundamental problems
• Data Acquisition and Annotation
RGBD Indoor Datasets
Cornell-RGBD (2011-12): 24 labeled office scenes
NYU2 (2011-12): 1449 labeled indoor scenes
– 408,000+ RGBD video frames (unlabeled)
SUN3D (2013): 415+ fully captured rooms
– 10+ rooms are fully labeled; annotations are propagated through
the video.
UZH & ETH 3D Scanned Point Datasets (2014):
42 fully captured rooms
– high-quality point clouds (unlabeled)
Object Detection and Classification from Large-Scale Cluttered
Indoor Scans (EG 2014)
…
Motivation
Data annotation is a painstaking and time-consuming task
OMG! So much data needs
to be annotated
Interactive tool for annotating RGBD indoor
scenes
We need a
good tool!
Motivation
Data annotation is a tedious and time-consuming task
Interactive tool for annotating RGBD indoor
scenes
Leverage both the cognitive ability of humans
and the computational power of machines.
RELATED WORKS
Image Annotation
LabelMe: a database and web-based tool for
image annotation. Russell et al., IJCV 2007
SUN3D: A Database of Big Spaces
Reconstructed using SfM and Object Labels.
Xiao et al., ICCV 2013
Cheaper by the Dozen: Group Annotation of
3D Data. Boyko et al., UIST 2014
Scene Understanding using RGBD Data
Image-based
Indoor segmentation and support inference from
RGBD images. Silberman et al., ECCV 2012
RGB-(D) scene labeling: Features and algorithms.
Ren et al., CVPR 2012
Proxy-based
Imagining the unseen: Stability-based cuboid
arrangements for understanding cluttered indoor
scenes. Shao et al., SIGGRAPH Asia 2014
PanoContext: A whole-room 3D context model for
panoramic scene understanding. Zhang et al.,
ECCV 2014
Holistic scene understanding for 3D object
detection with RGBD cameras. Lin et al., ICCV 2013
3D-based reasoning with blocks, support, and
stability. Xiao et al., CVPR 2013
Annotation Procedure: Overview
Input : RGB-D image
Output: Segmentation, label, box proxy, support structure
[Figure: Input → Machine + User → Output]
Annotation Procedure: Overview
Pipeline: Input RGB-D image → Extract room → Draw scribbles → Estimate boxes → Annotate label and structure → Output annotated 3D structure
The steps are divided between a machine session and a user session.
Annotation Procedure:
Preprocessing
Estimate normals.
Perform over-segmentation using
both color and normal maps.
• Efficient graph-based image
segmentation [Felzenszwalb et al. 2004]
• The coarser segmentation is used for
room estimation.
• The finer segmentation is used for user-assisted object segmentation.
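A minimal sketch of graph-based over-segmentation in the spirit of [Felzenszwalb et al. 2004], assuming edge weights that add a color difference and a normal difference. The function names, the `normal_weight` factor, the values of `k`, and the toy 4x2 image are illustrative assumptions, not the settings used in the talk. A larger `k` gives the coarser segmentation; a smaller `k` gives the finer one.

```python
from math import sqrt

def grid_edges(colors, normals, width, height, normal_weight=1.0):
    """4-connected grid edges weighted by color difference plus normal difference."""
    def dist(u, v):
        return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

    edges = []
    for y in range(height):
        for x in range(width):
            i = y * width + x
            for dx, dy in ((1, 0), (0, 1)):  # right and down neighbors
                nx, ny = x + dx, y + dy
                if nx < width and ny < height:
                    j = ny * width + nx
                    w = dist(colors[i], colors[j]) + normal_weight * dist(normals[i], normals[j])
                    edges.append((w, i, j))
    return edges

def segment_graph(num_nodes, edges, k):
    """Greedy merging over edges sorted by weight; returns one segment id per node."""
    parent = list(range(num_nodes))
    size = [1] * num_nodes
    internal = [0.0] * num_nodes  # largest edge weight merged into each component

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for w, a, b in sorted(edges):
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        # Merge only if the edge is no heavier than both components' thresholds.
        if w <= min(internal[ra] + k / size[ra], internal[rb] + k / size[rb]):
            parent[rb] = ra
            size[ra] += size[rb]
            internal[ra] = w  # edges arrive in ascending order, so w is the maximum
    return [find(i) for i in range(num_nodes)]

# Toy image (assumed data): two halves with similar color but different normals.
width, height = 4, 2
colors = [(0.9, 0.1, 0.1) if x < 2 else (0.8, 0.1, 0.1)
          for y in range(height) for x in range(width)]
normals = [(0.0, 1.0, 0.0) if x < 2 else (0.0, 0.0, 1.0)
           for y in range(height) for x in range(width)]

edges = grid_edges(colors, normals, width, height)
fine = segment_graph(width * height, edges, k=1.0)     # finer segmentation
coarse = segment_graph(width * height, edges, k=10.0)  # coarser segmentation
color_only = segment_graph(
    width * height, grid_edges(colors, normals, width, height, normal_weight=0.0), k=1.0)
```

In this toy case the two halves stay separate in the finer result only because the normal term is included; with color alone they merge.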
Annotation Procedure:
Extracting Room Layout
Perform RANSAC plane fitting on each segment.
Roughly align the point cloud using gravity information.
Find the floor segment by:
$E_i = (1 - \langle n_i, y_e \rangle) + \text{inverse ratio of segment size} + \text{normalized } Y \text{ coordinate}$
Estimate wall candidates like:
[Equation not recoverable from the slide: an expression of the form $\langle \cdot, \cdot \rangle$]
* If gravity info is not available:
[Equation not recoverable from the slide] [Silberman 2012]
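The floor score can be sketched as follows. This is only an illustration under stated assumptions: `y_e` is taken to be the unit up vector after gravity alignment, the "inverse ratio of segment size" is interpreted as `1 - size / total_size`, the height term is the segment's mean Y rescaled to [0, 1], and the segment with the lowest energy is picked as the floor. The dictionary keys and the sample segments are hypothetical.

```python
def floor_energies(segments, up=(0.0, 1.0, 0.0)):
    """Return one energy per segment; lower means more floor-like (assumed convention)."""
    total_size = sum(s["size"] for s in segments)
    heights = [s["mean_y"] for s in segments]
    y_min, y_max = min(heights), max(heights)
    span = (y_max - y_min) or 1.0  # avoid division by zero when all heights match

    energies = []
    for s in segments:
        # 1 - <n_i, y_e>: zero when the segment normal points straight up.
        alignment = 1.0 - sum(a * b for a, b in zip(s["normal"], up))
        # Assumed reading of "inverse ratio of segment size": small for large segments.
        size_term = 1.0 - s["size"] / total_size
        # Normalized Y coordinate: zero for the lowest segment.
        height_term = (s["mean_y"] - y_min) / span
        energies.append(alignment + size_term + height_term)
    return energies

# Hypothetical segments: a floor, a wall, and a table top.
segments = [
    {"normal": (0.0, 1.0, 0.0), "size": 5000, "mean_y": 0.0},
    {"normal": (0.0, 0.0, 1.0), "size": 4000, "mean_y": 1.2},
    {"normal": (0.0, 1.0, 0.0), "size": 1000, "mean_y": 0.7},
]
energies = floor_energies(segments)
floor_index = min(range(len(segments)), key=lambda i: energies[i])
```

The table top shares the floor's normal, so the size and height terms are what separate the two in this example.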
Annotation Procedure:
User Scribbles
Check the floor and wall
hypotheses
• If the hypotheses fail, the user clicks
the segments to identify the floor and
walls.
The user draws scribbles to extract
the object segments
Annotation Procedure:
Estimating Boxes
• Box orientation = finding an orthogonal
basis in 3D (3 unknown directions)
• We assume one direction of the box is parallel to
the floor normal (leaving 1 unknown direction; the
last follows by cross product)
Box Fitting Method:
1. Filter the point cloud by KNN
2. Project the box's point cloud onto the floor plane
3. Fit a line in 2D to extract the major
direction
4. Use the cross product to extract the
last direction.
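Steps 2 to 4 of the box fitting method can be sketched as below. This is a hedged illustration, not the authors' code: the 2D line fit is done here with the principal axis of the projected points (a total least squares fit), which is an assumption, and the KNN filtering of step 1 is omitted. All function names and the synthetic points are hypothetical.

```python
from math import atan2, cos, sin, sqrt, radians

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(a):
    length = sqrt(dot(a, a))
    return tuple(x / length for x in a)

def fit_box_axes(points, floor_normal):
    """Return (up, major, last): an orthonormal basis with `up` along the floor normal."""
    n = normalize(floor_normal)
    # Build an orthonormal basis (u, v) of the floor plane.
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 0.0, 1.0)
    u = normalize(cross(n, helper))
    v = cross(n, u)
    # Step 2: project the points onto the floor plane (2D coordinates).
    coords = [(dot(p, u), dot(p, v)) for p in points]
    mean_u = sum(c[0] for c in coords) / len(coords)
    mean_v = sum(c[1] for c in coords) / len(coords)
    suu = sum((c[0] - mean_u) ** 2 for c in coords)
    svv = sum((c[1] - mean_v) ** 2 for c in coords)
    suv = sum((c[0] - mean_u) * (c[1] - mean_v) for c in coords)
    # Step 3: the major direction is the principal axis of the 2D scatter.
    theta = 0.5 * atan2(2.0 * suv, suu - svv)
    major = tuple(cos(theta) * a + sin(theta) * b for a, b in zip(u, v))
    # Step 4: the last direction follows from a cross product.
    last = cross(n, major)
    return n, major, last

def box_sizes(points, axes):
    """Extent of the points along each axis."""
    sizes = []
    for axis in axes:
        values = [dot(p, axis) for p in points]
        sizes.append(max(values) - min(values))
    return sizes

# Synthetic box (assumed data): 4 x 1 x 1, rotated 30 degrees about the Y axis.
angle = radians(30.0)
long_axis = (cos(angle), 0.0, sin(angle))
short_axis = (-sin(angle), 0.0, cos(angle))
points = [
    (a * long_axis[0] + b * short_axis[0], h, a * long_axis[2] + b * short_axis[2])
    for a in (-2.0, -1.0, 0.0, 1.0, 2.0)
    for b in (-0.5, 0.0, 0.5)
    for h in (0.0, 1.0)
]
axes = fit_box_axes(points, floor_normal=(0.0, 1.0, 0.0))
sizes = box_sizes(points, axes)
```

The recovered major direction matches the box's long axis up to sign; the sign is arbitrary for a box proxy because the extents are measured along the axis in both directions.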
Annotation Procedure:
Annotate Label and 3D Structure
User Tasks :
1. Type in the object label
2. Drag an arrow to specify
the support relationships
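One way to store the result of these two user tasks is a record per object with a label, a box proxy, and a link to its supporting object. The sketch below is a hypothetical data structure for illustration; the field names, the `supported_by` convention, and the sample scene are assumptions, not the tool's actual format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

@dataclass
class AnnotatedObject:
    label: str                               # typed in by the user
    box_center: Tuple[float, float, float]   # box proxy position
    box_size: Tuple[float, float, float]     # box proxy extents
    supported_by: Optional[str] = None       # key of the supporting object, None for the root

def support_chain(objects: Dict[str, AnnotatedObject], key: str) -> List[str]:
    """Follow support links from an object down to the root of the structure."""
    chain = [key]
    while objects[chain[-1]].supported_by is not None:
        chain.append(objects[chain[-1]].supported_by)
    return chain

# Hypothetical scene: a pillow on a bed on the floor.
scene = {
    "floor": AnnotatedObject("floor", (0.0, 0.0, 0.0), (4.0, 0.0, 4.0)),
    "bed": AnnotatedObject("bed", (1.0, 0.3, 1.0), (2.0, 0.6, 1.5), supported_by="floor"),
    "pillow": AnnotatedObject("pillow", (1.0, 0.7, 0.4), (0.5, 0.2, 0.3), supported_by="bed"),
}
chain = support_chain(scene, "pillow")
```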
Annotation Procedure:
Box Quality Refinement (Optional)
User Tasks :
1. Adjust the orientation of
boxes
2. Adjust the size of boxes
USER STUDY
User Study : Settings
• Select 50 scenes across 7 scene classes from
NYU2
• Recruit 2 users
• Each user is asked to annotate the 50 scenes
• Target classes: 24 merged object classes
• List: bed, chair, cabinet, dresser, television, night
stand, table, sofa, picture, pillow, …
• Each scene contains 3-6 objects
User Study : Results
• System processing time (computing normals, fitting planes and boxes): < 3 sec [in C++]
• Annotation time (50 scenes):
Task Type          Mean time per box   Mean time per scene   Total time
Check Room         --                  1.6 sec               1.3 min
Draw Scribbles     16 sec              1 min                 51 min
Type Labels        4 sec               17 sec                13 min
Drag Supports      2 sec               9 sec                 7.5 min
Boxes Adjustment   11 sec              35 sec                29 min
(Accuracy = 64%)
TOTAL = 101 min
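As a quick cross-check of the table, the per-task totals and per-scene means can be summed directly. The numbers below are copied from the table; the variable names are only for illustration.

```python
# Total time per task in minutes, as reported in the table.
total_minutes = {
    "Check Room": 1.3,
    "Draw Scribbles": 51.0,
    "Type Labels": 13.0,
    "Drag Supports": 7.5,
    "Boxes Adjustment": 29.0,
}
# Mean time per scene in seconds, as reported in the table (1 min = 60 sec).
per_scene_seconds = {
    "Check Room": 1.6,
    "Draw Scribbles": 60.0,
    "Type Labels": 17.0,
    "Drag Supports": 9.0,
    "Boxes Adjustment": 35.0,
}

sum_total_minutes = sum(total_minutes.values())
sum_per_scene_seconds = sum(per_scene_seconds.values())
scribble_share = total_minutes["Draw Scribbles"] / sum_total_minutes

print(f"sum of task totals: {sum_total_minutes:.1f} min")
print(f"mean time per scene: {sum_per_scene_seconds:.1f} sec")
print(f"share spent drawing scribbles: {scribble_share:.0%}")
```

The task totals add up to roughly the reported 101 min (the small gap is consistent with rounding in the table), the mean works out to about two minutes per scene, and drawing scribbles accounts for about half of the time.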
Demo
Conclusion
An interactive system to facilitate annotating
RGBD indoor scenes.
Generates high-quality ground truth data with
rich annotations:
Object segments
Object labels
3D geometry
3D structure
Ongoing Work
The major bottlenecks lie in manual operations:
Drawing scribbles
Refining box proxies
Typing labels
Specifying structure
Incorporate inference algorithms and 3D structure
analysis to reduce the manual burden on the
user.
THANK YOU!