Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection
Joseph J. Lim (MIT), C. Lawrence Zitnick (MSR), Piotr Dollár (MSR)
Overview
Goal: learn and detect a local, contour-based representation for mid-level features.
Sketch Tokens:
• Local edge structures (e.g. straight lines, T-junctions, Y-junctions)
• Discovered from human-generated image sketches
We demonstrate our approach on both top-down and bottom-up tasks:
• State-of-the-art results on contour detection, while being 200x faster than methods of comparable accuracy
• Large improvements on object and pedestrian detection
Method
Defining Sketch Tokens
We are given a set of images, I, and the corresponding set of binary contour images, S.
Sketch Tokens are clusters of patches extracted from the binary contour images S.
- Each patch has a fixed size of 35x35 pixels, and its center pixel must lie on a labeled contour.
- 150 clusters are obtained by running K-means on Daisy descriptors computed on the binary patches.
[Figure: example Sketch Tokens]
Detecting Sketch Tokens
Given a set of Sketch Token classes, our goal is to detect them in color images.
Each color patch's ground-truth class is either one of the Sketch Token classes or the background class.
We use a random forest classifier with various features (e.g. CIE-LUV intensity, orientation, and self-similarity).
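The token definition and ground-truth labeling steps described under Method can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not the authors' MATLAB implementation: raw binary pixels stand in for Daisy descriptors, patches are 5x5 rather than 35x35, K is 2 rather than 150, K-means uses a deterministic farthest-point initialization, and contour pixels too close to the image border are skipped. All function and variable names are hypothetical.

```python
PATCH = 5   # toy patch size (assumption); the poster uses 35x35
K = 2       # toy number of token classes (assumption); the poster uses 150

def extract_patch(img, r, c, size=PATCH):
    """Flattened size x size patch centered at (r, c), or None near the border."""
    half = size // 2
    if r < half or c < half or r + half >= len(img) or c + half >= len(img[0]):
        return None
    return [img[i][j]
            for i in range(r - half, r + half + 1)
            for j in range(c - half, c + half + 1)]

def contour_patches(img):
    """Patches whose center pixel lies on a labeled contour (value 1)."""
    found = []
    for r, row in enumerate(img):
        for c, value in enumerate(row):
            if value == 1:
                patch = extract_patch(img, r, c)
                if patch is not None:
                    found.append(((r, c), patch))
    return found

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(patch, centers):
    """Index of the cluster center closest to the patch."""
    return min(range(len(centers)), key=lambda i: sqdist(patch, centers[i]))

def kmeans(points, k, iters=10):
    """Plain K-means on raw pixels (stand-in for Daisy descriptors)."""
    # Deterministic farthest-point initialization (assumption, for repeatability).
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sqdist(p, c) for c in centers))
        centers.append(list(far))
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        for i, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous center
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers

def label_map(img, centers):
    """Per-pixel ground truth: 0 = background, 1..K = Sketch Token class."""
    labels = [[0] * len(img[0]) for _ in img]
    for (r, c), patch in contour_patches(img):
        labels[r][c] = 1 + nearest(patch, centers)
    return labels

def line_image(size, horizontal):
    """Toy binary contour image containing one straight line through the middle."""
    mid = size // 2
    return [[1 if (r == mid if horizontal else c == mid) else 0
             for c in range(size)] for r in range(size)]

img_h = line_image(11, horizontal=True)
img_v = line_image(11, horizontal=False)
points = [p for img in (img_h, img_v) for _, p in contour_patches(img)]
tokens = kmeans(points, K)
labels_h = label_map(img_h, tokens)
labels_v = label_map(img_v, tokens)
print(labels_h[5][5], labels_v[5][5], labels_h[0][0])
```

On this toy input the horizontal and vertical line patches fall into two different token classes, and every off-contour pixel stays in the background class. In the method described on the poster, labels of this kind serve as the per-patch ground truth for the random forest classifier applied to color patches.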
Contour Detection (BSDS 500)
Method          ODS    OIS    AP     Speed
Human           0.80   0.80   -
Canny           0.60   0.64   0.58   1/15 s
gPb             0.73   0.76   0.73   240 s
SCG             0.74   0.76   0.77   280 s
Sketch tokens   0.73   0.75   0.78   1 s
Sketch Tokens run over 200x faster than gPb and SCG.
Object Detection on PASCAL2007
We used Sketch Token responses (150 Sketch Token + 1 background dimensions) computed on images as additional features for the deformable parts model detector.
Method   plane   bike    bird    boat    bottle  bus     car     cat     chair   cow
HOG      19.7    43.9    2.2     4.8     13.4    36.6    40.2    5.4     10.9    15.7
ST       17.8    41.1    4.8     5.7     11.1    31.9    33.8    5.1     10.8    16.1
HOG+ST   21.9    48.5    6.3     6.4     14.6    41.5    43.3    6.1     15.7    19.2
Method   table   dog     horse   moto    person  plant   sheep   sofa    train   tv
HOG      7.5     2.1     41.9    30.9    23.9    3.4     9.3     14.8    26.9    32.4
ST       7.4     3.1     32.9    27.0    20.9    4.6     8.6     10.4    18.9    26.3
HOG+ST   14.2    3.8     46.1    34.5    30.9    8.1     15.3    18.9    30.3    36.6
On average, HOG+ST improves over HOG by 3.8 AP.
INRIA Pedestrian Detection
In addition to the standard features used in Dollár et al.'s implementation, we added Sketch Token responses.
Method       # channels   miss rate
LUV+M+O      10           17.2%
ST           151          19.5%
ST+LUV+M+O   161          14.7%
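The summary figures quoted for the experiments can be re-derived from the tabulated values. The short check below recomputes the mean AP gain of HOG+ST over HOG on PASCAL 2007, the combined channel count for INRIA, and the runtime ratios on BSDS 500. Variable names are chosen here for illustration, and the runtimes are taken as listed in the Speed column.

```python
# Per-class AP on PASCAL 2007, in the order the 20 classes are listed.
hog = [19.7, 43.9, 2.2, 4.8, 13.4, 36.6, 40.2, 5.4, 10.9, 15.7,
       7.5, 2.1, 41.9, 30.9, 23.9, 3.4, 9.3, 14.8, 26.9, 32.4]
hog_st = [21.9, 48.5, 6.3, 6.4, 14.6, 41.5, 43.3, 6.1, 15.7, 19.2,
          14.2, 3.8, 46.1, 34.5, 30.9, 8.1, 15.3, 18.9, 30.3, 36.6]

# Mean AP gain from adding Sketch Token responses to HOG.
mean_gain = sum(hog_st) / len(hog_st) - sum(hog) / len(hog)

# Number of classes on which HOG+ST beats HOG alone.
classes_improved = sum(b > a for a, b in zip(hog, hog_st))

# INRIA: 150 Sketch Token channels + 1 background channel, plus 10 LUV+M+O channels.
st_channels = 150 + 1
combined_channels = st_channels + 10

# BSDS 500 runtimes in seconds, as listed in the Speed column.
speedup_vs_gpb = 240 / 1
speedup_vs_scg = 280 / 1

print(f"mean AP gain: {mean_gain:.1f}")
print(f"classes improved: {classes_improved} of {len(hog)}")
print(f"combined channels: {combined_channels}")
print(f"speedup: {speedup_vs_gpb:.0f}x vs gPb, {speedup_vs_scg:.0f}x vs SCG")
```

The mean gain comes out at 3.8 AP with an improvement on all 20 classes, the combined detector uses 161 channels, and the runtime ratios of 240x and 280x are consistent with the "200x faster" headline.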
Conclusion
MATLAB code is available on the website