### DengWH

```
Extended Sparse Linear Model for Face Recognition
Weihong Deng (邓伟洪)
Beijing University of Posts and Telecommunications (北京邮电大学)
Characteristics of Face Pattern
• Facial shapes are so similar, sometimes identical! (~100% face detection rate, kinship verification)
• A special object: easy for both detection and identification!
• Within-class variation is larger than between-class variation.
• Even though faces look similar, ~100% accuracy is achieved by linear classifiers on public databases (although not in real-world applications).
Learn linear structures in high-dimensional image space
• Challenges
  • Geometry: How can low-dimensional structures in high-dimensional data be described?
  • Statistics: Deal with real data that contain large intra-class variations, such as pose, illumination, and expression (PIE), or occlusion.
  • Learning: Handle insufficient data, i.e., the small (single) sample size problem.
  • Computation: Implement a real-time recognition system.
Linear Models Are Universal in (Visual) Signals
• JPEG: linear representation with predefined basis images
• Sparse coding: linear representation with learned basis images
Outline: Two Historical Linear Models and Their Extensions
• Principal Component Analysis (Eigenfaces): basis images
• Sparse Representation-based Classification (SRC): prototype images
y = Ax
Eigenfaces: A human psychophysics experiment
[Figure: orthogonal basis images and reconstructed images]
• Eigenfaces reveal that the intrinsic dimension of the face space is ~100.
• Intrinsic dimension: human observers can recognize faces at SNR ~78, which requires an eigenspace of as few as 100-200 dimensions.
Reference: Marsha Meytlis and Lawrence Sirovich, On the Dimensionality of Face Space, PAMI 2007.
Face Recognition via Sparse Linear Representation
[Figure: face space, with the subspace of the same face]
• Illumination theory: images of the same face under varying illumination lie approximately on a low (nine)-dimensional subspace, known as the harmonic plane [Basri & Jacobs, PAMI, 2003].
• This inspired a new linear model called sparse representation-based classification (SRC) [PAMI 2009].
Face Recognition via Sparse Representation
Assumption: the test image, y, can be expressed as a linear combination of k training images of the same subject: y = Ax.
The solution, x, should be a sparse vector: all of its entries should be zero, except for the ones associated with the correct subject.
Reference: Wright et al., Robust Face Recognition via Sparse Representation. PAMI, 31(2):210–227, 2009.
Face Recognition via Sparse Representation
[Figure: dictionary columns 1, 2, 3, …, N grouped by subject (subject 1, …, subject i, …, subject n)]
Sparse representation encodes membership through its nonzero coefficients!
Classification criterion: assign to the class with the smallest residual.
Limitations of Sparse Representation based Classification
SRC assumes that the training images have been carefully controlled and that the number of samples per class is sufficiently large.
• Outside these operating conditions, SRC should not be expected to perform well (Wright et al.).
• Many real-world applications are outside these operating conditions.
However, when the sample size is small, the sparsity breaks down!
[Figure: test image, training dictionary, coefficients]
Reference: Wright et al., Sparse Representation for Computer Vision and Pattern Recognition. Proceedings of the IEEE, 98(6):1031–1044, 2010.
Previous Work (How to Solve the Small Sample Size Problem)
• [1] Fisherfaces (Linear Discriminant Analysis)
  • The feature covariances of all classes are identical.
  • The within-class scatter matrix is shared by all classes for discriminant analysis.
• [2] Quotient image
  • All faces share an identical 3D shape.
  • Render all lighting conditions of a novel face from the training images of other subjects.
• [3] Adaptive discriminant analysis
  • The within-class variation of the gallery subjects can be represented by out-of-gallery subjects.
  • Adapt the within-class scatter matrix for discriminant analysis on gallery samples.
1. Peter N. Belhumeur, Joao P. Hespanha, and David J. Kriegman, Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection, PAMI 1997.
2. Amnon Shashua and Tammy Riklin-Raviv, The Quotient Image: Class-Based Re-Rendering and Recognition with Varying Illuminations, PAMI 2001.
3. Meina Kan, Shiguang Shan, Yu Su, Xilin Chen, and Wen Gao, Adaptive Discriminant Analysis for Face Recognition from Single Sample per Person, FG 2011.
Observation
• Human faces share a similar shape.
• Key assumption: the intra-class variation of any gallery face can be approximated by a linear combination of the intra-class differences from a sufficient number of generic faces:
  $x - x_i \approx \sum_{j=1}^{m} \beta_j d_j$
1. Deng et al., Extended SRC: Undersampled Face Recognition via Intra-Class Variant Dictionary, PAMI 2012
2. Deng et al., In Defense of Sparsity Based Face Recognition, CVPR, 2013.
Extended Sparse Representation
• Two novel assumptions:
  1. An image can be represented as a superposition of the prototype and variance dictionaries;
  2. Intra-class variant bases are shared across classes.
[Figure: prototypes and variance]
Reference: Deng et al., Extended SRC: Undersampled Face Recognition via Intra-Class Variant Dictionary, PAMI 34(9): 1864-1870, 2012.
Recognition with single sample per class
• Dramatically reduces the error rates.
• Intra-class variance dictionaries of a single face can reduce the error rate by nearly half!
[Figure: test variability; intra-class variance dictionary]
Reference: Deng et al., Extended SRC: Undersampled Face Recognition via Intra-Class Variant Dictionary, PAMI 34(9): 1864-1870, 2012.
Recognition with uncontrolled training samples
• Largely boosts the accuracy with uncontrolled training samples.
• Outperforms methods with complicated dictionary learning!
Reference: Deng et al., In Defense of Sparsity Based Face Recognition, CVPR, 2013.
Recognition with over-complete intra-class variance dictionary
• Construct an over-complete variance dictionary from the image differences of the FRGC training set (12,766 images).
• Compare the effects of L1 (sparsity) and L2 (non-sparsity) regularization for computing the linear combination.
• The L1 (sparsity) constraint leads to much better recognition accuracies.
Reference: Deng et al., In Defense of Sparsity Based Face Recognition, CVPR, 2013.
Extended works of ESRC
• Yi Ma
• Single-Sample Face Recognition with Image Corruption and
Misalignment via Sparse Illumination Transfer (CVPR 2013, IJCV
2014)
• Neither Global Nor Local: Regularized Patch-Based Representation
for Single Sample Per Person Face Recognition (IJCV 2014)
• Lei Zhang
• Sparse Variation Dictionary Learning for Face Recognition with A
Single Training Sample Per Person (ICCV 2013)
• Xudong Jiang
• Sparse And Dense Hybrid Representation via Dictionary
Decomposition for Face Recognition, PAMI 2015
From ESRC to Metric Learning
• The metric space is the “identity space” with parameter-free learning:
  $d_W(x, x') = (x - x')^T W^T W (x - x')$
• Gallery images are mapped to [1 0 0], [0 1 0], and [0 0 1], respectively. Generic facial variations are all mapped to [0 0 0].
[Figure panels:]
1. Gallery images
2. Basis images of the metric space before transfer learning
3. Basis images of the metric space after transfer learning
Deng et al., Equidistant Prototypes Embedding for Single Sample Based Face Recognition with Generic Learning and
Incremental Learning, Pattern Recognition, 2014.
Excellent Applicability and Surprisingly Good Results
1. Parameter free
2. Super simple and fast; handles high-dimensional features in real time
3. Generic learning to address the small sample size problem
4. Higher flexibility: online update with identical recognition accuracy
5. Consistently better performance than SRC with any number of samples per class
[Figures: LRA vs. SRC; comparative recognition accuracy, batch vs. online learning; comparative time, batch vs. online learning]
Reference: Deng et al., Equidistant Prototypes Embedding for Single Sample Based Face Recognition with Generic Learning and Incremental Learning, Pattern Recognition, 47(12): 3738–3749, 2014.
Observations
• The accuracy improvement from transfer metric learning on other databases is significant.
• Transfer metric learning from images of 5 faces can reduce the recognition errors by half.
• The efficiency of transfer is much higher than in other tasks.
[Figure: Cross-database transferred metric learning. Error rate as a function of the number of face classes used for transferred metric learning]
1. Deng et al., Extended SRC: Undersampled Face Recognition via Intra-Class Variant Dictionary, PAMI 2012
2. Deng et al., Equidistant Prototypes Embedding for Single Sample Based Face Recognition with Generic Learning and
Incremental Learning, Pattern Recognition, 2014.
Underestimated Core Problem of Face Recognition
Curse of Misalignment
• Underestimated issue: there is no criterion that defines a well-aligned face; current research aligns faces manually in a heuristic manner.
• Most frontal representation/recognition research aligns faces by feature points.
• Curse of misalignment: many representation/recognition errors in practice are caused by misalignment!
• Our conjecture: alignment by image plane is more stable than alignment by feature points.
Reference: Shan et al., Curse of Misalignment in Face Recognition: Problem and a Novel Misalignment Learning Solution, FG 2004.
From Point to Plane: Alignment via eigenspace
Our idea: Transform every training image toward its projection on the eigenspace
Transform-invariant PCA
• TIPCA: Joint Face Representation and Registration
Reference: Deng et al., Transform-Invariant PCA: A Unified Approach to Fully Automatic Face Alignment, Representation, and
Recognition. PAMI 36(6): 1275–1284, 2014.
Fully Automatic Registration, Representation, and Recognition
• Our belief: there is an underlying relationship among image registration, representation, and recognition.
• TIPCA is only a very simple piece of work that relies on this profound relationship.
Reference: Deng et al., Transform-Invariant PCA: A Unified Approach to Fully Automatic Face Alignment, Representation, and Recognition. PAMI 36(6): 1275–1284, 2014.
Better Registration Improves Representation
[Figure: PCA (manual alignment) vs. TIPCA (automatic alignment)]
“Deblurred” basis images of the eigenspace
Clearer reconstructed images
MSE decreases continuously with a 100-dimensional eigenspace
Larger SNR
Better Representation Improves Recognition
• Recognition results on FERET
  • gallery (training): 1,196 subjects (one image per subject)
  • fb: expression variation; fc: lighting variation
  • dup1: time interval < 18 months
  • dup2: time interval > 18 months
[Figure: standard FERET database]
TIPCA-aligned faces are more suitable for recognition than manually aligned faces.
Reference: Deng et al., Transform-Invariant PCA: A Unified Approach to Fully Automatic Face Alignment, Representation, and Recognition. PAMI 36(6): 1275–1284, 2014.
Aligning 1000 images of 100 subjects
with real occlusion & Lighting
Reference: Deng et al., Transformed Principal Gradient Orientation for Robust and Precise Batch Face Alignment. ACCV, 2014.
Aligning 1000 images of 100 subjects
with real occlusion & Lighting
[Figure: manually aligned images of the standard AR database]
1. Deng et al., Transformed Principal Gradient Orientation for Robust and Precise Batch Face Alignment. ACCV, 2014.
2. Deng et al., Transform-Invariant PCA: A Unified Approach to Fully Automatic Face Alignment, Representation, and Recognition. PAMI 36(6): 1275–1284, 2014.
Take-Home Messages
• Extended linear models (combination or projection) for the undersampled face recognition problem.
• Human faces share a similar shape, which makes recognition difficult but makes knowledge transfer easy.
• TIPCA: a unified framework for face registration, representation, and recognition.
• Automatic alignment by image plane could be more precise than human labeling of landmarks.
• Benchmark databases could be redefined to ensure they are meaningful for real-world applications.