Brais_Martinez_seminar

A face analysis exemplar:
Face detection, landmarking and facial
expression recognition.
Dr. Brais Martinez
Slides can be downloaded from braismartinez.com
Overview
Model-free part-based tracking
Part-based facial landmarking
Face Analysis
PhD (end 2010)
PostDoc
Research visits to:
• Imperial College London (Maja Pantic), 9/2007-3/2008
• Oregon State University (Sinisa Todorovic), 7/2013-10/2013
Overview
Multi-view face detection
• [Under Review] IVC - J. Orozco, B. Martinez, M. Pantic, “Empirical analysis of cascade deformable models for multi-view face detection” [IF 2012: 1.96, Q1]
Facial Landmarking
• 2010 CVPR - M. Valstar, B. Martinez, X. Binefa, M. Pantic, “Facial point detection using boosted regression and graph models”
• 2013 TPAMI - B. Martinez, M. Valstar, X. Binefa, M. Pantic, “Local evidence aggregation in regression-based facial point detection”
• [Under Review] CVIU - B. Martinez, M. Pantic, “Facial landmarking for in-the-wild images with local inference based on global appearance”
Facial Action Unit Detection
• 2014 TSMCB - B. Jiang, M. Valstar, B. Martinez, M. Pantic, “A dynamic appearance descriptor approach to facial actions temporal modelling”
• [Under Review] IJCV - B. Jiang, B. Martinez, M. Valstar, M. Pantic, “Automatic analysis of facial actions: A survey”
• [Under Review] ICPR - B. Jiang, B. Martinez, M. Valstar, M. Pantic, “Decision level fusion of domain specific regions for facial action recognition”
Face detection using cascaded DPM
Part-based model
The Deformable Parts Model (DPM):
• Object composed of parts
• Current state-of-the-art model in object detection
• Weakly-supervised
• Uses Linear SVM (we used 35k+ training images!)
• Very efficient implementations (both training and testing)
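Score of a configuration, with p_0 the object location and p_1, …, p_n the part locations:
$$\mathrm{Score}(p_0, \ldots, p_n) = \sum_{i=0}^{n} F_i \cdot \phi(H, p_i) - \sum_{i=1}^{n} d_i \cdot \phi_d(dx_i, dy_i) + b$$
• First sum: convolve filter F_i with the gradient image
• Second sum: penalise deformations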
Cascaded DPM
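• Non-frontal poses: mixture model (root filter, part filters, part locations)
• Scale: multi-scale sliding window
• Speed: cascaded search
Full score:
$$\mathrm{Score}(p_0, \ldots, p_n) = \sum_{i=0}^{n} F_i \cdot \phi(H, p_i) - \sum_{i=1}^{n} d_i \cdot \phi_d(dx_i, dy_i) + b$$
Cascade:
1. Root only: $F_0 \cdot \phi(H, p_0) > th_0$? If no, reject; if yes, score 1 part.
2. Root and 1 part: $\sum_{i=0}^{1} F_i \cdot \phi(H, p_i) - d_1 \cdot \phi_d(dx_1, dy_1) > th_1$? If no, reject; if yes, score 2 parts.
3. Continue until all parts are scored.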
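The cascaded search can be illustrated with a minimal sketch. It assumes the per-filter responses F_i · φ(H, p_i) and the deformation costs d_i · φ_d(dx_i, dy_i) have already been computed for one candidate location; the function name `cascade_score` and all numeric values are hypothetical and are not taken from the presented system.

```python
def cascade_score(filter_responses, deformation_costs, thresholds, bias=0.0):
    """Score one candidate location with early rejection.

    filter_responses[i]  : response of filter i (index 0 is the root filter)
    deformation_costs[i] : deformation penalty of part i (index 0 is unused)
    thresholds[i]        : pruning threshold applied after adding filter i
    Returns (score, filters_evaluated); score is None when the candidate is pruned.
    """
    partial = 0.0
    stages = zip(filter_responses, deformation_costs, thresholds)
    for i, (response, cost, threshold) in enumerate(stages):
        partial += response
        if i > 0:                      # the root has no deformation penalty
            partial -= cost
        if partial <= threshold:       # "No" branch: stop, skip remaining parts
            return None, i + 1
    return partial + bias, len(filter_responses)


# Toy values (assumed): one root filter and two part filters.
costs = [0.0, 0.1, 0.2]
thresholds = [0.5, 1.0, 1.5]
accepted = cascade_score([1.2, 0.8, 0.5], costs, thresholds)
rejected = cascade_score([0.2, 5.0, 5.0], costs, thresholds)
```

The second candidate is dropped after the root filter alone, so its part filters never need to be evaluated; this is where the speed-up comes from.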
Results: DPM face detection
[Figure: ROC curves (True Positive Rate vs. False Positive Rate) on the AFLW dataset for Proposed, Zhu & Ramanan, and Multiview V&J]
Advantages over Zhu & Ramanan:
• Only face bounding-box annotations needed
• Better for lower resolution
• 5 parts instead of 66
• Cascade detection
Part-based facial landmarking
Classical part-based: construct response maps
Train:
• 1 classifier per point (e.g. logistic classifier)
Test:
• Construct response map (sliding window over ROI)
• Maximise response constrained to feasible shape (constrained gradient ascent)
Instead: do regression!
CVPR 2010 – Facial Point Detection using Boosted Regression and Graph Models
Regression for Localisation
Regression: f: ℝ^n ⟶ ℝ, from HOG features at the current estimate to the displacement (Δx, Δy) towards the ground truth
$$x' = x + f_x(\mathrm{HOG}(x, y)), \quad y' = y + f_y(\mathrm{HOG}(x, y))$$
Multiple Regression Methodologies: Least Squares, SVR, GP, random forests…
BoRMaN algorithm
1. Face Detection
2. Obtain prior location (starting point)
3. while it < it_0:
   • Eval. regressors (new location hypotheses)
   • Correct hypothesis (shape restrictions)
4. Output
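The regression loop can be sketched on toy data as follows. This is only an illustration under strong assumptions: `describe` is a hypothetical stand-in for a HOG descriptor (a noisy linear code of the offset to the target), the regressor is plain least squares, and the shape-correction step is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.array([40.0, 25.0])      # toy ground-truth landmark (assumed value)
A = rng.normal(size=(2, 8))          # hidden mixing matrix of the toy descriptor


def describe(point):
    """Stand-in for a HOG descriptor: noisy linear code of the offset to the target."""
    return (target - point) @ A + rng.normal(scale=0.05, size=8)


# Train: sample points around the target, regress (dx, dy) from the descriptor.
train_points = target + rng.uniform(-15, 15, size=(200, 2))
X = np.array([describe(p) for p in train_points])
Y = target - train_points
W, *_ = np.linalg.lstsq(X, Y, rcond=None)   # least-squares regressor, shape (8, 2)


def localise(start, max_iter=5):
    """Repeatedly move the estimate by the displacement the regressor predicts."""
    estimate = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        estimate = estimate + describe(estimate) @ W
    return estimate


start = np.array([30.0, 30.0])
result = localise(start)
```

Any of the regression methods listed on the slide could replace the least-squares fit without changing the loop.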
MRF-based shape model
• Detect bad estimations
• Propose an alternative
Shape model
Relations are rotation and scale independent:
• Angle α between segments
• Ratio ρ between segment lengths
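A short numerical check of the invariance claim: the angle α and the length ratio ρ between two segments do not change when the points undergo a rotation, a scaling and a translation. The helper names and the point coordinates below are illustrative assumptions.

```python
import numpy as np


def segment_relation(p1, p2, q1, q2):
    """Signed angle alpha (radians) and length ratio rho between p1->p2 and q1->q2."""
    u = np.asarray(p2, float) - np.asarray(p1, float)
    v = np.asarray(q2, float) - np.asarray(q1, float)
    cross = u[0] * v[1] - u[1] * v[0]
    alpha = np.arctan2(cross, np.dot(u, v))
    rho = np.linalg.norm(u) / np.linalg.norm(v)
    return alpha, rho


def similarity(points, angle, scale, shift):
    """Rotate, scale and translate a set of 2D points."""
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -s], [s, c]])
    return scale * np.asarray(points, float) @ rotation.T + shift


points = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.0], [1.0, 3.0]])
moved = similarity(points, angle=0.7, scale=2.5, shift=np.array([10.0, -4.0]))
before = segment_relation(*points)
after = segment_relation(*moved)
```

Because both quantities are unchanged, a shape model built on them does not need the face to be aligned for rotation or scale beforehand.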
Regression-based landmarking
Established a trend: best performing nowadays!
Major improvements:
• Prediction accumulation/voting
• Cascaded regression …
Facial landmarking using regression:
• 2010: CVPR
• 2012: CVPR (Microsoft Res.), CVPR (ETH, Van Gool), ECCV (Manchester Univ. – Cootes)
• 2013: TPAMI (iBug), CVPR (CMU), CVPR (iBug), ICCV (Microsoft Res.), ICCV (QMUL)
2013 TPAMI – Martinez, Valstar, Binefa, Pantic
Regression: Vote aggregation
What if we are too far from the target? What if we have bad predictions?
• Bad predictions: errors ≈ uniformly distributed, so they do NOT accumulate
• Good predictions: errors ≈ Gaussian distributed, so they DO accumulate
Basis of the algorithm:
Accumulate predictions, each prediction being a small Gaussian
LEAR algorithm
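The accumulation idea can be sketched on synthetic votes. This is a toy illustration, not the LEAR implementation: the grid size, the Gaussian width and the numbers of good and bad votes are all assumed values.

```python
import numpy as np


def aggregate_votes(votes, shape, sigma=2.0):
    """Sum one small isotropic Gaussian per predicted (x, y) location on a grid."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    accumulator = np.zeros(shape)
    for vx, vy in votes:
        accumulator += np.exp(-((xs - vx) ** 2 + (ys - vy) ** 2) / (2.0 * sigma ** 2))
    return accumulator


rng = np.random.default_rng(1)
target = np.array([30.0, 20.0])                        # toy target as (x, y)
good = target + rng.normal(scale=1.5, size=(40, 2))    # Gaussian errors: pile up
bad = rng.uniform([0, 0], [64, 48], size=(40, 2))      # uniform errors: spread out
accumulator = aggregate_votes(np.vstack([good, bad]), shape=(48, 64))
peak_y, peak_x = np.unravel_index(np.argmax(accumulator), accumulator.shape)
```

Even though half of the votes are wrong, they are scattered over the whole grid, so the maximum of the accumulator stays close to the target.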
Action Unit detection – what is it about?
Facial expression recognition
Message judgment:
Directly decode the meaning of the expression
• 6 universal expressions: happiness, anger, sadness, fear, surprise, disgust (constant message-to-sign relation)
• Pre-segmented episodes
Sign judgment:
Study the physical signals composing the expression
• An AU relates to the activation of a facial muscle
• “Agnostic” (not concerned with “knowing” the message)
• Can represent any expression
• Further reasoning is needed to understand the message
• Frame-based labelling
The Facial Action Coding System is the most common sign judgment approach.
Happiness? Pain?
Action Unit analysis: what and why
Research problems within the field:
• AU detection (per-frame)
• AU intensity estimation
• AU temporal segment detection
• AU correlations (for structured prediction)
• Semantics of AUs
What do they allow (that normal facial expression analysis does not):
• Pain detection
• Deceit detection
• Detection of social signals (conflict, agreement/disagreement,…)
How Action Unit detection is done
Pre-processing
• Face detection
• Facial landmark detection
• Registration: affine, non-ref. (transformation T mapping points 1, 2 to 1′, 2′)
Feature extraction
• Appearance
• Geometric
• Dynamic
Machine Analysis
• SVM, ANN, Boosting…
• Graph models (label consistency)
[Under Review] IJCV - Jiang, Martinez, Valstar, Pantic, “Automatic Analysis of Facial Actions: A Survey”
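As one possible reading of the registration step, the sketch below fits an affine transformation T that maps detected landmarks onto reference positions by least squares. The slides do not state how T is estimated, so the method, the function names and the coordinates are assumptions made for illustration.

```python
import numpy as np


def fit_affine(source, destination):
    """Least-squares 2x3 affine T such that destination ≈ T @ [x, y, 1]."""
    source = np.asarray(source, float)
    destination = np.asarray(destination, float)
    homogeneous = np.hstack([source, np.ones((len(source), 1))])
    params, *_ = np.linalg.lstsq(homogeneous, destination, rcond=None)
    return params.T


def apply_affine(T, points):
    """Apply a 2x3 affine transformation to an (N, 2) array of points."""
    points = np.asarray(points, float)
    return points @ T[:, :2].T + T[:, 2]


# Toy landmark coordinates (assumed): three reference points and their detections.
reference = np.array([[30.0, 40.0], [70.0, 40.0], [50.0, 80.0]])
detected = np.array([[35.0, 52.0], [72.0, 45.0], [60.0, 88.0]])
T = fit_affine(detected, reference)
registered = apply_affine(T, detected)
```

With exactly three non-collinear points the fit is exact; with more landmarks the same call gives the best affine alignment in the least-squares sense.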
TOP features
Three orthogonal planes (TOP):
• Extension to spatio-temporal volumes of histogram features
• Representing the face
• Allows: analysis of AU temporal segments
Markov Model over temporal segments: Neutral, Onset, Apex, Offset
2014 TSMC-B - “A dynamic appearance descriptor approach to facial actions temporal modelling”
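The TOP construction can be sketched with a simple histogram feature. The example below uses a basic 8-neighbour LBP as the per-plane histogram purely for illustration; this is an assumption, and the descriptor used in the cited paper may differ. Histograms are computed on the XY, XT and YT planes of a video volume and concatenated.

```python
import numpy as np


def lbp_histogram(plane):
    """Normalised 256-bin histogram of basic 8-neighbour LBP codes of a 2D array."""
    height, width = plane.shape
    centre = plane[1:-1, 1:-1]
    codes = np.zeros(centre.shape, dtype=np.int64)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = plane[1 + dy:height - 1 + dy, 1 + dx:width - 1 + dx]
        codes += (neighbour >= centre).astype(np.int64) << bit
    histogram = np.bincount(codes.ravel(), minlength=256).astype(float)
    return histogram / histogram.sum()


def top_descriptor(volume):
    """Concatenate mean LBP histograms of the XY, XT and YT planes of a (T, Y, X) volume."""
    frames, rows, cols = volume.shape
    xy = np.mean([lbp_histogram(volume[t]) for t in range(frames)], axis=0)
    xt = np.mean([lbp_histogram(volume[:, r, :]) for r in range(rows)], axis=0)
    yt = np.mean([lbp_histogram(volume[:, :, c]) for c in range(cols)], axis=0)
    return np.concatenate([xy, xt, yt])


rng = np.random.default_rng(2)
clip = rng.integers(0, 256, size=(8, 16, 16))   # toy clip: 8 frames of 16x16 pixels
descriptor = top_descriptor(clip)
```

The XT and YT histograms capture how texture changes over time, which is what makes the descriptor usable for temporal segments such as onset and offset.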
Publications
[Under Review]
B. Jiang, B. Martinez, M. Valstar, M. Pantic, “Automatic Analysis of Facial Actions: A Survey”. International Journal
of Computer Vision [IF 2012: 3.62, Q1]
J. Orozco, B. Martinez, M. Pantic, “Empirical Analysis of Cascade Deformable Models for Multi-view Face
Detection”, Image and Vision Computing [IF 2012: 1.96, Q1]
B. Martinez, M. Pantic, “Facial landmarking for in-the-wild images with local inference based on global
appearance”, Computer Vision and Image Understanding [IF 2012: 1.23, Q3]
B. Jiang, B. Martinez, M. Valstar, M. Pantic, “Decision Level Fusion of Domain Specific Regions for Facial Action
Recognition”, Int. Conf. on Pattern Recognition, 2014
[Journals]
2014 B. Jiang, M. Valstar, B. Martinez, M. Pantic, “A dynamic appearance descriptor approach to facial actions
temporal modelling”, In IEEE Trans. on Systems, Man, and Cybernetics – Part B [IF 2012: 3.24, Q1]
2013 B. Martinez, M. Valstar, X. Binefa, M. Pantic, “Local Evidence Aggregation in Regression-based Facial Point
Detection”, In IEEE Trans. on Pattern Analysis and Machine Intelligence [IF 2012: 4.80, Q1]
2013 S. Petridis, B. Martinez, M. Pantic, “The MAHNOB Laughter Database”, In Image and Vision Computing
Journal [IF 2012: 1.96, Q1]
2011 M. Vivet, B. Martinez and X. Binefa, “DLIG: Direct Local Indirect Global Alignment for Video Mosaicing”, In
IEEE Trans. on Circuits and Systems for Video Technology [IF 1.65, Q2]
2008 B. Martinez, X. Binefa, “Piecewise affine kernel tracking for non-planar targets”, In Pattern Recognition
[IF: 3.28, Q1]
[Conferences]
2010 M. Valstar, B. Martinez, X. Binefa, M. Pantic, “Facial Point Detection using Boosted Regression and Graph
Models”, In IEEE Int’l Conf. on Computer Vision and Pattern Recognition [27% acceptance rate, 81 citations]
2010 B. Martinez, X. Binefa, M. Pantic, “Facial Component Detection in Thermal Imagery”, In IEEE Int'l Conf.
Computer Vision and Pattern Recognition - Workshops
Thanks!
