ppt - Soft Computing Lab.

SOMM: Self Organizing Markov Map for Gesture Recognition
G. Caridakis et al., Pattern Recognition Letters, Vol. 31, pp. 52-59, 2010.
Pattern Recognition
2010 Spring
Seung-Hyun Lee
Contents
• Introduction
• Related Work
– Hidden Markov Models
– Other Methods
• Proposed Method
• Experiments
• Conclusion
SOFT COMPUTING @ YONSEI UNIV. KOREA
Introduction
• Gesture
– A motion of the body that conveys information
• In this paper
– Focus on hand gestures
Introduction
• Taxonomy of gesture (McNeill, 1992)
– Gesticulation
– Speech-linked
– Pantomime
– Emblems
– Sign Languages
• Other taxonomies: (Kendon, 1992), (Quek, 1994)
Introduction
• Taxonomy by functionality
– Symbolic gestures: gestures that, within each culture, have come to have a single meaning.
– Deictic gestures: the gestures most commonly seen in HCI; they point to entities or directions.
– Iconic gestures: gestures used to convey information about the size, spatial relations, actions, shape, or orientation of the object of discourse.
– Pantomimic gestures: gestures typically used to mimic an action, object, or concept.
Hidden Markov Model
Related Work
• Cogan (2006)
– Discrete HMM that fuses hand shape and position
• Hossain (2005)
– Implicit/Explicit Temporal Information Encoded HMM
– Discriminates attention gestures from non-attention gestures
• Mantyla (2000)
– On mobile devices
– Uses SOM and HMM methods
• Starner (1998)
– HMM-based American Sign Language (ASL) recognition
– Sentence-level recognition is possible
Other Methods
Related Work
• Black and Jepson (1998)
– CONDitional dENSity propagATION (CONDENSATION) algorithm
• Wong and Cipolla (2006)
– Sparse Bayesian classifier
• Hong et al. (2000)
– Finite State Machines (FSM)
• Su (2000)
– Fuzzy logic, rule-based approaches, and hyper-rectangular composite neural networks (HRCNNs)
• Juang and Ku (2005)
– Fuzzified Takagi-Sugeno-Kang (TSK) type recurrent network
• Yang et al. (2002)
– Time Delay Neural Network
• Huang and Huang (1998)
– 3D Hopfield Neural Network
Overview
Proposed Method
• Modules
– Image processing: detection and tracking of hands
– SOM: quantization of hand location and direction
– HMM: transition probability matrix
Feature Extraction
Proposed Method
• Video based method
– Creation of moving skin masks (Skin color area)
– Tracking the centroid of the skin masks
– Prior knowledge is required
• It must identify the different body parts (left hand, right hand, and head)
• Environment
– PC platform
– OpenCV
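The centroid-tracking step can be illustrated with a minimal sketch. It assumes a skin mask is already available as a binary grid (in the setup described above it would come from skin color detection with OpenCV); the function name `mask_centroid` and the toy mask are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple


def mask_centroid(mask: List[List[int]]) -> Optional[Tuple[float, float]]:
    """Return the (x, y) centroid of the nonzero pixels, or None for an empty mask."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None  # no skin pixels detected in this frame
    return (sum_x / count, sum_y / count)


# Hypothetical 4x5 mask containing a 2x2 "hand" blob.
toy_mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
centroid = mask_centroid(toy_mask)
```

Applying this per frame to each mask (left hand, right hand, head) would give one trajectory of centroids per body part.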
Proposed Method
• Dataset
• Gesture instances
Position Model
Proposed Method
• cf. SOM (Self-Organizing Map)
(1) continuous input space
(2) discrete output space in the form of a lattice
(3) time-varying neighborhood function defined around the winning neuron
(4) decreasing learning rate parameter
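The four SOM properties listed above can be seen together in a small training loop. This is a sketch under stated assumptions, not the paper's implementation: the 4x4 lattice, the linear decay schedules, the Gaussian neighborhood, and the names `train_som` and `best_matching_unit` are all illustrative choices.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = Tuple[float, float]


def best_matching_unit(weights: List[List[Vector]], x: Vector) -> Tuple[int, int]:
    """Return the lattice coordinates (row, col) of the unit closest to x."""
    best, best_dist = (0, 0), float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = (w[0] - x[0]) ** 2 + (w[1] - x[1]) ** 2
            if d < best_dist:
                best, best_dist = (r, c), d
    return best


def train_som(data: Sequence[Vector], rows: int = 4, cols: int = 4,
              epochs: int = 30, lr0: float = 0.5, sigma0: float = 2.0,
              seed: int = 0) -> List[List[Vector]]:
    """Train a rows x cols SOM on 2-D points (continuous input, discrete lattice output)."""
    rng = random.Random(seed)
    weights = [[(rng.random(), rng.random()) for _ in range(cols)] for _ in range(rows)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)              # (4) decreasing learning rate
            sigma = sigma0 * (1.0 - frac) + 0.1  # (3) neighborhood shrinks over time
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    lattice_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-lattice_d2 / (2.0 * sigma ** 2))
                    wx, wy = weights[r][c]
                    # Pull each unit toward x, weighted by its lattice distance to the winner.
                    weights[r][c] = (wx + lr * h * (x[0] - wx), wy + lr * h * (x[1] - wy))
            step += 1
    return weights


# Hypothetical normalized hand positions in the unit square.
_rng = random.Random(1)
data = [(_rng.random(), _rng.random()) for _ in range(50)]
som = train_som(data)
unit = best_matching_unit(som, (0.5, 0.5))  # discrete symbol for one hand position
```

After training, each hand position is replaced by the lattice coordinates of its best matching unit, which is what turns a continuous trajectory into a symbol sequence.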
Position Model
Proposed Method
• SOM-based representation of hand position
Direction Model
Proposed Method
• Additional information: Moving direction
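The slide does not preserve how the moving direction is encoded. As a simple stand-in, the sketch below quantizes the displacement between consecutive hand centroids into a fixed number of angular bins; the choice of 8 bins and the names `quantize_direction` and `direction_sequence` are assumptions for illustration, and the paper itself may quantize direction differently (for example with a SOM).

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def quantize_direction(p_prev: Point, p_curr: Point, n_bins: int = 8) -> Optional[int]:
    """Map the displacement p_prev -> p_curr to one of n_bins direction codes.

    Code 0 is centered on the +x axis and codes increase with the angle
    returned by atan2. Returns None when the hand did not move.
    """
    dx = p_curr[0] - p_prev[0]
    dy = p_curr[1] - p_prev[1]
    if dx == 0 and dy == 0:
        return None
    angle = math.atan2(dy, dx) % (2 * math.pi)
    bin_width = 2 * math.pi / n_bins
    # Shift by half a bin so that each code is centered on its direction.
    return int((angle + bin_width / 2) // bin_width) % n_bins


def direction_sequence(track: Sequence[Point], n_bins: int = 8) -> List[int]:
    """Convert a centroid trajectory into direction codes, skipping stationary frames."""
    codes = []
    for a, b in zip(track, track[1:]):
        code = quantize_direction(a, b, n_bins)
        if code is not None:
            codes.append(code)
    return codes


# Hypothetical trajectory: +x, then +y, pause, then -x.
track = [(0, 0), (1, 0), (1, 1), (1, 1), (0, 1)]
codes = direction_sequence(track)
```

Whether increasing codes appear clockwise or counterclockwise on screen depends on the image coordinate convention (y often points down in image coordinates).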
Generalized Median
Proposed Method
• Based on Levenshtein distance (edit distance)
– Measures the amount of difference between two sequences
• Generalized median of data set M_j
• Mean Levenshtein distance between members
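The slide's formulas are not preserved in this text, so the following is only a sketch of the idea. It computes the Levenshtein distance with the standard dynamic program and then picks, from the members of a set, the sequence with the smallest total distance to all others. Restricting the search to set members gives the set median, a common simplification of the generalized median (which searches over all possible sequences); the names `levenshtein` and `set_median` and the sample data are hypothetical.

```python
from typing import List, Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        curr = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def set_median(sequences: List[Sequence]) -> Sequence:
    """Return the member with the smallest summed edit distance to all members."""
    return min(sequences,
               key=lambda cand: sum(levenshtein(cand, other) for other in sequences))


# Hypothetical SOM-unit sequences for repetitions of one gesture.
instances = [[1, 2, 3], [1, 2, 4], [1, 2, 3, 5], [7, 8]]
median = set_median(instances)
```

Here `[1, 2, 3]` is selected because its summed distance to the other instances (1 + 1 + 3) is the smallest in the set.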
Gesture Decoding
Proposed Method
• Position
– Probability
– Calculation of S_som
• First state: initial probability
• From second state: transition probability
– Unit u
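The scoring rule described above (initial probability for the first state, transition probabilities from the second state on) can be sketched as a first-order Markov chain over SOM units. The relative-frequency estimates, the `floor` value used for unseen events, and the function names are assumptions for illustration; the slide does not show the paper's exact formula.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple


def estimate_chain(sequences: List[Sequence[Hashable]]):
    """Estimate initial and transition probabilities by relative frequency."""
    init_counts: Counter = Counter(seq[0] for seq in sequences if seq)
    trans_counts: Counter = Counter()
    out_counts: Counter = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            trans_counts[(a, b)] += 1
            out_counts[a] += 1
    n = sum(init_counts.values())
    initial = {u: c / n for u, c in init_counts.items()}
    transition = {(a, b): c / out_counts[a] for (a, b), c in trans_counts.items()}
    return initial, transition


def sequence_log_prob(seq: Sequence[Hashable],
                      initial: Dict[Hashable, float],
                      transition: Dict[Tuple[Hashable, Hashable], float],
                      floor: float = 1e-9) -> float:
    """Log-probability of a non-empty unit sequence under the chain.

    Unseen initial units or transitions get the small probability `floor`
    (an assumption) so that the logarithm stays defined.
    """
    logp = math.log(initial.get(seq[0], floor))        # first state: initial probability
    for a, b in zip(seq, seq[1:]):
        logp += math.log(transition.get((a, b), floor))  # later states: transitions
    return logp


# Hypothetical training instances of one gesture, as SOM unit indices.
training = [[0, 1, 2], [0, 1, 1, 2], [0, 2]]
initial, transition = estimate_chain(training)
score = sequence_log_prob([0, 1, 2], initial, transition)
```

Working in log space avoids numerical underflow when many probabilities are multiplied.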
Gesture Decoding
Proposed Method
• Direction
– Probability
– Calculation of S_of
– Unit u
Gesture Decoding
Proposed Method
• Similarity measurement
– Problem
• Shorter gesture instances tend to gain an advantage because they have fewer transitions and thus fewer probability multiplications
– Measurement
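The measurement formula itself is not preserved in this text. To illustrate the length bias and one common remedy, the sketch below compares the raw product of step probabilities with their geometric mean. Using the geometric mean is an assumption made here for illustration, not necessarily the normalization used in the paper, and the probability values are made up.

```python
import math
from typing import Sequence


def length_normalized_score(step_probs: Sequence[float]) -> float:
    """Geometric mean of strictly positive per-step probabilities."""
    if not step_probs:
        raise ValueError("step_probs must not be empty")
    mean_log = sum(math.log(p) for p in step_probs) / len(step_probs)
    return math.exp(mean_log)


# Hypothetical per-step probabilities for a short and a long gesture instance.
short_steps = [0.5, 0.5]
long_steps = [0.6] * 6

raw_short = math.prod(short_steps)  # product over 2 factors
raw_long = math.prod(long_steps)    # product over 6 factors

norm_short = length_normalized_score(short_steps)
norm_long = length_normalized_score(long_steps)
```

With the raw product the short instance wins even though each of its steps is less probable; after normalization the long instance, whose individual steps are more probable, scores higher.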
Error Propagation
Proposed Method
• Error definition for function f
• SOM based approach
– If data containing a small error is mapped to the same SOM node
→ No problem
– Otherwise
→ Because of the neighborhood relation of unit u, the error is not propagated to the next steps of the recognition process
Data Set
Experiment
• 30 gestures, 10 repetitions each
Result
Experiment
• SOM clustering
– Blue: close to input vector
– Red: not close
• Recognition accuracy
– Test with training data: 100%
– 10-fold cross validation: 93%
– Decoding time: 0.843 ms per gesture
– HMM-only classifier: 86.36%
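For reference, a 10-fold cross-validation split over the 300 gesture instances (30 gestures, 10 repetitions each) can be sketched as follows. The shuffling seed, the round-robin fold assignment, and the name `k_fold_indices` are assumptions; the slide does not say how the folds were drawn or whether they were stratified by gesture class.

```python
import random
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int = 10,
                   seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs covering all samples once as test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # round-robin assignment to folds
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# 30 gestures x 10 repetitions = 300 instances.
splits = k_fold_indices(300, k=10)
```

Each instance is used for testing exactly once, and the reported accuracy would be the average over the ten test folds.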
Conclusion
• Key features
– SOM- and HMM-based automatic recognition architecture
– ROI
• Relative hand position
• Moving direction
• Similarity of pattern
• Application
– Sign language
– Gaming environment
Thank you