FAST MODE DECISION IN H264/AVC VIDEO CODEC

NIRANJAN MULAY (0393251)
CHEN GAO(0401840)
(EL6123: PROJECT PRESENTATION)
05/06/2010
Outline:






Introduction to H.264/AVC coding standard
Mode decisions in H.264/AVC
- Intra Block
- Inter Block
RDO algorithm and the need for FMD
FMD (for Intra and Inter)
Literature survey: edge-map based FMD
Study of x264 code and encoding options
Implementation:
-Generation of MB mode statistics file from x264
-Visualize the modes in Matlab
-Intra FMD; Inter FMD
Summary and future work
Introduction to H.264/AVC Coding Standard
The key features of H.264:

Improved Intra prediction: Directional spatial prediction

Enhanced Temporal Prediction:
-Motion compensation with variable block sizes from 16x16 down to 4x4: reduces the prediction error
-Quarter-pel accurate motion estimation
-Multiple reference frames for motion estimation
-Weighted prediction (for B and P frames)

DCT-like integer transform: No mismatch between encoder and decoder
Introduction to H.264/AVC Coding Standard (Contd.)




Efficient entropy coding:
-Context-adaptive entropy coding with 2 options: CAVLC (variable-length coding) and CABAC (arithmetic coding)
Variable-size transform (primarily 4x4, with 8x8 in the High profiles and a second-level DC transform for Intra 16x16 MBs):
- A smaller transform represents the signal in a locally adaptive manner, which reduces ringing artifacts.
- Generally, high-frequency (detailed) regions favor 4x4 and low-frequency (smooth) regions favor the larger sizes
In-loop deblocking filter: Reduces blocking artifacts and improves quality.
Special error-resilience tools
H.264 Intra Modes:



Intra 4x4 : useful for a MB with significant detail
Intra 16x16 : good for coding very smooth areas
(Intra 8x8 chroma: similar to intra 16x16)
I_PCM : no prediction or transform
‘Intra 16x16’:




Mode 0 (vertical): extrapolation from upper samples.
Mode 1 (horizontal): extrapolation from left samples.
Mode 2 (DC): mean of upper and left-hand samples.
Mode 3 (Plane): plane prediction based on a linear spatial interpolation using the upper and left-hand samples of the MB.
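The first three Intra 16x16 modes can be sketched in a few lines. This is only an illustration, assuming that both the upper and the left neighbours are available; the function name and argument layout are hypothetical, and the plane mode is left out.

```python
def predict_intra16x16(mode, top, left):
    """Return a 16x16 prediction (list of rows) for modes 0-2.

    top  : the 16 reconstructed samples directly above the MB
    left : the 16 reconstructed samples directly to its left
    (assumption: both neighbours are available)
    """
    n = 16
    if mode == 0:   # vertical: repeat the row above in every row
        return [list(top) for _ in range(n)]
    if mode == 1:   # horizontal: repeat each left sample across its row
        return [[left[r]] * n for r in range(n)]
    if mode == 2:   # DC: rounded mean of the 32 neighbouring samples
        dc = (sum(top) + sum(left) + n) // (2 * n)
        return [[dc] * n for _ in range(n)]
    raise ValueError("plane mode (3) is not covered by this sketch")
```

The encoder would then compare each prediction with the original MB and keep the mode with the lowest cost.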
‘Intra 4x4’:
Figure: 4x4 luma prediction modes
Intra 4x4 (Contd.):









Mode 0: Vertical
Mode 1: Horizontal
Mode 2: DC prediction
Mode 3: Diagonal down-left
Mode 4: Diagonal down-right
Mode 5: Vertical-right
Mode 6: Horizontal-down
Mode 7: Vertical-left
Mode 8: Horizontal-up
H.264 Inter Modes:

Hierarchical decision:
Level-1 (Partition): compute the RD cost for 16x16, 16x8, 8x16 and 8x8.
Level-2 (Sub-Partition): if level-1 selects 8x8, then compute the RD cost of 8x4, 4x8 and 4x4.
Select the block size with the lowest RD cost.
P_Skip mode
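The two-level decision can be sketched as follows. The cost dictionaries are hypothetical inputs (a real encoder obtains them from motion estimation and RDO), the 8x8 size is kept as a level-2 candidate by assumption, and P_Skip is not modelled.

```python
def choose_inter_mode(level1_cost, level2_cost):
    """Pick the partition with the lowest RD cost, descending only via 8x8.

    level1_cost: RD costs keyed by "16x16", "16x8", "8x16", "8x8"
    level2_cost: RD costs keyed by "8x8", "8x4", "4x8", "4x4"
                 (assumption: 8x8 itself stays a level-2 candidate)
    """
    partition = min(level1_cost, key=level1_cost.get)
    if partition != "8x8":
        return partition, None          # level 2 is never evaluated
    sub_partition = min(level2_cost, key=level2_cost.get)
    return partition, sub_partition
```

For example, if 16x16 is the cheapest level-1 partition, the sub-partition costs are never consulted.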

RDO Algorithm

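Formula: $\mathrm{RD\_cost}(s, c, \mathrm{MODE} \mid Q_p) = D + \lambda \cdot R$
where D is the distortion between the source block s and its reconstruction c, R is the number of bits needed to code the MB in mode MODE, and $\lambda$ is the Lagrange multiplier.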
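A minimal sketch of how this cost drives mode selection. The candidate table of (distortion, rate) pairs is a hypothetical input; in an encoder both values come from actually coding the MB in each mode.

```python
def rd_cost(distortion, rate_bits, lagrange_multiplier):
    """Lagrangian cost: distortion plus the multiplier times the rate."""
    return distortion + lagrange_multiplier * rate_bits


def best_mode(candidates, lagrange_multiplier):
    """candidates maps a mode name to a (distortion, rate_bits) pair."""
    return min(
        candidates,
        key=lambda mode: rd_cost(*candidates[mode], lagrange_multiplier),
    )
```

A larger multiplier penalises rate more heavily, so the winning mode can change with the multiplier even when the distortion and rate figures stay the same.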

------------------------------------------------------------------------------

Computational Complexity of brute-force RDO:


INTRA block:
Total Modes = 4 (16x16) + 9 (4x4) + 1 (I_PCM) + 4 (chroma_8x8) = 18
Total # of RDO calculations = M8 * (M4*16 + M16), where M8, M4 and M16 are the numbers of chroma 8x8, luma 4x4 and luma 16x16 modes
Theoretical bound for a MB: 4 x (9x16 + 4) = 592
INTER block:
Total Modes = [7 + 1 (P_SKIP)] + Intra counterparts
This is a huge amount of computation and a problem for real-time applications, hence the need for FMD.
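The intra bound is plain arithmetic and can be checked directly. The function name is hypothetical; the defaults are the mode counts listed above.

```python
def intra_rdo_count(m8=4, m4=9, m16=4):
    """Number of RDO evaluations per MB: M8 * (M4 * 16 + M16).

    m8  : chroma 8x8 modes tried
    m4  : luma 4x4 modes tried for each of the 16 blocks
    m16 : luma 16x16 modes tried
    """
    return m8 * (m4 * 16 + m16)


full_search = intra_rdo_count()   # 4 * (9 * 16 + 4)
```

Plugging the reduced candidate counts reported later in this deck for Intra-FMD (4 luma 4x4 modes, 2 luma 16x16 modes, 3 or 2 chroma modes) into the same formula gives 198 or 132 evaluations per MB instead of 592.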
FMD-Intra : Edge-Histogram approach

Main Idea: Use Prediction in Edge Direction
Generate edge map using Sobel operator
Build edge direction histogram
Fast intra mode decision
Generate Edge Map

Sobel Operator (Compute Gradients):
$$dx_{i,j} = p_{i-1,j+1} + 2\,p_{i,j+1} + p_{i+1,j+1} - p_{i-1,j-1} - 2\,p_{i,j-1} - p_{i+1,j-1}$$
$$dy_{i,j} = p_{i+1,j-1} + 2\,p_{i+1,j} + p_{i+1,j+1} - p_{i-1,j-1} - 2\,p_{i-1,j} - p_{i-1,j+1}$$
$$Amp(D_{i,j}) = |dx_{i,j}| + |dy_{i,j}|$$
$$Ang(D_{i,j}) = \frac{180^\circ}{\pi} \arctan\!\left(\frac{dy_{i,j}}{dx_{i,j}}\right), \quad |Ang(D_{i,j})| < 90^\circ$$
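A sketch of the edge-vector computation for one interior pixel, using the standard 3x3 Sobel kernels. Two assumptions are made here: the image is stored as rows, so p(i, j) is read as img[j][i] (i is the column, j is the row), and the angle is set to 90 degrees when dx is zero, where the arctangent is undefined. With this convention a vertical edge gives dx = 0, which matches the vertical-mode rule of the histogram.

```python
import math


def edge_vector(img, i, j):
    """Return (dx, dy, amplitude, angle in degrees) at interior pixel (i, j).

    Assumption: img is a list of rows and p(i, j) = img[j][i].
    """
    def p(x, y):
        return img[y][x]

    dx = (p(i - 1, j + 1) + 2 * p(i, j + 1) + p(i + 1, j + 1)
          - p(i - 1, j - 1) - 2 * p(i, j - 1) - p(i + 1, j - 1))
    dy = (p(i + 1, j - 1) + 2 * p(i + 1, j) + p(i + 1, j + 1)
          - p(i - 1, j - 1) - 2 * p(i - 1, j) - p(i - 1, j + 1))
    amp = abs(dx) + abs(dy)
    if dx == 0:
        # Assumption: treat the undefined arctan as 90 degrees (0 if flat).
        ang = 90.0 if dy != 0 else 0.0
    else:
        ang = math.degrees(math.atan(dy / dx))
    return dx, dy, amp, ang
```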
Edge Direction Histogram for Intra_4x4
$90^\circ / 8 = 11.25^\circ$, $\tan(11.25^\circ) = 0.199$
$$\delta = \frac{dy_{i,j}}{dx_{i,j}} = \tan(Ang)$$
if { ($dx_{i,j} = 0$ & $dy_{i,j} \neq 0$) or ($|\delta| > 5.027$) }
  histo(0) += $Amp(D_{i,j})$
elseif ($|\delta| < 0.199$)
  histo(1) += $Amp(D_{i,j})$
elseif ($0.199 < \delta < 0.668$)
  histo(6) += $Amp(D_{i,j})$
.....
elseif ($-1.497 < \delta < -0.668$)
  histo(3) += $Amp(D_{i,j})$
elseif ($-0.668 < \delta < -0.199$)
  histo(8) += $Amp(D_{i,j})$
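The classification can be sketched as below. Only the rules for modes 0, 1, 6, 3 and 8 are stated on the slide; the ranges used here for modes 4, 5 and 7 are assumptions filled in by symmetry, and the handling of values that fall exactly on a threshold is also an assumption.

```python
# tan(11.25), tan(33.75), tan(56.25), tan(78.75) degrees
T1, T2, T3, T4 = 0.199, 0.668, 1.497, 5.027


def mode_for_edge_4x4(dx, dy):
    """Map one edge vector to an Intra 4x4 directional mode."""
    if dx == 0 and dy == 0:
        return None              # no edge: nothing to accumulate (assumption)
    if dx == 0:
        return 0                 # vertical
    d = dy / dx
    if abs(d) > T4:
        return 0                 # vertical
    if abs(d) < T1:
        return 1                 # horizontal
    if d > 0:
        if d < T2:
            return 6             # horizontal-down
        if d < T3:
            return 4             # diagonal down-right (assumed by symmetry)
        return 5                 # vertical-right (assumed by symmetry)
    if d > -T2:
        return 8                 # horizontal-up
    if d > -T3:
        return 3                 # diagonal down-left
    return 7                     # vertical-left (assumed by symmetry)


def edge_histogram_4x4(edges):
    """Accumulate |dx| + |dy| per mode over the (dx, dy) pairs of a block."""
    histo = [0] * 9
    for dx, dy in edges:
        mode = mode_for_edge_4x4(dx, dy)
        if mode is not None:
            histo[mode] += abs(dx) + abs(dy)
    return histo
```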
FMD for Intra_4x4 Contd…
As per observations in Reference[5]:
- The ideal 4x4 mode is either the primary mode or
one of the two neighboring modes
- DC mode (Mode 2) is always evaluated
- Total Modes = 1(Prime) + 2 (neighbors) + DC = 4
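Selecting the four candidates from a finished histogram might look like this. The ordering of modes by angle, and the wrap-around that makes modes 8 and 1 neighbours, are assumptions made for this sketch; the slide only says that the two neighbouring modes are kept.

```python
# Assumption: directional modes sorted by edge angle, treated as a ring.
ANGULAR_ORDER = [1, 6, 4, 5, 0, 7, 3, 8]
DC_MODE = 2


def intra4x4_candidates(histo):
    """Return [prime, neighbour, neighbour, DC] from a 9-bin histogram."""
    directional = [m for m in range(9) if m != DC_MODE]
    prime = max(directional, key=lambda m: histo[m])   # strongest direction
    k = ANGULAR_ORDER.index(prime)
    before = ANGULAR_ORDER[(k - 1) % len(ANGULAR_ORDER)]
    after = ANGULAR_ORDER[(k + 1) % len(ANGULAR_ORDER)]
    return [prime, before, after, DC_MODE]
```

Only these 4 modes would then go through RDO instead of all 9.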
Edge Direction Histogram for Intra_16x16
if ($|\delta| > 2.414$)
  histo(0) += $Amp(D_{i,j})$
elseif ($|\delta| < 0.414$)
  histo(1) += $Amp(D_{i,j})$
else
  histo(3) += $Amp(D_{i,j})$
Total Modes = 1 (Prime) + DC = 2
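A sketch of the 16x16 variant, where only the vertical, horizontal and plane bins exist. The thresholds are tan(67.5°) = 2.414 and tan(22.5°) = 0.414; treating dx = 0 as vertical is an assumption carried over from the 4x4 rule, and the function names are hypothetical.

```python
def mode_for_edge_16x16(dx, dy):
    """Map one edge vector to Intra 16x16 mode 0, 1 or 3."""
    if dx == 0 and dy == 0:
        return None          # no edge (assumption: contributes nothing)
    if dx == 0:
        return 0             # vertical (assumption, as in the 4x4 rule)
    d = abs(dy / dx)
    if d > 2.414:
        return 0             # vertical
    if d < 0.414:
        return 1             # horizontal
    return 3                 # plane


def intra16x16_candidates(edges):
    """Return [prime mode, DC] for the (dx, dy) pairs of one MB."""
    histo = {0: 0, 1: 0, 3: 0}
    for dx, dy in edges:
        mode = mode_for_edge_16x16(dx, dy)
        if mode is not None:
            histo[mode] += abs(dx) + abs(dy)
    prime = max(histo, key=histo.get)
    return [prime, 2]
```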
Fast Mode Decision-Inter

Main idea: If we can reasonably decide that an MB is temporally stationary or spatially homogeneous, we can encode the MB using a larger block size and safely skip all other modes!
Stationary Region Determination

Refers to the stillness between consecutive frames in the temporal dimension
Evaluate Zero-MV Diff:
$$Diff = \sum_{i=1,\,j=1}^{16,\,16} \mathrm{abs}\big(M(i,j) - N(i,j)\big)$$
If (Diff < Threshold Ts) => “Stationary”
So, choose 16x16 mode and skip other sizes!
Threshold Ts = 200 (Reference[6])
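The stationarity test is a sum of absolute differences at zero motion. In this sketch M is assumed to be the current 16x16 MB and N the co-located MB in the previous frame; the slide does not define them explicitly, and the function names are hypothetical.

```python
TS = 200  # threshold quoted on the slide from Reference [6]


def zero_mv_diff(cur_mb, ref_mb):
    """Sum of absolute differences between two 16x16 blocks (lists of rows)."""
    return sum(
        abs(m - n)
        for row_m, row_n in zip(cur_mb, ref_mb)
        for m, n in zip(row_m, row_n)
    )


def is_stationary(cur_mb, ref_mb, threshold=TS):
    """True when the MB barely changed, so only 16x16 needs to be tried."""
    return zero_mv_diff(cur_mb, ref_mb) < threshold
```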

Homogeneous Region Determination


Refers to texture similarities inside a single video frame
Edge amplitude computation $Amp(D_{i,j}) = |dx_{i,j}| + |dy_{i,j}|$ is already done in fast intra mode decision
$$H_{r,c} = \begin{cases} 1, & \text{if } \sum_{i,j \in N \times N} Amp(D_{i,j}) < Thd_H \\ 0, & \text{if } \sum_{i,j \in N \times N} Amp(D_{i,j}) \ge Thd_H \end{cases}$$
Threshold values (Reference[6]):
for 16x16 block : 20000
for 8x8 block : 5000
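The homogeneity flag reuses the edge amplitudes from the intra stage. A sketch, assuming the amplitudes are available as a 2-D list indexed [row][column] and that (row, col) is the top-left corner of the N x N block; both the layout and the function name are hypothetical.

```python
THD_H = {16: 20000, 8: 5000}  # thresholds quoted from Reference [6]


def is_homogeneous(amp, row, col, size):
    """True when the summed edge amplitude of the block is below Thd_H.

    amp  : 2-D list of Amp(D) values, indexed [row][column] (assumption)
    size : block size N, either 16 or 8
    """
    total = sum(
        amp[r][c]
        for r in range(row, row + size)
        for c in range(col, col + size)
    )
    return total < THD_H[size]
```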
Flow Chart of FMD_Inter
Wait...
Changing the mode: Theory to Practice!
H.264/AVC Profiles

Q. What is x264?

‘x264’:

Open source H.264/AVC encoder by VideoLAN
‘C’ code library, Platform: Linux
Optimized as compared to reference JSVM software

Many encoding options!


We finalized the options for benchmarking the performance of the non-FMD vs. FMD case
E.g.:
Command to encode the ‘foreman_qcif.yuv’ sequence:

./x264 -o foreman_qcif.264 foreman_qcif.yuv 176x144 --profile baseline --frames 30 --verbose --keyint 15 --min-keyint 15 --no-scenecut --bframes 0 --ref 1 --slices 1 --fps 15 --qp 25 --partitions all --weightp 0 --me esa --subme 7 --no-chroma-me --no-8x8dct --trellis 0 --no-fast-pskip --visualize
X264 Coding Options:









--keyint 15/--min-keyint 15: Sets GOP size to 15
--bframes 0: Disables B-frames
--slices 1: Sets 1 slice per frame
--ref 1: Only 1 frame can be used as reference
--me esa: Selects exhaustive-search motion estimation
--no-chroma-me: Ignores chroma in motion estimation
--qp 25: Fixed quantization parameter (constant-QP mode)
--partitions all: Considers all possible partitions
--no-scenecut: Disables adaptive I-frame decision
Implementation I:
‘Generation of Mode Statistics’




Intra MB: 3 Types :: I_4x4=0 (11 Modes), I_16x16=2 (4 Modes), I_PCM=3
Inter MB: 3 Types :: P_L0=4, P_8x8=5, P_SKIP=6
P_L0 (Level-1): can have 3 Partitions: D_16x8=14, D_8x16=15, D_16x16=16
P_8x8 (Level-2): has D_8x8 partition and can have 4 Sub-partitions: D_L0_8x8=3, D_L0_4x4=0, D_L0_8x4=1, D_L0_4x8=2
Implementation II: ‘Visualization Utility’
I-Frame
RED : Intra_4x4
CYAN: Intra_16x16
P-Frame
GREEN: P_SKIP
BLUE: P_8X8 (and below)
MAGENTA: P_16x16,P_16x8, P_8x16
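The colour coding boils down to a lookup from the MB type code in the statistics file to a colour. The project's utility is a Matlab script; the Python sketch below only illustrates the mapping, using the type codes from the mode-statistics slide, and the names are hypothetical.

```python
# MB type codes as listed on the mode-statistics slide.
MB_TYPE = {0: "I_4x4", 2: "I_16x16", 3: "I_PCM",
           4: "P_L0", 5: "P_8x8", 6: "P_SKIP"}

# Colour legend of the visualization slide.
COLOR = {
    "I_4x4": "red",
    "I_16x16": "cyan",
    "P_SKIP": "green",
    "P_8x8": "blue",       # 8x8 and below
    "P_L0": "magenta",     # 16x16, 16x8, 8x16
}


def mb_color(type_code):
    """Return the overlay colour for an MB type code, or None if unlisted."""
    return COLOR.get(MB_TYPE[type_code])
```

I_PCM has no colour in the legend, so the lookup returns None for it.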
Motive: “Seeing is Believing!”
Let’s see a Demo…
Key observations:
I-Frame:
16x16 size chosen for spatially homogeneous regions
4x4 size chosen for MBs with many spatial details/local edges
------------------------------------------------------------------------------------
P-Frame:

Sequence    % of Skipped    % of Inter    % of Intra
Akiyo       78.2            21.8          0
Football    6.6             81            12.4
Foreman     17.5            81.9          0.6
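Percentages of this kind can be derived from the per-MB type codes of the P-frames. A sketch, assuming the statistics are already loaded as a flat list of type codes (the file format itself is not shown here, so the input is hypothetical).

```python
INTRA_TYPES = {0, 2, 3}   # I_4x4, I_16x16, I_PCM
INTER_TYPES = {4, 5}      # P_L0, P_8x8
SKIP_TYPE = 6             # P_SKIP


def pframe_mode_shares(type_codes):
    """Return (% skipped, % inter, % intra) over a list of MB type codes."""
    total = len(type_codes)
    skipped = sum(1 for t in type_codes if t == SKIP_TYPE)
    inter = sum(1 for t in type_codes if t in INTER_TYPES)
    intra = sum(1 for t in type_codes if t in INTRA_TYPES)
    return tuple(round(100 * n / total, 1) for n in (skipped, inter, intra))
```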
Contd…
Though H.264 allows variable-size MC down to 4x4…

Real-world video sequences contain a certain percentage of ‘Skipped’ blocks

Spatially homogeneous regions are best compensated with 16x16
(such blocks have similar motion and are very seldom split into smaller blocks)


Temporally stationary blocks (e.g. stationary background, even with strong edges) are best compensated with 16x16 or P_SKIP
Nonetheless,
blocks containing motion boundaries or motion in smaller objects benefit from 8x8 or 4x4 MC
Implementation III: FMD Intra in x264
~1000 lines of C code:
Edge map computation,
Prime mode computation based on histogram,
Modification of mode decision logic in x264
Number of candidate modes in Intra-FMD:

Component       Block Size    Total # of modes    # of modes selected
Luma (Y)        4x4           9                   4
Luma (Y)        16x16         4                   2
Chroma (U,V)    8x8           4                   3 or 2
Results: Intra FMD (All I frames, Qp=25)

Sequence    ΔTIME (%)    ΔPSNR_Y (dB)    ΔPSNR_U (dB)    ΔPSNR_V (dB)    ΔPSNR_AVG (dB)
Mobile      -30.22       -0.022           0.007           0.006          -0.016
Akiyo       -35.81       -0.181          -0.091          -0.091          -0.165
Paris       -39.15       -0.067           0.006          -0.02           -0.055
Foreman     -38.88       -0.154          -0.014          -0.042          -0.125
Football    -38.84       -0.084          -0.066          -0.068          -0.198

Avg. Time Saving: 36.70%
Avg. PSNR drop: 0.11 dB
Results: Intra FMD (PSNR vs R)
Sequence: Mobile, Coding: All I, Qp= 37,33,29,25
Avg PSNR drop: 0.044 dB, Avg. Increase in R: ~6%, Avg Time Saving: 37.51%
Summary and future work:
To Conclude:




Learnt x264 code-flow, different encoding options
Matlab ‘mode visualization script’ is ready
Intra-FMD ready, Inter-FMD (in progress)
Important: FMD framework is ready! Different FMD algorithms can
be plugged in to evaluate prime mode selection…
Future Work:


Inter FMD
FMD enhancement: Analysis of different modes with conditional
probabilistic model
References







[1] URL: http://www.videolan.org/developers/x264.html
[2] Thomas Wiegand, Gary J. Sullivan, “Overview of the H.264/AVC Video Coding Standard”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 13, No. 7, July 2003
[3] White Paper: “An Overview of H.264 Advanced Video Coding”, URL: http://www.vcodex.com/files/H.264_overview.pdf
[4] Iain E. G. Richardson, “H.264 and MPEG-4 Video Compression”, Wiley, 2003
[5] Feng Pan et al., “Fast Mode Decision Algorithm for Intra-prediction in H.264/AVC Video Coding”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 15, No. 7, July 2005
[6] D. Wu et al., “Fast Intermode Decision in H.264/AVC Video Coding”, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 15, No. 6, July 2005
[7] Rui Su, Guizhong Liu, Tongyu Zhang, “Fast Mode Decision Algorithm for Intra Prediction in H.264/AVC”, ICASSP 2006
