ECE 408 Final Project
Fall 2013
Groups of 3 preferred
Groups of 1-2 possible w/ prior approval
Look for a group on Piazza
2 Project options
– HEVC Intraframe Prediction Competition
– A topic from your own research
• Intraframe prediction for HEVC video encoder
• Fixed task, groups compete to see who can build
the fastest implementation
• Evaluation metric will be a weighted mixture of
PCIe I/O time and total time
• Winning team gets iPads
– Sponsored by MultiCoreWare
Intra(frame) Prediction
• Part of H.265 (aka HEVC) video format
– Successor to H.264, the most popular current format
– Achieves higher PSNR with lower bitrate by using
more computationally expensive methods
• Idea: Real video frames exhibit structure
– A pixel’s color can be predicted from the color of its
neighbors within the same frame (intraframe) or from
recent frames (interframe)
– Encode a block of pixels as a prediction mode + a
residual or delta from that prediction
– Should be smaller than coding pixel values directly
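The prediction + residual idea can be sketched as follows. This is only an illustration: the function names are made up for this example, and the residual is not transformed or quantized as it would be in a real encoder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encoder side: residual = original - prediction, one signed value per pixel.
// (computeResidual is a hypothetical name, not part of any provided skeleton.)
std::vector<int16_t> computeResidual(const std::vector<uint8_t>& original,
                                     const std::vector<uint8_t>& predicted) {
    assert(original.size() == predicted.size());
    std::vector<int16_t> residual(original.size());
    for (std::size_t i = 0; i < original.size(); ++i) {
        residual[i] = static_cast<int16_t>(original[i] - predicted[i]);
    }
    return residual;
}

// Decoder side: prediction + residual gives back the original pixels
// (exactly here, because this sketch does not quantize the residual).
std::vector<uint8_t> reconstruct(const std::vector<uint8_t>& predicted,
                                 const std::vector<int16_t>& residual) {
    assert(predicted.size() == residual.size());
    std::vector<uint8_t> pixels(predicted.size());
    for (std::size_t i = 0; i < predicted.size(); ++i) {
        pixels[i] = static_cast<uint8_t>(predicted[i] + residual[i]);
    }
    return pixels;
}
```

A good prediction leaves residual values close to zero, which take fewer bits to code than the raw pixel values.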
HEVC Intra Prediction Modes
• Frames are processed in blocks of 4x4 to 64x64 pixels in (mostly) top-left to bottom-right order
– We can use the (previously processed) upper and left neighboring
pixels to estimate (predict) the current block of pixels
• Video consists of 1 luma and 2 chroma channels (YCC colorspace)
– 4:2:0 subsampling means luma has 2x the x and y resolution of chroma
– Prediction is done separately for all 3 channels
• Three patterns that are seen a lot in video are flat regions, smooth
gradients, and straight edges
• We can predict a block of pixels as:
– The average of its neighbors (DC)
– A smooth gradient based on its neighbors (Planar)
– A linear extension of its neighbors in one of 33 directions (Angular)
• 35 total modes (up from 9 in H.264: DC + 8 directional)
DC Mode
[Figure: Current Block with its Top Neighbor and Left Neighbor blocks; the other neighboring blocks are marked Don’t Care]
• Predict that all pixels in the block are the average of
the edge pixels of top and left neighbor blocks
• Good at compressing flat regions (one color)
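A minimal serial sketch of DC prediction, assuming `top` holds the N pixels directly above the block and `left` the N pixels directly to its left (the function name and data layout are assumptions for illustration). It omits any boundary smoothing a real encoder may apply on top of the plain average.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fill an N x N block (row-major) with the rounded average of the
// 2N edge pixels taken from the top and left neighbors.
std::vector<uint8_t> predictDC(const std::vector<uint8_t>& top,
                               const std::vector<uint8_t>& left) {
    const std::size_t n = top.size();
    assert(n > 0 && left.size() == n);
    std::size_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += top[i];
        sum += left[i];
    }
    // Adding n before dividing by 2n rounds to nearest instead of truncating.
    const uint8_t dc = static_cast<uint8_t>((sum + n) / (2 * n));
    return std::vector<uint8_t>(n * n, dc);
}
```

For power-of-two block sizes the division can be replaced by a right shift, which keeps the whole computation in cheap integer arithmetic.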
Planar Mode
[Figure: Current Block with its Top Neighbor and Left Neighbor blocks; the other neighboring blocks are marked Don’t Care]
• Predict that the block forms a smooth gradient defined by its top and left neighbors
• Computed as the average of two linear interpolations (less expensive than bilinear)
• Good at compressing smoothly varying regions
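The "average of two linear interpolations" can be sketched as below. The layout is an assumption for illustration: `top` and `left` each hold N + 1 pixels, where `top[n]` is the top-right pixel and `left[n]` is the bottom-left pixel. The formula follows the usual description of HEVC planar prediction; check it against the linked references before relying on it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Planar prediction for an n x n block (row-major output).
// (predictPlanar is a hypothetical name for this sketch.)
std::vector<uint8_t> predictPlanar(const std::vector<uint8_t>& top,
                                   const std::vector<uint8_t>& left, int n) {
    assert(n > 0);
    assert(top.size() == static_cast<std::size_t>(n + 1));
    assert(left.size() == static_cast<std::size_t>(n + 1));
    const int topRight = top[n];
    const int bottomLeft = left[n];
    std::vector<uint8_t> pred(static_cast<std::size_t>(n) * n);
    for (int y = 0; y < n; ++y) {
        for (int x = 0; x < n; ++x) {
            // Horizontal interpolation: left pixel of this row -> top-right pixel.
            const int horizontal = (n - 1 - x) * left[y] + (x + 1) * topRight;
            // Vertical interpolation: top pixel of this column -> bottom-left pixel.
            const int vertical = (n - 1 - y) * top[x] + (y + 1) * bottomLeft;
            // Each interpolation carries a weight of n; average the two with rounding.
            pred[y * n + x] =
                static_cast<uint8_t>((horizontal + vertical + n) / (2 * n));
        }
    }
    return pred;
}
```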
Angular Modes
• 33 directions
• More coverage close to horizontal and vertical
• Those directions are more common in real video
• Extend neighbor pixels into current block at specific angle
– Good at compressing areas with straight edges
• Often need to linearly interpolate between 2 neighbor pixels
• Formulated such that it can be done in integer arithmetic
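The integer interpolation can be sketched for one family of directions: those that lean from vertical toward the top-right. The assumptions here are for illustration only: `ref[0]` is the top-left corner pixel, `ref[1..2n]` are the top and top-right neighbors, and `angle` is the horizontal displacement per row in 1/32-pixel units (0 is pure vertical, 32 is 45 degrees). Directions with negative angles need left neighbors projected onto the top row and are not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Angular prediction for an n x n block, non-negative vertical-type angles only.
// (predictAngularVertical is a hypothetical name for this sketch.)
std::vector<uint8_t> predictAngularVertical(const std::vector<uint8_t>& ref,
                                            int n, int angle) {
    assert(n > 0 && angle >= 0 && angle <= 32);
    assert(ref.size() >= static_cast<std::size_t>(2 * n + 1));
    std::vector<uint8_t> pred(static_cast<std::size_t>(n) * n);
    for (int y = 0; y < n; ++y) {
        const int pos = (y + 1) * angle;  // displacement of this row, in 1/32 pixels
        const int idx = pos >> 5;         // whole-pixel part
        const int frac = pos & 31;        // fractional part, 0..31
        for (int x = 0; x < n; ++x) {
            const int a = ref[x + idx + 1];
            // Read the second sample only when it contributes to the result.
            const int b = (frac != 0) ? ref[x + idx + 2] : a;
            // Weighted average of two neighbors, rounded, integer arithmetic only.
            pred[y * n + x] =
                static_cast<uint8_t>(((32 - frac) * a + frac * b + 16) >> 5);
        }
    }
    return pred;
}
```

With `angle` set to 0 every row is a copy of the top neighbors; with 32 each row is the row above shifted by one pixel.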
• Sum of Absolute Differences (SAD) is a simple
way of measuring the disparity between two
blocks of pixels
• Sum of Absolute Transformed Differences (SATD)
does a Hadamard transform on the differences
before summing their absolute values
– More computationally complex
– Correlates better with subjective and objective (PSNR) quality measures
• SATD on an 8x8 block is commonly called SA8D
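Both measures can be sketched for a 4x4 block stored as 16 row-major pixels. The function names are made up for this example, and the SATD here is returned unnormalized; real encoders often scale the result, so compare against the reference code before matching numbers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Sum of absolute differences between two 4x4 blocks.
int sad4x4(const std::vector<uint8_t>& a, const std::vector<uint8_t>& b) {
    assert(a.size() == 16 && b.size() == 16);
    int sum = 0;
    for (int i = 0; i < 16; ++i) sum += std::abs(a[i] - b[i]);
    return sum;
}

// SATD: 2-D 4x4 Hadamard transform of the differences, then sum of magnitudes.
int satd4x4(const std::vector<uint8_t>& a, const std::vector<uint8_t>& b) {
    assert(a.size() == 16 && b.size() == 16);
    int d[16];
    int t[16];
    for (int i = 0; i < 16; ++i) d[i] = a[i] - b[i];
    // Horizontal pass: 4-point Hadamard butterfly on each row.
    for (int r = 0; r < 4; ++r) {
        const int* p = d + 4 * r;
        const int s0 = p[0] + p[1], s1 = p[0] - p[1];
        const int s2 = p[2] + p[3], s3 = p[2] - p[3];
        t[4 * r + 0] = s0 + s2;
        t[4 * r + 1] = s1 + s3;
        t[4 * r + 2] = s0 - s2;
        t[4 * r + 3] = s1 - s3;
    }
    // Vertical pass on each column, accumulating absolute values.
    int sum = 0;
    for (int c = 0; c < 4; ++c) {
        const int s0 = t[c] + t[4 + c], s1 = t[c] - t[4 + c];
        const int s2 = t[8 + c] + t[12 + c], s3 = t[8 + c] - t[12 + c];
        sum += std::abs(s0 + s2) + std::abs(s1 + s3) +
               std::abs(s0 - s2) + std::abs(s1 - s3);
    }
    return sum;
}
```

The transform uses only additions and subtractions, which is why SATD stays affordable despite being more complex than SAD.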
Your Task
• For 4x4, 8x8, 16x16, 32x32, 64x64 pred. blocks:
– Assume the entire frame is a regular grid
– For each luma and chroma block:
• For each of the 35 prediction modes:
– Use reference pixels directly for neighbors (no reconstruction)
– Compute predicted pixel values
– Compute SATD between prediction and reference pixels
• Return list of <mode, SATD> tuples sorted by SATD (best to worst)
• Your kernel may operate on one or multiple frames
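The per-block output described above can be sketched serially as follows. `ModeCost`, `rankModes`, and the `satdForMode` callback are hypothetical names for this illustration; the actual interface is whatever the provided code skeleton defines.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// One <mode, SATD> tuple.
struct ModeCost {
    int mode;
    int satd;
};

// Evaluate every prediction mode for one block and return the tuples
// sorted by SATD, best (lowest) first. satdForMode is assumed to predict
// the block with the given mode and return its SATD against the
// reference pixels.
std::vector<ModeCost> rankModes(const std::function<int(int)>& satdForMode,
                                int numModes = 35) {
    assert(numModes > 0);
    std::vector<ModeCost> result;
    result.reserve(static_cast<std::size_t>(numModes));
    for (int mode = 0; mode < numModes; ++mode) {
        result.push_back({mode, satdForMode(mode)});
    }
    // stable_sort keeps the lower mode number first when two SATDs tie.
    std::stable_sort(result.begin(), result.end(),
                     [](const ModeCost& a, const ModeCost& b) {
                         return a.satd < b.satd;
                     });
    return result;
}
```

Every (block, mode) pair is independent of the others, which is what makes the task a natural fit for a GPU kernel.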
• We will provide a code skeleton and test
harness, as with the labs
• We will link to resources with high-level and
low-level explanations of intra prediction
• The existing serial and vectorized x265 code is
also a good reference
• Your code should compile cleanly and run on
the GEM cluster’s C2050s
– We may get a newer (Kepler) evaluation machine
• We will measure total prediction time and
time for memcpy()s to and from the GPU
• Final metric will be a weighted average of total
time and I/O time (exact weights TBA)
• Each member of the winning team by this
metric will receive an iPad
Additional Challenge
• Two related challenges not counted towards the
competition and course grade are also available:
– DCT Primitives
– Loop Filters
• Teams can win iPads for one of these two
challenges if they:
– Meet performance standards (TBA)
– Perform better than any other team
– Meet code quality standards
– Contribute code to open source repository
DCT Primitives
• List of Primitives:
– Discrete Cosine Transform
– Quantization
– Dequantization
– Inverse Discrete Cosine Transform
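The four primitives form a round trip, sketched here in one dimension with a floating-point orthonormal DCT-II and a single uniform quantizer step. This is an assumption-laden illustration: HEVC itself specifies 2-D integer transform approximations and its own quantization rules, and all function names below are made up for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

static const double kPi = std::acos(-1.0);

// Forward orthonormal DCT-II.
std::vector<double> dct(const std::vector<double>& x) {
    const std::size_t n = x.size();
    assert(n > 0);
    std::vector<double> out(n, 0.0);
    for (std::size_t k = 0; k < n; ++k) {
        double sum = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            sum += x[i] * std::cos(kPi * (i + 0.5) * k / n);
        }
        out[k] = sum * std::sqrt((k == 0 ? 1.0 : 2.0) / n);
    }
    return out;
}

// Inverse of dct() above (DCT-III with the same scaling).
std::vector<double> idct(const std::vector<double>& c) {
    const std::size_t n = c.size();
    assert(n > 0);
    std::vector<double> out(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            out[i] += std::sqrt((k == 0 ? 1.0 : 2.0) / n) * c[k] *
                      std::cos(kPi * (i + 0.5) * k / n);
        }
    }
    return out;
}

// Quantization: map each coefficient to the nearest multiple of step.
std::vector<int> quantize(const std::vector<double>& c, double step) {
    assert(step > 0.0);
    std::vector<int> levels(c.size());
    for (std::size_t i = 0; i < c.size(); ++i) {
        levels[i] = static_cast<int>(std::lround(c[i] / step));
    }
    return levels;
}

// Dequantization: scale the integer levels back to coefficient values.
std::vector<double> dequantize(const std::vector<int>& levels, double step) {
    std::vector<double> c(levels.size());
    for (std::size_t i = 0; i < levels.size(); ++i) c[i] = levels[i] * step;
    return c;
}
```

Quantization is the only lossy step: `idct(dct(x))` returns `x` up to floating-point error, while a larger `step` discards more detail.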
Loop Filters
• Deblocking Filter:
– Block coding results in sharp edges in image
– DBF removes edges between blocks
• Sample Adaptive Offset (SAO) Filter:
– Reconstruct original amplitudes using offsets
– Band filter: categorize samples into 32 bands
– Edge filter: add offsets depending on neighbors
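The two SAO filters can be sketched per sample for 8-bit video. This is a simplified illustration under stated assumptions: offsets apply to four consecutive bands starting at `bandStart`, the edge classification compares a sample with its two neighbors along one direction, and the function names are made up. Check the exact rules against the HEVC specification.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Band filter: 256 sample values split into 32 bands of 8 values each.
// Samples falling in one of the 4 bands starting at bandStart get an offset.
uint8_t saoBandSample(uint8_t sample, int bandStart,
                      const std::vector<int>& offsets) {
    assert(bandStart >= 0 && bandStart < 32);
    assert(offsets.size() == 4);
    const int band = sample >> 3;  // 8-bit sample -> band index 0..31
    const int k = band - bandStart;
    int value = sample;
    if (k >= 0 && k < 4) value += offsets[static_cast<std::size_t>(k)];
    return static_cast<uint8_t>(std::min(255, std::max(0, value)));
}

// Edge filter classification of sample c against neighbors a and b:
// 1 = local minimum, 2 and 3 = edge corners, 4 = local maximum, 0 = none.
// The encoder would then add a per-category offset to the sample.
int saoEdgeCategory(int a, int c, int b) {
    if (c < a && c < b) return 1;
    if ((c < a && c == b) || (c == a && c < b)) return 2;
    if ((c > a && c == b) || (c == a && c > b)) return 3;
    if (c > a && c > b) return 4;
    return 0;
}
```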
• Infrastructure similar to competition will be provided
• Less support than competition
• October 31: Project Proposals due
– Only for students not doing the competition
– Oral in class (5 slides / 10 min)
• Week of November 18: Progress Reports
– Appointment with course staff (15 min)
• December 16: Final Project Presentations
• December 18: Final Project Report due
