I. Introduction
To enable advanced video compression for emerging applications (e.g., ultra-high-definition (UHD) 4K/8K TV), the High Efficiency Video Coding (HEVC) standard has recently been established. Compared with H.264, HEVC improves compression efficiency by around 50% at the same video quality [1], [2], which gives it great potential to serve these time-constrained, high-resolution video coding applications. To achieve this compression efficiency, new concepts and features have been introduced in the HEVC intra encoder [3]. For example, the coding tree unit (CTU) in HEVC replaces the macroblock in H.264. The CTU size may vary from 16×16 to 64×64. Each CTU contains one or more coding units (CUs), whose sizes range from 8×8 up to 64×64. Following a quad-tree structure, a CU may be split into four smaller CUs. Each CU is associated with its prediction units (PUs) and transform units (TUs). A PU, whose size ranges from 4×4 to 64×64, carries luma and chroma prediction information. The discrete sine transform (DST) and the discrete cosine transform (DCT) are used in TUs to transform prediction errors: DST handles luma prediction residuals in 4×4 TUs, while all other TUs are transformed by DCT. In addition to the CTU concept, the number of intra prediction modes increases to 35, comprising planar, DC, and 33 directional modes. The CU/PU/TU concept and the 35 prediction modes open up a much larger design space to explore than in H.264. Despite the enhanced compression efficiency, the resulting computational complexity of the HEVC intra encoder is tremendous [2]–[5].
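To give a rough sense of the design space created by quad-tree CU splitting, the following Python sketch lists the CU sizes reachable from one CTU and counts the distinct CU partitionings of that CTU. It is only an illustration under stated assumptions: a 64×64 CTU and an 8×8 minimum CU are assumed, the function names `cu_sizes` and `count_cu_partitions` are hypothetical, and PU/TU choices and the 35 intra modes are deliberately ignored, so the true search space is larger still.

```python
def cu_sizes(ctu_size=64, min_cu_size=8):
    """List the CU edge lengths reachable by repeated quad-tree splitting.

    Assumes square blocks whose edge halves at each split.
    """
    sizes = []
    size = ctu_size
    while size >= min_cu_size:
        sizes.append(size)
        size //= 2
    return sizes


def count_cu_partitions(size=64, min_cu_size=8):
    """Count distinct quad-tree CU partitionings of one size x size block.

    A block is either kept whole (1 way) or split into four sub-blocks,
    each of which is partitioned independently (sub ** 4 ways).
    """
    if size <= min_cu_size:
        return 1  # the smallest CU cannot be split further
    sub = count_cu_partitions(size // 2, min_cu_size)
    return 1 + sub ** 4


if __name__ == "__main__":
    print(cu_sizes())             # [64, 32, 16, 8]
    print(count_cu_partitions())  # 83522
```

Even before PU sizes, TU sizes, and prediction modes are considered, a single 64×64 CTU already admits 83522 CU partitionings under these assumptions, which is consistent with the complexity concern raised above.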