I. Introduction
Real-time systems are increasingly moving towards multi-core and many-core processors, which serve as an enabling platform for embedded applications that require real-time guarantees, energy efficiency, and high performance. To exploit the capability of a multi-core platform, it is vital to consider intra-task parallelism, where an individual task can utilize multiple cores simultaneously. Intra-task parallelism facilitates a balanced distribution of the tasks among the processors, which leads to energy efficiency [1]. One of the most generalized workload models for representing deterministic intra-task parallelism is the Directed Acyclic Graph (DAG) task [2]. Under this abstraction, the nodes of a DAG represent threads of execution, while the edges represent the dependencies among them. In the past five years or so, considerable effort has been devoted to developing real-time scheduling strategies and schedulability analyses for DAG tasks [2]–[9].
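To make the DAG abstraction concrete, the following minimal sketch encodes a single task as a set of nodes (threads of execution) and precedence edges (dependencies). The node names, the execution-time values, and the helper functions `total_work` and `critical_path_length` are hypothetical and chosen purely for illustration; they are not taken from the cited works. The two quantities computed here are commonly used to characterize a DAG task, under the assumption that each node carries a worst-case execution time.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical DAG task: node -> worst-case execution time (illustrative values).
wcet = {"a": 2, "b": 3, "c": 1, "d": 4}

# Edges expressed as node -> set of predecessors that must finish first.
# Here "b" and "c" may run in parallel once "a" completes; "d" waits for both.
preds = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}}


def total_work(wcet):
    """Sum of all node execution times (time needed on a single core)."""
    return sum(wcet.values())


def critical_path_length(wcet, preds):
    """Length of the longest dependency chain (lower bound on any schedule)."""
    finish = {}
    # static_order() yields every node after all of its predecessors.
    for node in TopologicalSorter(preds).static_order():
        start = max((finish[p] for p in preds[node]), default=0)
        finish[node] = start + wcet[node]
    return max(finish.values(), default=0)


print(total_work(wcet))                   # 10
print(critical_path_length(wcet, preds))  # 9
```

In this example the task needs 10 time units on one core, but no number of cores can finish it in fewer than 9, because the chain a, b, d must execute sequentially.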