I. Introduction
IN recent years, workflow technologies have facilitated centralized information management and the automated processing of scientific and industrial applications from various fields, including medicine, education, and finance [1]. Benefiting from the efficiency and convenience of workflow technologies, a vast number of applications modeled as workflows have evolved rapidly, and terabytes or even petabytes of data resources are required for workflow execution [2]. The execution performance of workflow applications depends to a great extent on the processing of tasks as well as on an adequate supply of data resources [3], [4]. In the big data era, workflow applications have become more complicated due to the exponentially growing scale of required datasets, the increasing number of data sources, and the complex executive components of the applications. Execution performance is seriously degraded when the supply of processing and storage resources is insufficient. Accordingly, the large pool of storage and processing resources offered by cloud infrastructure is provisioned for executing workflow applications [3], [5].