I. Introduction
Finding the optimal solution to a large-scale Markov Decision Process (MDP) is well known to be computationally demanding [9]. Approximate Dynamic Programming (ADP) algorithms [11] offer a plausible alternative by finding approximate solutions to MDPs that would otherwise be intractable. One appealing ADP technique incorporates a function approximation scheme into the problem to artificially reduce its complexity, and then searches for an approximate solution in a lower-dimensional space [3], [11], [12]. Although this approach has proved successful in practical applications, e.g., [14], there exist counterexamples in which it diverges, and these make a strong case against the robustness of the technique [4].