Optimal Control of Discrete-Time Markov Jump Systems with Unknown System Dynamics: A Parallel Reinforcement Learning Scheme | IEEE Conference Publication