Reinforcement learning of LQR control policy by a double inverted-pendulum biomechanical model | IEEE Conference Publication | IEEE Xplore