I. Introduction
In robot simulation, the acceleration of the robot system must be computed for any applied force, a problem known as forward dynamics (FD) [1]. Although FD has been thoroughly explored, fast FD computation remains a relevant issue in emerging subfields of robotics. For example, while hyper-redundant robots for dexterous manipulation provide opportunities for working in complex environments [2], their large number of degrees of freedom (DOF) poses new challenges to fast simulation. In addition, given a kinematic reference trajectory of a robot, dynamically feasible trajectories can be generated using the concept of the dynamics filter proposed in [3], by evaluating the dynamics at a large number of states sampled around the reference and under different control inputs. An optimal trajectory can then be selected [4]. Such applications, which involve a vast number of FD calculations, also require high computational efficiency and motivate our current research on accelerating FD computation on a CPU-GPU platform.