I. INTRODUCTION
The ability to discriminate whether two stimuli are simultaneous is important for determining whether they should be bound together to form a single multisensory perceptual object [1]. Not surprisingly, the ability of humans to reliably detect asynchronies and discriminate the temporal order of two stimuli is among the oldest questions in experimental psychology (e.g., [2]). In the seminal work of Hirsh and Sherrick on simultaneity discrimination [3], well-trained participants were presented with simple stimulus pairs composed of audio–visual, visual–tactile, and audio–tactile stimuli and could reliably report stimulus order at asynchronies of about 20 ms, irrespective of the modalities presented. More recent studies suggest that non-experts might not be able to detect such small asynchronies and that performance might differ widely across the population. For example, it has been shown that naïve participants could detect asynchronies between a short light and a vibration only at 35–65 ms or more [4][5]. The stimuli used in these experiments were not coupled with the participant’s motion. In a study where participants used a force-feedback joystick to make a cursor hit a line and judged whether the collision was simultaneous with the onset of the force produced by the joystick, the threshold for simultaneity perception was 59 ms when the force came first and 44 ms when the cursor hit the line first [6]. In a study employing a touchscreen, it was determined that, for users to perceive feedback as simultaneous with their touch, haptic latency should be at most 50 ms, audio latency at most 70 ms, and visual latency at most 85 ms [7].