Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-Trained Representations | IEEE Conference Publication