Transformer Based Multimodal Speech Emotion Recognition with Improved Neural Networks | IEEE Conference Publication