Multi-Modality Speech Recognition Driven by Background Visual Scenes | IEEE Conference Publication | IEEE Xplore