Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning | IEEE Conference Publication | IEEE Xplore