M3: Multimodal Memory Modelling for Video Captioning | IEEE Conference Publication | IEEE Xplore