Learning to Recognise Words Using Visually Grounded Speech | IEEE Conference Publication | IEEE Xplore