Speech Guided Masked Image Modeling for Visually Grounded Speech | IEEE Conference Publication | IEEE Xplore