Robust Cross-Modal Representation Learning with Progressive Self-Distillation | IEEE Conference Publication | IEEE Xplore