Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training | IEEE Conference Publication