Short Video Representation Learning Based on Convolution Network with Text Attention Mechanism | IEEE Conference Publication | IEEE Xplore