Multimodal Vision Transformers with Forced Attention for Behavior Analysis | IEEE Conference Publication | IEEE Xplore