Audio-Visual Person Verification Based on Recursive Fusion of Joint Cross-Attention (IEEE Conference Publication)