Face, Body, Voice: Video Person-Clustering with Multiple Modalities (IEEE Conference Publication)