Learning Multi-Modal Class-Specific Tokens for Weakly Supervised Dense Object Localization | IEEE Conference Publication | IEEE Xplore