Multimodal Speaker Recognition: Combining FFT, CNN, Speech-to-Text, BERT-Based Punctuation Restoration and Sentence Correction | IEEE Conference Publication