Attn-VAE-GAN: Text-Driven High-Fidelity Image Generation Model with Deep Fusion of Self-Attention and Variational Autoencoder | IEEE Conference Publication | IEEE Xplore