Vision-and-Language Navigation Based on Cross-Modal Feature Fusion in Indoor Environment