I. Introduction
Sign language is a visual language used throughout the world. It conveys semantic information primarily through the signer's hand movements, supplemented by auxiliary cues such as facial expressions and lip movements, which the recipient perceives visually and processes in the brain.
In 2015, Tubaiz et al. [1] used data gloves to recognize 40 Arabic Sign Language sentences containing 80 words, combining the glove data with an optical camera to segment hand movements and align them with their corresponding sign language words. A modified K-nearest-neighbor algorithm was then used for classification, achieving a sentence recognition rate of 98.9%. In 2016, Wu et al. [2] proposed a wearable real-time American Sign Language (ASL) recognition system that captured hand and arm movements with an inertial measurement unit and surface electromyography (sEMG) sensors and recognized 80 isolated ASL words frequently used in daily conversation. Four machine learning methods were compared for classification, and the support vector machine classifier performed best, reaching recognition accuracies of 96.16% for the same volunteer and 85.24% across different volunteers. In 2015, Koller et al. [3] proposed a statistical approach to German Sign Language recognition that addressed five aspects: hand tracking, sign language features, signer dependence, visual modeling, and language modeling. Their work drew on SIGNUM and RWTH-Phoenix-Weather, two large open-source sign language datasets that have contributed to the development of sign language recognition.
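To make the classification stage of such sensor-based systems concrete, the following is a minimal sketch (not the implementation of [1] or [2]) of training a support vector machine on fixed-length feature vectors, of the kind that might be extracted from IMU or sEMG signals for isolated sign words. The dataset here is synthetic placeholder data; feature extraction, segmentation, and sensor fusion are assumed to have been done upstream.

```python
# Illustrative sketch: SVM classification of isolated sign words from
# fixed-length sensor feature vectors. Data below is synthetic and stands
# in for features extracted from IMU/sEMG recordings.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_words, samples_per_word, n_features = 80, 30, 64  # e.g. per-window IMU/sEMG statistics
X = rng.normal(size=(n_words * samples_per_word, n_features))
y = np.repeat(np.arange(n_words), samples_per_word)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Standardize features, then fit an RBF-kernel SVM classifier.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
clf.fit(X_train, y_train)
print(f"test accuracy on held-out samples: {clf.score(X_test, y_test):.3f}")
```

Evaluating on held-out samples from the same signers corresponds to the within-subject setting reported in [2]; a cross-subject evaluation would instead hold out all samples from selected volunteers.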