Abstract: In face anti-spoofing (FAS) research, most existing techniques depend on RGB or IR images, which lack sufficient biometric features and are vulnerable to ever-advancing presentation attacks. In this paper, a Transformer model based on the combination of multiple facial regions is proposed to introduce multi-spectral technology into the task of face liveness detection, aiming to capture the unique biological features of real faces and make them easier to distinguish from fake faces. In the proposed model, multi-spectral images are used to broaden the spectral dimension and provide more reflection information, which helps distinguish different materials. In addition, a pixel-by-pixel spectral normalization is applied as a preprocessing step to reduce the impact of environmental illumination variations and to make the facial reflection features more consistent within each region. Multiple core facial regions, such as the eyes, nose, mouth and cheeks, are then selected as the input of the deep learning model. Furthermore, a Transformer-based model is constructed to extract both the local features of each region and the association features between different facial regions, which are integrated into complete facial biometric features for face liveness detection. On the author’s self-built multi-spectral facial datasets, the results show that the proposed method achieved an accuracy of 95.72% for and a misclassification rate of 5.10% for liveness detection, which is superior to commonly used FAS models.
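
The preprocessing described above can be illustrated with a minimal sketch. The abstract does not give the normalization formula, the region coordinates, or the way regions are turned into model inputs, so everything here is an assumption for illustration only: each pixel's spectrum is scaled to unit L2 norm across bands (which cancels a uniform change in illumination intensity), the region boxes are hypothetical fixed coordinates, and each region is reduced to one token by mean pooling. The function names `normalize_spectra` and `region_tokens` are likewise hypothetical.

```python
import numpy as np


def normalize_spectra(cube, eps=1e-8):
    """Scale each pixel's spectrum to unit L2 norm along the band axis.

    Assumed normalization (not taken from the paper): dividing by the
    per-pixel norm removes a uniform brightness factor at that pixel.
    """
    cube = np.asarray(cube, dtype=np.float64)
    norms = np.linalg.norm(cube, axis=-1, keepdims=True)
    return cube / (norms + eps)


def region_tokens(cube, boxes):
    """Mean-pool the spectra inside each box, giving one token per region.

    Mean pooling is an illustrative choice; a real model would more likely
    embed the full region patch.
    """
    tokens = []
    for top, left, bottom, right in boxes.values():
        patch = cube[top:bottom, left:right, :]
        tokens.append(patch.mean(axis=(0, 1)))
    return np.stack(tokens)


# Synthetic 64x64 face crop with 8 spectral bands (placeholder data).
rng = np.random.default_rng(0)
cube = rng.uniform(0.1, 1.0, size=(64, 64, 8))

# Hypothetical region boxes as (top, left, bottom, right) pixel indices.
boxes = {
    "left_eye": (16, 10, 26, 28),
    "right_eye": (16, 36, 26, 54),
    "nose": (26, 24, 42, 40),
    "mouth": (44, 20, 54, 44),
    "left_cheek": (30, 6, 46, 22),
    "right_cheek": (30, 42, 46, 58),
}

norm_cube = normalize_spectra(cube)
brighter = normalize_spectra(2.5 * cube)  # same scene, stronger illumination

# One row per facial region, one column per spectral band.
tokens = region_tokens(norm_cube, boxes)
print(tokens.shape)
print(np.allclose(norm_cube, brighter))
```

In this sketch the rows of `tokens` play the role of the per-region inputs that a Transformer encoder would attend over, so that self-attention can model the associations between regions. The final comparison checks that scaling the illumination by a constant leaves the normalized spectra unchanged.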