I. Introduction
In recent years, face biometrics has been widely deployed in scenarios such as unlocking mobile phones and automated border control (ABC). Accordingly, face recognition (FR) research [33], [4], [5] has made significant progress over the past decade. With the wide application of FR systems, face presentation attack detection (PAD) has also attracted increasing attention. PAD aims to secure FR systems against presentation attacks (PAs), such as printed photos and replayed videos, which attackers can use to spoof an FR system by impersonating someone or obfuscating their own identity. Many works have leveraged deep learning techniques and achieved remarkable improvements on both FR and PAD problems.

However, the COVID-19 pandemic rendered conventional FR and PAD solutions less effective in many cases, as face masks present FR/PAD algorithms with unexpected face presentations. Damer et al. [11], [9] studied the effect of face masks on FR verification performance. Their experimental results showed that FR algorithms designed before the COVID-19 pandemic suffer performance degradation on masked faces. A follow-up study showed that this effect extends even to verification decisions made by human operators [8]. Subsequently, many methods have been developed to target the masked FR problem. For example, several works proposed training FR models with additional real or simulated masked face data [7], [1], [14], or training models to focus on the unmasked regions [27], [31]. Moreover, Boutros et al. [6] proposed the embedding unmasking model (EUM), which operates on top of existing face recognition models and is trained with the self-restrained triplet loss so that it produces embeddings similar to those of unmasked faces. Their method reduced the negative impact of wearing face masks on FR performance.

Despite the attention paid to the masked FR problem, masked PAD is still understudied. Fang et al. [17] presented the collaborative real mask attack (CRMA) database containing three types of PAs: the unmasked print/replay attack (AM0), the masked print/replay attack (AM1), and the partially masked attack, where spoof faces are partially covered by real masks (AM2). Figure 1 shows samples of the CRMA database. They conducted extensive experiments to explore the effect of masked bona fide samples, masked attacks, and partially masked attacks on face PAD behavior. Their experimental results indicated that masked bona fide samples and PAs dramatically decreased the performance of PAD algorithms. Furthermore, they showed that deep-learning-based methods performed worse on the partially masked attack (AM2) than on the masked attack (AM1) in most cases. Nevertheless, their work [17] only investigated the effect of masked PAs on PAD and FR performance using several PAD algorithms designed before the COVID-19 pandemic; no solutions were proposed to address the challenges raised by masked faces.

Therefore, to address masked face PAD, especially partially masked attacks, we introduce a solution that combines two novel modules: partial attack label (PAL) and regional weighted inference (RW). The PAL module is inspired by pixel-wise supervision [21], [25], [28]. However, instead of using a coherent map as the ground truth of a partial attack (AM2), as in [21], we propose annotating the region covered by the real mask as bona fide. This fine-grained partial attack label aims to enable better supervision during model training. Once the model is trained, RW is used in the inference phase to further optimize the PAD decision. RW is inspired by previous observations in [17], [18], [20] that the eye region contributes more significantly in different face-related tasks, such as PAD [17] or face image quality assessment [18], [19].
Based on this observation, we weight different regions of the predicted feature map and thus improve the performance of the PAD decision.
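
The two ideas can be illustrated with a small, purely hypothetical sketch. It is not the implementation used in this work. It assumes a label convention of 1.0 for bona fide and 0.0 for attack, approximates the real-mask region by a range of map rows, and uses a per-row weighted mean as the inference score. The function names, the 4x4 map size, and the weight values are chosen only for illustration.

```python
# Assumed label convention (not taken from the paper): 1.0 = bona fide, 0.0 = attack.
BONA_FIDE, ATTACK = 1.0, 0.0


def partial_attack_label(height, width, is_bona_fide=False, real_mask_rows=None):
    """Build a pixel-wise label map as a list of rows.

    real_mask_rows is a hypothetical (start, stop) row range covered by a
    real mask on a spoof face (the AM2 case). Leave it as None for samples
    that are entirely bona fide or entirely attack (AM0, AM1).
    """
    base = BONA_FIDE if is_bona_fide else ATTACK
    label = [[base] * width for _ in range(height)]
    if not is_bona_fide and real_mask_rows is not None:
        start, stop = real_mask_rows
        for r in range(start, stop):
            # The real mask is genuine material, so it is labelled bona fide.
            label[r] = [BONA_FIDE] * width
    return label


def regional_weighted_score(pred_map, row_weights):
    """Weighted mean of a predicted map, using one weight per row."""
    if len(pred_map) != len(row_weights):
        raise ValueError("need exactly one weight per map row")
    total = sum(w * sum(row) for row, w in zip(pred_map, row_weights))
    norm = sum(w * len(row) for row, w in zip(pred_map, row_weights))
    return total / norm


# Toy AM2 sample on a 4x4 map: a real mask covers the lower two rows.
am2_label = partial_attack_label(4, 4, real_mask_rows=(2, 4))

# Treat the label as an ideal prediction and compare two weightings.
uniform = regional_weighted_score(am2_label, [1, 1, 1, 1])      # 0.5
eye_focused = regional_weighted_score(am2_label, [3, 3, 1, 1])  # 0.25
print(uniform, eye_focused)
```

In this toy case, giving the upper rows (standing in for the eye region) a larger weight moves the score of the partially masked attack from 0.5 toward the attack label, which is the intuition behind emphasizing the eye region at inference time.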