Journals & Magazines >IEEE Transactions on Biometri... >Volume: 4 Issue: 4

Attack Detection for Finger and Palm Vein Biometrics by Fusion of Multiple Recognition Algorithms

Abstract:

Vascular patterns in the hand region are not visible to the naked eye or consumer cameras, therefore finger and hand vein biometrics is often considered invulnerable to p...Show More

Metadata

Abstract:

Vascular patterns in the hand region are not visible to the naked eye or consumer cameras, therefore finger and hand vein biometrics is often considered invulnerable to presentation attacks. However, one can never rule out the possibility that a malicious attacker manages to create functional attack samples. Various approaches on how to detect such attacks have been proposed, along with publicly available attack databases and varying ideas to create artificial attack samples. In a first step it is important to verify that created presentation attack artifacts hold the potential to deceive a real system. In order to provide a meaningful and comparable threat potential evaluation, this article evaluates 15 existing vein recognition schemes using attack samples derived from three finger vein attack databases and one palm vein attack database. As a second step, in this work we investigate an approach to combine these employed vein recognition schemes and utilizing them to perform presentation attack detection, which to the authors’ best knowledge has not been described in literature so far. Experimental results show that this approach can effectively be used to detect vein attack samples.

Published in: IEEE Transactions on Biometrics, Behavior, and Identity Science ( Volume: 4, Issue: 4, October 2022)

Page(s): 544 - 555

Date of Publication: 10 October 2022

Electronic ISSN: 2637-6407

DOI: 10.1109/TBIOM.2022.3212836

Funding Agency:

Contents

CCBY - IEEE is not the copyright holder of this material. Please follow the instructions via https://creativecommons.org/licenses/by/4.0/ to obtain full-text articles and stipulations in the API documentation.

SECTION I.

Introduction

Vein biometrics, as it is commonly referred to, describes the task of either identifying a person or verify a persons claim of identity using a scan of their vascular pattern. The two most popular regions used in the context of vascular biometrics are the eye and hand regions [1]. According to [2], vein patterns from the following four locations are eligible options for hand-based vascular biometrics: Finger vein, palm vein, hand vein and wrist vein.

Since blood vessels are mostly hidden below the skin, and therefore hard to detect with the human visual system or consumer cameras, additional hardware is needed to make them visible. A relatively low-cost solution to do so is the usage of light in the near infrared region together with an infrared-sensitive imaging sensor. Since the hemoglobin in the blood absorbs light in this wavelength spectrum while surrounding tissues do not, the blood vessels appear as dark structures on the acquired images.

Despite the demand for additional hardware, vascular biometrics offers some advantages over more traditional ways of authentication in terms of security: Access tokens (such as keys, cards or badges) can be stolen, sweat on the fingertip leaves behind a latent fingerprint that can be reconstructed and the face is always exposed which makes it easy to capture. Therefore one would assume that vein biometrics is secure against attempted attacks, however german hackers [3] demonstrated at a hacking conference, that under the right circumstances hand vein structures can be captured even from a few meters distance. In addition to that they suggested that hand dryers in restrooms could potentially be manipulated to capture vein structures in the hand region. It has been shown in past research [4], [5] that without countermeasures, a simple print (e.g., using a laser printer) of such maliciously captured vein structures would suffice to fool a vein recognition system. Attacks of such kind are known as presentation attacks (PA). The ISO/IEC [6] defines presentation attacks as a presentation to the biometric data capture subsystem with the goal of interfering with the operation of the biometric system. This definition includes the intent to impersonate someone as well as the intent to conceal ones identity.

To counter such attacks, numerous research has been carried out over the last decade. A comprehensive overview of such is given in [7, Table 14.1], including hand crafted methods that employ image quality, generic texture and spatial frequency components as well as convolutional neural network based methods.

To further motivate research in this field and also to enable comparable and transparent results, publicly available attack databases, containing attack artefacts, are needed.

Currently, three finger vein attack databases and one palm vein attack database exist: The Idiap Research Institute VERA Fingervein Database (VeraFV) [8], the South China University of Technology Finger Vein Database (SCUT) [9], the Paris Lodron University of Salzburg Finger Vein Spoofing Data Set (PLUS) [10] and the The Idiap Research Institute VERA Spoofing PalmVein Database (VeraPV) [11]. All four are included in this work and therefore described in detail in Section II.

However, to actually demonstrate the effectiveness of created attack samples, only limited results are available in the literature. For doing this, one can utilize a so called two-scenario-protocol or two-step-protocol, as described in Section III-A1. The outcome of this threat evaluation protocol is the percentage of attack samples that would be accepted by a given vein recognition system. For the VeraFV database, one recent publication [12] tests three vein recognition schemes (Wide Line Detector [13], Repeated Line Tracking [14] and Maximum Curvature [15]) using the two-step-protocol. In case of the SCUT, a publication [16] reports a pass rate, defined as successful attacks divided by the total attacks using a recognition method that is not further described. The publication that introduced the PLUS database contains a vulnerability analysis that employs twelve different finger vein recognition schemes in order to evaluate how hazardous the presentation attacks are from a broader perspective. The VeraPV database was evaluated on its introduction using the two-step-protocol on a local binary pattern [17] based feature extraction together with histogram intersection as biometric comparison technique.

Since every database was introduced with its own study to prove the actual success of created attacks, often with limited variability of confronted recognition schemes, this study aims to measure the threat potential in a comparable manor. To do so, an extensive collection of varying vein recognition schemes are employed to test all publicly available attack databases to perform a comprehensive threat evaluation.

A common strategy to increase vein recognition performance is to combine various recognition schemes that add complementary information (e.g., in [18]). Analogously, merging multiple attack detection schemes is used to improve the success rate of a single attack detection scheme, like in [19]. However, to combine the output of several recognition schemes in order to perform attack detection, is unexplored for the domain of unimodal hand-based vascular biometrics. Therefore, in this work, as a second step to the threat evaluation, biometric comparison scores from the threat assessment are utilized in order to perform presentation attack detection.

The content of this study constitutes an extension to the authors’ previous work [10], [20], [21] and can summarized as follows:

First, an extensive threat evaluation is carried out for all aforementioned attack databases. The goal is to measure the potential to deceive a real system by confronting 15 different vein recognition schemes that can be categorized into four classes of algorithms based on what type of feature they consider for classification.
Second, it is tested whether multiple similarity scores from vein recognition schemes generated in the first part could be fusioned in order to achieve presentation attack detection. To do so, exhaustive cross combination is carried out employing seven different fusion strategies together with six feature scaling approaches (including the possibility to omit feature scaling).

This article introduces the transfer of the threat evaluation from the finger vein databases assessed in our earlier conference publication [20] to the publicly available palm vein attack database mentioned earlier. In addition to that, a second convolutional neural network is added for the experiments in this article that employs a DenseNet architecture together with a softmax loss. Further differences to the reference work [20] include: two minutiae-based vein recognition schemes that constitute a fourth feature extraction category, two additional score-level-fusion strategies and three extra feature scaling approaches.

The remainder of this work is structured as follows: Section II describes all databases used throughout the experiments in this article. Section III explains the experimental setup for both, the threat evaluation as well as the presentation attack detection experiments. Section IV includes the experimental results and finally, Section V provides a summary of this work.

SECTION II.

Databases

Four databases are included in this work which are presented in the following:

Paris Lodron University of Salzburg Palmar Finger Vein Spoofing Data Set (PLUS)1: The PLUS-FV3 Spoof data set uses a subset of the PLUS Vein-FV3 [22] database as bona fide samples. For the collection of presentation attack artefacts, binarized vein images from 6 fingers (i.e., index, middle and ring finger of both hands) of 22 subjects were printed on paper and sandwiched into a top and bottom made of beeswax. The binarization was accomplished by applying Principal Curvature [23] feature extraction in two different levels of vessel thickness, named thick and thin. The original database was captured with two types of light sources, namely LED and Laser. Therefore, presentation attacks were created for both illumination variants. While the original database was captured in 5 sessions per finger, only three of those were reused for presentation attack generation. Summarized, a total of 396 (22*6*3) presentation attacks per light source (LED & Laser) and vein thickness (thick & thin) with corresponding to 660 (22*6*5) bona fide samples are available. Every sample is of size $192\times736$ .
The Idiap Research Institute VERA Fingervein Database (IDIAP VeraFV)2: The IDIAP VERA finger vein database consists of 440 bona fide images that correspond to 2 acquisition sessions of left and right hand index fingers of 110 subjects. Therefore these are considered as 220 unique fingers captured 2 times each. Every sample has one presentation attack counterpart. Presentation attacks are generated by printing preprocessed samples on high quality paper using a laser printer and enhancing vein contours with a black whiteboard marker afterwards. Every sample is provided in two modes named full and cropped. While the full set is comprised of the raw images captured with size $250\times665$ , the cropped images were generated by removing a 50pixel margin from the border, resulting in images of size $150\times565$ .
South China University of Technology Spoofing Finger Vein Database (SCUT-SFVD)3: The SCUT-SFVD database was collected from 6 fingers (i.e., index, middle and ring finger of both hands) of 100 persons captured in 6 acquisition sessions, making a total of 3600 bona fide samples. For presentation attack generation, each finger vein image is printed on two overhead projector films which are aligned and stacked. In order to reduce overexposure, additionally a strong white paper ( $200g/m^{2}$ ) is put in-between the two overhead projector films. Similar to the IDIAP VERA database, the SCUT-SFVD is provided in two modes named full and roi. While in the full set every image sample has a resolution of $640\times288$ pixel, the samples from the roi set are of variable size. Since the LBP and the ASAVE matching algorithm can not be evaluated on variable sized image samples, a third set was generated for this study named roi-resized where all roi samples have been resized to $474\times156$ which corresponds to the median of all heights and widths from the roi set.
The Idiap Research Institute VERA Spoofing Palmvein Database (IDIAP VeraPV)4: The bona fide samples used in the IDIAP Vera Spoofing Palmvein database descend from a palmvein database that was introduced at the same time. The palmvein database consists of 2200 images from 110 subjects where both hands were captured five times in two sessions. Attacks were created for all samples from the first 50 subjects. For the creation of the attack samples, one of the ten available samples per hand was chosen and after preprocessing is printed on 100g paper using a commercial ink printer. Every sample is available in a raw or full version and a region of interest (roi) version. The resolution of the full samples is $480\times640$ pixel. Similar to the SCUT-SFVD database, the roi samples are of variable size and are therefore resized to the median of all heights and widths from the roi image set for the experiments in this study: $273\times322$ .

Fig. 1 shows an exemplary pair of images from each database. While the samples from the full versions of the databases often also include the contour of the finger or hand as well as some background, the cropped or roi versions tend to only show the parts that include vein structures. Since the process for creation of the SCUT, VeraFV and VeraPV databases includes a scan of a printed sample, the resulting presentation attacks show certain similarities to their bona fide counterparts. The PLUS attack samples, however, can potentially be distinguished from the bona fide samples even with the naked eye, which is owed to the process of creating these attack samples.

Fig. 1.

One exemplary pair of images (upper row: bona fide; lower row: presentation attacks) for every database used in this work.

Show All

SECTION III.

Experimental Setup

In the following section the different experimental setups are described. First, the setup regarding the vulnerability assessment is elaborated, followed by a detailed description of the presentation attack detection experiments.

A. Threat Evaluation Setup

In order to get a broader view on what attack creation recipes are a threat to what systems, various feature extraction and comparison schemes are confronted with the attack databases. In total, 15 different biometric vein recognition schemes are employed for the experiments. They can be categorized into four classes of algorithms based on the feature type extracted from the vein samples:

Binarized vessel networks: Algorithms from this category work by transforming a raw vein image into a binary image where the background (and also other parts of the human body such as flesh) is removed and only the extracted vessel structures remain. The binarized image is then used as a feature image for the comparison step. Seven of such approaches are included in this work. Maximum Curvature (MC) [15] and Repeated Line Tracking (RLT) [14] try to achieve this by looking at the cross sectional profile of the finger vein image. Other methods such as Wide Line Detector (WLD) [13], Gabor Filter (GF) [24] and Isotropic Undecimated Wavelet Transform (IUWT) [25] also consider local neighbourhood regions by using filter convolution. A slightly different approach is given by Principal Curvature (PC) [23] which first computes the normalized gradient field and then looks at the eigenvalues of the Hessian matrix at each pixel. All so far described binary image extraction methods use a correlation measure to compare probe and template samples which is often referred to as Miura-matching due to its introduction in Miura et al. [14]. One more sophisticated vein pattern based feature extraction and matching strategy is Anatomy Structure Analysis-Based Vein Extraction (ASAVE) [26], which includes two different techniques for binary vessel structure extraction as well as a custom matching strategy.
Keypoints: The term keypoint is generally understood as a specific pixel or pixel region in an digital image that provides some interesting information to a given application. Every keypoint is stored by describing its local neighbourhood and its location. This research uses three keypoint based feature extraction and matching schemes. One such keypoint detection method, known as Deformation Tolerant Feature Point Matching (DTFPM) [27], was especially tailored for the task of finger vein recognition. This is achieved by considering shapes that are common in finger vein structures. Additionally, modified versions of general purpose keypoint detection and matching schemes, SIFT and SURF, as described in [18] are tested in this research. The modification includes filtering such that only keypoints inside the finger are used while keypoints at the finger contours or even in the background are discarded.
Texture information: Image texture is a feature that describes the structure of an image. Shapiro and Stockman [28] define image texture as something that gives information about the spatial arrangement of color or intensities in an image or selected region of an image. While two images can be identical in terms of their histograms, they can be very different when looking at their spatial arrangement of bright and dark pixels. Three methods are included in this work that can be counted to texture-based approaches. One of which is a Local Binary Pattern (LBP) [29] descriptor that uses the output of Gabor filters in various scales and orientations as an input. The resulting LBP are block-wise transformed into concatenated histograms. For comparison, histogram intersection is used as a similarity metric. The other two methods are convolutional neural network (CNN) based approaches. The first CNN approach (CNN-S) uses a Squeeze-Net architecture together with triplet loss function [30]. The second recognition approach employs a DenseNet-161 network architecture with softmax loss function [31] (CNN-D). Similarity scores for the CNN approaches are obtained by computing the inverse Euclidean distance given two feature vectors corresponding to two finger vein samples.
Minutiae-based: The term minutiae descends from the domain of fingerprint biometrics. Every fingerprint is a unique pattern that consists of ridges and valleys. The locations where such a pattern has discontinuities such as ridge endings or bifurcations are named “minutiae points”. This concept of finding such minutiae points was successfully transferred [32] to the vein biometrics domain by skeletonization of an extracted binarized vein image, such as described in the first category. This study now employs two schemes in order to perform vein recognition, that both use these extracted minutiae points. First, the proprietary software VeriFinger SDK (VF)5 is used for comparison of these minutiae points. A second method, Location-based Spectral Minutiae Representation (SML) [33] uses the minutiae points as an input in order to generate a representation that can finally be compared using a correlation measure.

Fig. 2 shows the extracted features from one exemplary vein sample, stemming from the PLUS database. The ASAVE method extracts two feature representations, coined vein network and vein backbone, which are overlaid using white and gray, respectively, in order to fit the illustration. The LBP illustration shows four Gabor filter responses after applying the LBP algorithm. Since the output of the CNN based approaches is a vector embedding, such a visualization is not applicable. Acting as a representative for both minutiae schemes, a skeletonized finger vein structure is shown, where the minutiae point locations are marked with circles. Since the minutiae schemes are based on a binarized vessel network, their solidity depends on the robustness of the underlying procedure for extracting the vessel network.

Fig. 2.

Exemplary depiction of extracted features from every category. Since the output of the CNN based approaches is a vector embedding, such a visualization is not applicable.

Show All

Every vein recognition scheme has its own set of hyper-parameters. For the experiments in this work, the same settings were applied to every vein database.

The minutiae-based recognition schemes are evaluated using the VeinPLUS + Framework [34]. The binary vessel network based methods, the keypoint based methods and the LBP scheme are evaluated using the PLUS OpenVein Toolkit [35]. The CNN approaches were implemented in Python. The network training for the PLUS databases was done using finger vein images from the PROTECT [36] open source data set, since these descend from a similar imaging sensor. The training for the VeraPV was achieved using the remaining bona fide subjects 51–110 that have no presentation attack counterpart. Network training for the remaining databases was done using 2-fold-cross validation. Due to the fact that such network training is non-deterministic, the EERs and IAPMRs for the CNN methods are calculated by taking the arithmetic mean over both folds.

For comparison the ‘FVC’ protocol, named after the Fingerprint Verification Contest 2004 [37] is used. With this protocol all samples are compared against all remaining samples descending from the same subject as genuine comparisons as given by equation (1). The number of genuine comparisons is denoted as $n_{genuine}$ , the number of impostor comparisons as $n_{impostor}$ and the number of subjects as $n_{subjects}$ . For the case of finger vein data every individual finger is treated as unique subject, while for the palm vein data the term subject refers to a whole hand. For the formulas in this section, is assumed that every subject has the same number of samples $n_{samples}$ .

$\begin{equation*} n_{genuine} = \frac {n_{samples}*\left ({n_{samples} - 1 }\right)}{2}* n_{subjects}\tag{1}\end{equation*}$ View Source

For the Impostor comparisons, the FVC protocol defines that only the first sample of every subject is compared against the first of the remaining subjects. The number of impostor comparisons is given by equation (2) and only depends on the number of unique subjects regardless of the number of samples per subject.

$\begin{equation*} n_{impostor} = \frac {n_{subjects}*\left ({n_{subjects} - 1 }\right)}{2}\tag{2}\end{equation*}$ View Source

Hence, it is ensured that every genuine comparison is made, but largely reduces the amount of impostor scores. Symmetric comparisons are omitted. It is important to note however that the correlation based comparison algorithm used for the binarized vessel network based approaches (Miura matching) is not symmetric.

1) Threat Evaluation Protocol:

Biometric template comparison systems, which are basically binary classifiers, produce four types of outcome: (i) correct match (True Positive, TP), (ii) wrong rejection (False Negative, FN; often called type II error), (iii) correct rejection (True Negative, TN) and (iv) wrong match (False Positive, FP; often called type I error). The decision whether a comparison is considered to be a match or a reject is based on a decision threshold $\tau$ . Through varying the decision threshold, the False Match Rate (FMR, i.e., the ratio of wrongly accepted impostor attempts to the number of total impostor attempts) and the False-Non Match Rate (FNMR, i.e., the ratio of wrongly denied genuine attempts to the total number of genuine verification attempts) can be determined as seen in equations (3) and (4).

$\begin{align*} \textrm {FMR}(\tau) = \frac {FP}{FP\,+\,TN}\tag{3}\\ \textrm {FNMR}(\tau) = \frac {FN}{FN\,+\,TP}\tag{4}\end{align*}$ View Source

The operating point where FMR = FNMR is called Equal Error Rate (EER) and its corresponding decision threshold value denoted as $\tau _{EER}$ .

A common way to evaluate the level of threat exhibited by a certain database is to use a so called two-scenario or two-step protocol [4], [5], [38]. The two steps are briefly summarized hereafter:

Scenario 1, Normal Mode: The first scenario employs two types of users: Genuine (positives) and zero effort impostors (negatives). Therefore, both enrollment and verification is accomplished using bona fide finger vein samples. Through calculation of the FMR and FNMR, the EER can be computed and the decision threshold set at $\tau _{EER}$ . The normal mode can be understood as a biometric comparison evaluation experiment which has the goal to determine an operating point $\tau _{EER}$ for the second scenario.
Scenario 2, Attack Mode: The second scenario uses genuine (positives) and presentation attack (negatives) users. Similar to the first scenario, enrollment is accomplished using bona fide samples. This time, however, verification attempts are performed by comparing presentation attack samples against their corresponding genuine enrollment templates. Given the threshold determined in step 1, the proportion of wrongly matched presentation attacks is then reported as the Impostor Attack Presentation Match Rate (IAPMR), as defined in the ISO/IEC 30107-3:2017 [6].

B. Presentation Attack Detection Setup

Throughout the following section, genuine scores (i.e., scores that descend from intra-subject comparisons) are viewed as bona fide scores and scores that descend from comparisons where an attack sample is compared to its bona fide counterpart are considered to be presentation attack scores.

Solving the presentation attack problem via combination of multiple similarity scores from different recognition schemes can be formally written as follows. Let $s_{ij}$ be the $i$ th comparison score out of $m$ total eligible comparisons that was assigned by the $j$ th recognition scheme out of $n$ considered recognition schemes at a time, i.e., $i \in \{1,\ldots, m\}$ and $j \in \{1,\ldots, n\}$ . Note that for an $i$ th comparison, every recognition scheme needs to compare the same biometric samples. Further, let $x_{i}$ be the vector $x_{i}\,=\,(s_{i1},\ldots, s_{in})$ that describes the $i$ th comparison by concatenation of all $n$ recognition schemes considered at a time. Let $\mathrm {\textbf {X}}$ be the set of all $m$ eligible comparisons from a certain database $\mathrm {\textbf {X}} = \{x_{1},\ldots, x_{m}\}$ . Since some of the following fusion strategies demand for training data, the set of all comparisons $\mathrm {\textbf {X}}$ is always divided into a train ${\mathbf {{X_{train}}}}$ and a test split ${\mathbf {{X_{test}}}}$ using 2-fold cross validation. The combined score that remains after score level fusion for the $i$ th comparison shall be denoted as $S_{i}$ . The comparison can then be classified as bona fide or attack comparison by simple thresholding of $S_{i}$ .

In total, seven score fusion strategies and six score normalization (including the possibility to omit normalization) approaches are included in the experiments. An illustration of this process can be seen in Figure 3.

Fig. 3.

Block diagram illustration of the presentation attack detection scheme evaluated in this article.

Show All

1) Fusion Strategies:

Three rather simple fusion strategies (Min-Rule Fusion, Max-Rule Fusion and Simple Sum-Rule Fusion) were adopted from [39] that are formally defined in equations (5)–(7).

Min-Rule Fusion
$\begin{equation*} S_{i} = \min \left ({x_{i} }\right)\tag{5}\end{equation*}$ View Source
Max-Rule Fusion
$\begin{equation*} S_{i} = \max \left ({x_{i} }\right)\tag{6}\end{equation*}$ View Source
Simple Sum-Rule Fusion
$\begin{equation*} S_{i} = \sum _{j=1}^{n} s_{ij}\tag{7}\end{equation*}$ View Source

A more sophisticated method, Weighted Sum-Rule Fusion, is defined in equation (8). Here every similarity score is multiplied by a weighting constant. The weights remain constant for all subjects, thus also known as “Matcher Weighting” [40]. All weights together form a convex combination, i.e., are non negative and add up to one.

Weighted Sum-Rule Fusion
$\begin{equation*} S_{i} = \sum _{j=1}^{n} s_{ij} * w_{j} \tag{8}\end{equation*}$ View Source

To use the weighted sum rule fusion, a strategy has to be chosen on how to assign the weights $w_{j}$ . Snelick et al. [40] propose to choose weights indirect proportional to the error rates. Hence, the training split ${\mathbf {{X_{train}}}}$ is used to calculate an Equal Error Rate estimate $EER_{j}$ per recognition scheme $j$ . This decision can be justified since ${\mathbf {{X_{train}}}}$ consists of bona fide scores and presentation attack scores, therefore giving a rough estimate on how much a particular recognition scheme contributes to the PAD-task. The formula in equation (9) depicts the calculation of $w_{j}$ .

$\begin{equation*} w_{j} = \frac {\frac {1}{EER_{j}}}{\sum _{v=1}^{n} \frac {1}{EER_{v}} }\tag{9}\end{equation*}$ View Source

In addition to the four fusion strategies above, three learning based classifiers are included in this study. The classifiers that are trained using ${\mathbf {{X_{train}}}}$ in order to predict the test data ${\mathbf {{X_{test}}}}$ per fold: (i) Linear Discriminant Analysis, (ii) Support Vector Machine with Linear Kernel and (iii) Support Vector Machine with Radial Basis Function Kernel.

2) Score Normalization:

Because similarity scores from different recognition algorithms often lie in different value ranges, also some score normalization techniques are included in this study. Let $x_{i}^{\prime }$ denote the vector $x_{i}$ after normalization was applied. Calculations over the whole data (e.g., $\mu$ , $\min$ , $\max ,\ldots$ ,) are also conducted over the training split ${\mathbf {{X_{train}}}}$ for the normalization step in order to simulate a realistic scenario where the sample under test is not included in parameter determination. Within this section, $\sigma$ represents the operator for calculation of the standard deviation and $\mu$ stands for calculation of the arithmetic mean. The trivial “technique” used in this study is to omit score normalization.

No-norm
$\begin{equation*} x_{i}^{\prime } = x_{i}\tag{10}\end{equation*}$ View Source

Further, Three rather popular score normalization techniques [41] (Min-Max Norm, Z-Score Norm and Tanh-Norm) are tested that are defined as seen in equations (11)–(13).

Min-Max Norm
$\begin{equation*} x_{i}^{\prime } = \frac {x_{i} - \min \left ({{\mathbf {{X_{train}}}}}\right)}{\max \left ({{\mathbf {{X_{train}}}}}\right) - \min \left ({{\mathbf {{X_{train}}}}}\right)}\tag{11}\end{equation*}$ View Source
Tanh-Norm
$\begin{equation*} x_{i}^{\prime } = 0.5 * \left ({\tanh \left ({0.01 * \frac {x_{i} - \mu \left ({{\mathbf {{X_{train}}}}}\right)}{\sigma \left ({{\mathbf {{X_{train}}}}}\right)} }\right) + 1 }\right)\tag{12}\end{equation*}$ View Source
Z-Score Norm
$\begin{equation*} x_{i}^{\prime } = \frac {x_{i} - \mu \left ({{\mathbf {{X_{train}}}}}\right)}{\sigma \left ({{\mathbf {{X_{train}}}}}\right)}\tag{13}\end{equation*}$ View Source

Another normalization technique was proposed by He et al. [42] named Reduction of high-scores effect (RHE) normalization. Here, ${\mathbf {{X_{train_{bf}}}}}$ indicates to use only the bona fide comparison scores for calculation of the mean and standard deviation.

Rhe-Norm
$\begin{equation*} x_{i}^{\prime } = \frac {x_{i} - \min \left ({{\mathbf {{X_{train}}}}}\right)}{\mu \left ({{\mathbf {{X_{train_{bf}}}}}}\right) + \sigma \left ({{\mathbf {{X_{train_{bf}}}}}}\right) - \min \left ({{\mathbf {{X_{train}}}}}\right)}\tag{14}\end{equation*}$ View Source

Additionally, rescaling the feature vector $x$ to unit length is used in the experiments. Note that for the cases where only a single recognition scheme is considered, the Unit Length Norm would result in ones (i.e., ${}\frac {x}{||x||} = 1$ for the case where x is a scalar). Therefore the normalization is omitted for these cases.

Unit Length Norm
$\begin{equation*} x_{i}^{\prime } = \frac {x_{i}}{||x_{i}||}\tag{15}\end{equation*}$ View Source

3) Attack Detection Metrics:

To evaluate the effectiveness of the proposed presentation attack detection approach, results are reported in compliance with the ISO/IEC 30107-3:2017 [6]. Since a presentation attack detection mechanism is a binary classifier, the four outcomes analogously to those described in Section III-A1 are possible: correctly classified as attack (TP), wrongly classified as attack (FP), correctly classified as bona fide (TN) and wrongly classified as bona fide (FN). The standard defines the following two decision threshold $\tau$ dependent metrics for presentation attack detection: Attack Presentation Classification Error Rate (APCER) and Bona Fide Presentation Classification Error Rate (BPCER).

Attack Presentation Classification Error Rate (APCER): Proportion of attack presentations incorrectly classified as bona fide presentations in a specific scenario
$\begin{equation*} \textrm {APCER}(\tau) = \frac {FN}{FN+TP}\tag{16}\end{equation*}$ View Source
Bona Fide Presentation Classification Error Rate (BPCER): Proportion of bona fide presentations incorrectly classified as presentation attacks in a specific scenario
$\begin{equation*} \textrm {BPCER}(\tau) = \frac {FP}{FP+TN}\tag{17}\end{equation*}$ View Source

SECTION IV.

Experimental Results

Within this section, the experimental results for the threat evaluation and the presentation attack detection are described.

A. Threat Evaluation

The experiments carried out in this section aim to evaluate how harmful the attack recipes used for creation of the attack samples in the databases from Section II are. To do so, every database undergoes the threat evaluation protocol described in Section III-A1. The results from the threat evaluation experiments are listed in Table I. The dashed horizontal line is meant to separate finger vein and palm vein results. Note that for the PLUS LED and PLUS Laser databases, where different attacks descend from the same bona fide samples, the reported EER values for thin and thick are equal since the error rate calculation is solely based on bona fide data.

TABLE I Results Threat Evaluation Using 2-Scenario-Protocol in Terms of EER and IAPMR. The Horizontal Line Separates Finger Vein (Above Dashed Line) and Palm Vein (Below Dashed Line) Experiments. Values in %

Using the same hyper-parameters per recognition scheme for every database makes the results more comparable, while performing an exhaustive grid search for every experiment could potentially lead to improved EER values. Lower EER values imply a better vein recognition performance. High IAPMR values indicate that a high proportion of the presented attack artifacts are wrongfully accepted.

1) Finger Veins:

In general, binary vessel pattern based schemes tend to accept finger vein attack samples regardless of the used attack recipe. The highest IAPMR per database being 93.48% for the VeraFV, 86.33% for the SCUT and 89.54% for the PLUS databases. Effectively this means that in a real world scenario, without further attack detection measures, up to 9 out of 10 attack samples would be accepted. A change in comparison hyper parameter settings, namely allowing for vertical and horizontal translation during the “Miura matching” comparison step, reduced the EER values for the roi/cropped attack samples by up to 15%, compared to the reference work. As a consequence also the IAPMRs increased, now giving a clearer image on the threat that the roi samples pose to the binary vessel pattern based schemes. For the VeraFV database, authors from the same research institute that initially published the database, also reported in [12] EER and IAPMR values for three binarized vessel pattern based recognition algorithms. These can be used to verify that the hyper parameters used in the experiments in this work lead to comparable results within an acceptable range: MC (1-2% EER, 77-89% IAPMR), RLT (11-19% EER, 32-38% IAPMR) and WLD (3-7% EER, 70-80% IAPMR).

The general purpose keypoint scheme SIFT tends be resistant against the PLUS attacks, while the other two finger vein databases seem to pose a threat, having IAPMRs ranging from 14%-44%. The other general purpose keypoint recognition scheme, SURF, seems to be quite unimpressed by the presented attacks overall, reaching an overall high IAPMR of 14.24% at the VeraFV cropped attacks. While SIFT in general obtains a lower EER, SURF occurs to be less prone to presentation attacks. Similar to the SIFT recognition scheme, the texture based methods LBP and CNN tend to be only susceptible to the VeraFV and SCUT attack samples. Minutiae based methods as well as the DTFPM keypoint scheme show minor $(\approx 5$ %-16% IAPMR) receptiveness for the PLUS attacks and higher $(\approx 21$ % and above) for the VeraFV and SCUT attacks.

2) Palm Veins:

For the case of the VeraPV attacks, it can be observed that the raw images have much higher EERs compared to the roi version when evaluated with vein recognition schemes that consider the contents of the whole image (i.e., binary vessel pattern based or local binary pattern). Interestingly, although not unexpected, the general purpose keypoint based recognition schemes as well as the CNN based approaches have reasonably low EERs for the raw palm vein samples. This can be explained by the fact that the features extracted in those approaches only consider local neighbourhoods instead of the whole image which possibly contains a lot of background noise and bigger hand misplacements. Note that IAPMRs where the corresponding EERs are high, are not very meaningful. Further explanation of the results therefore only considers the roi version.

The IAPMR of binary vessel structure recognition schemes almost homogeneously fluctuate around 50% (42.31%-63.73%). One exception is the RLT approach, but with an EER of 15%, the IAPMR is not as significant. Interestingly, the same observation holds true for all the other recognition schemes considered in these experiments, ranging from 34.22% (SML) up to 60.84% (LBP). Even SURF, the general purpose keypoint scheme that is very resistant to the finger vein attacks, produces an IAPMR value of 35.22%. The IAPMRs indicate that roughly every second to third attack would be wrongly accepted as a bona fide presentation. Similar to the VeraFV, results for a similar threat evaluation experiment are available for one recognition scheme. The authors of the VeraPV measured the level of threat on its introduction using a local binary pattern scheme for feature extraction and histogram intersection for comparison, achieving $\approx 65$ %-75% IAPMR at 3.33 % EER. It can be concluded that, without further attack detection mechanism, the palm vein attacks would pose a threat to all of the considered recognition schemes.

B. Presentation Attack Detection Using Score Fusion

The PAD performance is reported in Table II in terms of detection equal error rate D-EER (operating point where BPCER = APCER), BPCER20 (BPCER at APCER $< =0.05$ ) and BPCER100 (BPCER at APCER $< =0.01$ ).

TABLE II Selection of Best Working Method Constellations in Terms of Detection Error Rate. Note That There are Often Multiple Eligible Options. Only the Constellation That Uses the Least Amount of Recognition Schemes as Possible. Values in %

For every database, an exhaustive cross combination including all the described fusion strategies, normalization techniques and recognition algorithms is applied. This results in 7 (fusion) * 6 (normalization) * $32\,\,767\,\,(2^{15}-1$ , all possible combinations of the 15 recognition algorithms) = 1 376 214 D-EER values per database. The results reported in Table II are the best working constellation for a given database in terms of D-EER, that is, if a single best constellation exists. For the case where multiple method constellations result in the same D-EER (especially the PLUS databases tend to have many solutions that would yield 0.00% D-EER, i.e., perfect separation), the method constellation is reported that uses as the least amount of feature extraction schemes as possible. The employed recognition schemes for the best working constellation are indicated using a check symbol. The fact that the PLUS databases are well separable also coincides with the observation from Section IV-A, where the PLUS samples were only a major threat to one category of recognition algorithms (the ones that extract binary images of vessel structures as feature images). This can be explained by the visual difference as seen in Fig. 1, which is due to the process used to create these kind of attack samples.

For further analyzing the contribution of every recognition scheme, fusion strategy and score normalization method, the following figures depict their relative occurrence in the best working constellations. For better visibility which methods provide the most valuable information, only those method constellations which yield D-EERs $< 5\%$ are considered. To keep the plots easy to interpret, the databases are split into PLUS databases (Figs. 6 and 7) and the rest (Figs. 4 & 5). It is important to note that for the PLUS databases, roughly 95% of all 1 376 214 method constellations would be included within the 5% D-EER threshold, resulting in mostly uniform distributions. Therefore, the threshold for the PLUS databases was reduced to D-EERs $< 0.5\%$ , where a subset of roughly 13% out of all method constellations remain.

$Fig. 4. - Influence of each recognition scheme for the best (D-EER $ < 5\%$ ) constellations.$

Fig. 4.

Influence of each recognition scheme for the best (D-EER $< 5\%$ ) constellations.

Show All

$Fig. 5. - Influence fusion strategies (left) and score normalization (right) schemes for the best (D-EER $ < 5\%$ ) constellations.$

Fig. 5.

Influence fusion strategies (left) and score normalization (right) schemes for the best (D-EER $< 5\%$ ) constellations.

Show All

$Fig. 6. - Influence of each recognition scheme for the best (D-EER $ < 0.5\%$ ) constellations.$

Fig. 6.

Influence of each recognition scheme for the best (D-EER $< 0.5\%$ ) constellations.

Show All

$Fig. 7. - Influence fusion strategies (left) and score normalization (right) schemes for the best (D-EER $ < 0.5\%$ ) constellations.$

Fig. 7.

Influence fusion strategies (left) and score normalization (right) schemes for the best (D-EER $< 0.5\%$ ) constellations.

Show All

It can be concluded that through combination of fundamentally different approaches, complementary information is available that finally leads to the ability to reasonably well detect presentation attacks. While perfect separation of bona fide and attack samples is not always achieved, using a combination of multiple recognition schemes for this task constitutes a method that has not been explored so far in the domain of hand-vascular biometrics. Of course, however, by doing so, the ability to detect presentation attacks is limited by the performance of the considered recognition schemes, which could potentially be improved by exhaustive hyper parameter optimization.

SECTION V.

Summary

The present article first analyzed public finger and palm vein attack databases on their potential to deceive state of the art vein recognition algorithms. As a second step, it was tested whether a fusion of recognition algorithms could be employed in order to achieve presentation attack detection.

The first part of this research employed a common threat evaluation protocol, known as 2-step-protocol, in order to make results more comparable. To get a broader perspective on the attack potential, 15 distinct vein recognition algorithms are confronted with the attacks that can be categorized into four meta categories of algorithms based on what type of feature they extract from the vein samples. Evaluation results show that recognition algorithms that extract binary images from the blood vessel patterns as a feature image are susceptible to all of the attack types. The other feature types, i.e., texture based approaches, minutiae based approaches and keypoint based approaches are only somewhat susceptible to the VeraFV, SCUT and VeraPV attack samples.

In the second part of this research, it is tested whether similarity scores that were generated in the course of the threat evaluation could be combined using score-level-fusion in order to thwart the attacks. To do so, exhaustive cross combination of all recognition schemes together with seven fusion strategies and six score normalization techniques was carried out. Experimental results show that indeed a sound attack detection can be achieved with this strategy.

References is not available for this document.

Attack Detection for Finger and Palm Vein Biometrics by Fusion of Multiple Recognition Algorithms

Abstract:

Metadata

Abstract:

Funding Agency:

Introduction

Databases