MCLFIQ: Mobile Contactless Fingerprint Image Quality | IEEE Journals & Magazine | IEEE Xplore

Abstract:

We propose MCLFIQ: Mobile Contactless Fingerprint Image Quality, the first quality assessment algorithm for mobile contactless fingerprint samples. To this end, we re-trained the NIST Fingerprint Image Quality (NFIQ) 2 method, which was originally designed for contact-based fingerprints, with a synthetic contactless fingerprint database. We evaluate the predictive performance of the resulting MCLFIQ model in terms of Error-vs.-Discard Characteristic (EDC) curves on three real-world contactless fingerprint databases using three recognition algorithms. In experiments, the MCLFIQ method is compared against the original NFIQ 2 method, a sharpness-based quality assessment algorithm developed for contactless fingerprint images and the general purpose image quality assessment method BRISQUE. Furthermore, benchmarks on four contact-based fingerprint datasets are also conducted. Obtained results show that the fine-tuning of NFIQ 2 on synthetic contactless fingerprints is a viable alternative to training on real databases. Moreover, the evaluation shows that our MCLFIQ method works more accurately and is more robust compared to all baseline methods on contactless fingerprints. We suggest considering the proposed MCLFIQ method as a starting point for the development of a new standard algorithm for contactless fingerprint quality assessment.
Page(s): 272 - 287
Date of Publication: 18 March 2024
Electronic ISSN: 2637-6407


SECTION I.

Introduction

In the past years, contactless fingerprint recognition has been introduced as a more convenient alternative to contact-based schemes [1], [2], [3]. In contrast to contact-based capturing schemes where the finger is pressed onto a planar surface, contactless recognition workflows do not require any contact between the subject and the capturing subsystem. This avoids distinct problems like low contrast caused by dirt, humidity on the capturing device, or latent fingerprints. Moreover, contactless fingerprint recognition schemes typically have a higher user acceptance, especially in multi-user scenarios, where different individuals share one capture device. In said cases, the subjects might have fewer hygienic concerns using contactless fingerprint recognition [4].

A wide variety of different capturing setups have been developed for contactless fingerprint recognition. The range of capturing devices reaches from expensive stationary devices capturing 3D samples to lightweight mobile setups. However, it can be observed from the literature that most contactless fingerprint capturing devices are mobile handheld devices like smartphones [1], [3], [4]. The contactless fingerprint recognition workflow, especially in mobile capturing scenarios, suffers from an inferior biometric performance, which is mainly caused by a more challenging capturing process. External influences like illumination or the distance between the capturing device and the fingertip have a major impact on the quality of a captured sample.

The signals obtained from a mobile contactless fingerprint capturing device are most commonly 2D color images, which cannot be directly fed into a recognition workflow. First, elaborate pre-processing is required to convert the captured finger image into a contactless fingerprint sample. This task typically includes steps like gray scale conversion, ridge-line enhancement, normalization and rotation [1].

Both the capturing subsystem and the pre-processing can negatively impact the recognition performance. Figure 1 illustrates two contactless fingerprints of the same subject obtained from a smartphone-based capturing setup. Figures 1(a) and 1(b) each depict the segmented finger image together with the corresponding final contactless fingerprint sample after enhancement. From Figure 1 we can see that the two samples are of different quality: Figure 1(a) is of rather high quality and Figure 1(b) is of low quality. The low-quality sample could be the result of a capturing attempt in a challenging environmental situation. The resulting finger image lacks a visible ridge-line characteristic and is not usable for further processing. For this reason, a precise and robust quality assessment tool for contactless fingerprints is of high interest to assess if a candidate sample is of sufficient quality for a recognition workflow. Further, actionable feedback may be provided to the capture subject or the biometric attendant to initiate a re-capture of the finger image.

Fig. 1. - Two fingerprint samples with pre-processed images of same subject: (a) high quality, (b) low quality.

Quality in general is defined as “being suitable for the intended purpose” or the “fitness for purpose”. With a proper quality assessment, system operators want to ensure that their service operates as specified. To achieve a reliable and reproducible attribution of quality, operational definitions should be established that allow an objective and automated quality assessment. Within the context of biometric recognition, quality assessment refers to the mapping of an individual biometric signal to a numerical value, where higher values indicate a better quality and thus predict a stronger recognition performance. ISO/IEC 29794–1 [5] defines guidelines for a utility-based biometric sample quality assessment, which includes three aspects:

  • Character: Character is an expression of quality based on the inherent properties of the source from which the biometric sample is derived. E.g., a scarred finger has a poor character.

  • Fidelity: Fidelity reflects the degree of the sample similarity to its source. E.g., a capturing device with low resolution captures a sample of low fidelity.

  • Utility: Utility signifies the predicted influence of a sample on a biometric system’s recognition performance. Biometric utility is contingent on the sample’s character and fidelity.

It is noteworthy that a precise distinction between the influence of character and fidelity is not feasible in many cases. E.g., a dirty finger can be clearly attributed to neither character nor fidelity. Furthermore, distinct identification workflows may be more robust against challenges regarding either character or fidelity than others. For this reason, ISO/IEC 29794–1 [5] suggests considering the utility for sample quality assessment.

Figure 1 illustrates the impact of both character and fidelity on a contactless fingerprint sample. In Figure 1(a), some artifacts of dead skin flaking off can be seen, which relate to the character. However, in this case, the impact of character on the utility is minor. The sample depicted in Figure 1(b) has a rather low contrast, which impacts the fidelity. From the corresponding pre-processed image, it is observable that barely any ridge-line characteristic is extractable. Sample quality assessment is of major importance for increasing the performance of a biometric system. A quality assessment algorithm ensures that only samples of high quality are processed in the biometric system. An accurate and robust quality assessment algorithm usually enables a biometric system to be operated at lower error rates. This will in turn enhance the system’s security associated with the False Acceptance Rate (FAR) and user comfort associated with the False Rejection Rate (FRR).

For contact-based fingerprint recognition, NFIQ 2 is the reference implementation of ISO/IEC 29794–4 and has been established as the de facto standard [6]. NFIQ 2 is open-source software which was implemented under the leadership of the National Institute of Standards and Technology (NIST). The software links the image quality of optical and ink scanned 500 PPI fingerprints to operational recognition performance. NFIQ 2 consists of 74 quality features, which are formally standardized in ISO/IEC 29794–4 [7]. A random forest classifier maps the individual quality measures to a unified quality score in the range [0,100].

It should be noted that NFIQ 2 refers to the general redesign of NFIQ as documented in NISTIR 8382 [8], whereas NFIQ 2.0 – NFIQ 2.2 refer to a specific release.1

In 2021, a virtual Workshop on Fingerprint Image Quality (NFIQ 2.1) was organized by the European Association for Biometrics (EAB) in cooperation with NIST and other institutions.2 Throughout this workshop, the importance of a reliable quality assessment for fingerprint images was emphasized. Moreover, the speakers and panelists expressed interest in extending the scope of NFIQ 2 to other capturing technologies like contactless fingerprints. Up to now, no such algorithm has been proposed. The main reason is the lack of a suitable database for training machine learning algorithms.

A. Contribution

In this work, we address the aforementioned demands and present MCLFIQ, the first quality assessment algorithm for contactless fingerprint images captured with mobile devices.

The main hypothesis of this work is that quality components for contact-based fingerprints, as defined in NFIQ 2 and standardized in ISO/IEC 29794–4 [7], are also highly suitable for contactless fingerprints. This is plausible because the general fingerprint characteristics, i.e., the ridge and valley patterns, remain the same for contact-based and contactless fingerprints. However, due to the contactless capturing process, the relevance of individual quality components changes. Hence, the weighting of each quality component has to be readjusted to effectively predict the sample quality. The re-training of NFIQ 2, i.e., a re-weighting of the quality components, is shown to be a viable method to readjust the quality algorithm to the challenges of distinct capturing methods. The resulting MCLFIQ algorithm outperforms existing quality assessment methods for contactless fingerprint samples, which confirms our hypothesis. The main steps along the development of MCLFIQ, which represents an adaptation of the NFIQ 2 framework for mobile contactless fingerprint images, are summarized as follows:

  • First, we define framework conditions for pre-processing contactless fingerprint images in order to achieve sample consistency and to make all images suitable for quality assessment using our method.

  • We then iteratively re-train the random forest included in the NFIQ 2 framework using a synthetically generated contactless fingerprint database. Here, with every iteration step, the amount and appearance of the training data are adjusted in order to optimize the evaluation results.

  • Finally, we test our newly generated model, referred to as MCLFIQ. For this, we use three real-world databases, which include contactless and contact-based fingerprints: the ISPFDv1 database [17], the AIT database [16] and the HDA database [4]. We consider three recognition workflows for the evaluation: one Commercial-Off-The-Shelf (COTS) system, one that is based on open-source algorithms and the NIST NBIS framework. Our MCLFIQ model is benchmarked against the latest version of NFIQ 2 (NFIQ 2.2), a quality metric that is based on sharpness and was developed for contactless fingerprint images and the re-trained No-Reference Image Quality Assessment (NR-IQA) algorithm BRISQUE. We report the predictive performance in terms of Error vs. Discard Characteristic (EDC) curves [18] for every combination of database, quality assessment algorithm and recognition workflow and analyze the EDC Partial Area Under Curve (EDC PAUC).
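The EDC evaluation named in the last step can be illustrated with a minimal sketch. It is not the evaluation code used in this work. It assumes similarity scores for mated comparisons, a fixed decision threshold, the minimum of the two sample quality scores as pairwise quality, and a stepwise discard of the lowest-quality comparisons. The function names `edc_curve` and `edc_pauc` are hypothetical.

```python
from typing import List, Sequence, Tuple


def edc_curve(
    pair_qualities: Sequence[Tuple[float, float]],
    mated_scores: Sequence[float],
    threshold: float,
) -> List[Tuple[float, float]]:
    """Return (discard fraction, FNMR) points for mated comparisons."""
    if len(pair_qualities) != len(mated_scores) or not mated_scores:
        raise ValueError("need one quality pair per mated comparison score")
    # Assumption: a comparison is only as good as its weaker sample.
    combined = [min(q1, q2) for q1, q2 in pair_qualities]
    # Order comparisons from lowest to highest pairwise quality.
    order = sorted(range(len(combined)), key=lambda i: combined[i])
    # 1 marks a false non-match (similarity score below the threshold).
    errors = [1 if mated_scores[i] < threshold else 0 for i in order]
    n = len(errors)
    remaining_errors = sum(errors)
    points = []
    for discarded in range(n):
        # FNMR over the comparisons that are still kept.
        points.append((discarded / n, remaining_errors / (n - discarded)))
        remaining_errors -= errors[discarded]
    return points


def edc_pauc(points: Sequence[Tuple[float, float]], max_discard: float = 0.2) -> float:
    """Trapezoidal area under the EDC curve for discard fractions <= max_discard."""
    kept = [(d, e) for d, e in points if d <= max_discard]
    area = 0.0
    for (d0, e0), (d1, e1) in zip(kept, kept[1:]):
        area += (d1 - d0) * (e0 + e1) / 2.0
    return area
```

A quality algorithm with good predictive performance makes the curve fall quickly as low-quality comparisons are discarded, so a smaller partial area indicates better behavior under these assumptions.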

Based on our investigations, we suggest further investigating contactless fingerprint quality assessment and considering the proposed MCLFIQ method as a starting point for a new standard algorithm. For this reason, the MCLFIQ model is made publicly available so that interested researchers can download and test MCLFIQ on their own databases and benchmark it against NFIQ 2.2 or other methods. Furthermore, we will provide the pre-processing pipeline so that it can be used and refined for other databases.3

The rest of the paper is structured as follows: Section II discusses the related work. In Section III aspects of quality assessment for contactless fingerprint recognition are presented and the applicability of NFIQ 2 is evaluated. Section IV introduces our proposed system. In Section V the experimental setup is explained based on which the experimental results are summarized in Section VI. Finally, Section VII concludes the paper.

SECTION II.

Related Work

Very few works investigate the sample quality of contactless fingerprint samples. Table I gives an overview of the most relevant works proposed so far.

TABLE I Overview on Published Works on the Topic of Contactless Fingerprint Quality Assessment. It Should be Noted That No Information Is Available on the Internal Functionality of the Verifinger Quality Metric Used by Wild et al. [9]

Parziale and Chen [10] suggested a coherence-based quality measurement. This approach measures the strength of the dominant direction in a local region. For this purpose, the authors applied a normalized coherence estimation on local gradients of the gray level intensity. Moreover, the covariance matrix of the gradient vectors was computed, which represents the clarity of the ridge-line structure. The algorithm is applied with different block sizes, and the individual results are averaged. The resulting “global quality index” lies in an open interval and is not normalized. The authors tested the proposed quality algorithm on their test database and divided the histogram into three equal-sized groups. In experiments, it was shown that the accuracy on partitions assessed with high quality is better than on partitions assessed with lower quality.

Labati et al. [11] implemented 45 different quality features which include, among others, the fingerprint Region Of Interest (ROI), various Fourier features and Gabor features. The authors studied different subsets of the implemented features and proposed the most promising feature vectors for their database. A feed-forward neural network and a k-Nearest-Neighbor (kNN) classifier were used to aggregate the individual feature vectors to a final quality score. In contrast to the aforementioned work, the authors obtained quality scores in a closed interval. The authors used a rather constrained data set and compared their method to NFIQ 1.0 [19]. It was shown that their own approach performed significantly better than the NFIQ 1.0 algorithm. However, it remains unclear if these findings generalize.

Li et al. [12], [13] introduced a quality assessment algorithm for finger images acquired with smartphones. The authors used different metrics in the spatial and frequency domain, which resulted in a feature vector. A Support Vector Machine (SVM) was trained to separate high-quality blocks from those with low quality. Like Parziale and Chen, the authors did not normalize the algorithm to achieve a closed interval of quality scores. They also grouped the considered database into three partitions according to quality scores. Again, the EER is lower on partitions of higher quality.

Liu et al. [14] evaluated generic quality factors for different contactless modalities, including contactless fingerprints. They conclude that contrast, sharpness, luminance and artifacts like sensor noise or compression artifacts are the most important factors. This general assessment corresponds closely to the findings of the other authors. It can be seen that sharpness and contrast related features like Fourier transformations (cf. [11], [12], [13]), Gabor filters (cf. [10], [11]) or image entropy (cf. [11]) are most suitable. Also, using different block sizes appears to be promising for a robust assessment (cf. [10], [11]).

The capabilities of the contact-based quality assessment algorithms NFIQ 1.0, NFIQ 2.0 and one which is included in the Veridium SDK were evaluated by Wild et al. [9]. The authors used a self-acquired database to evaluate the algorithms. In their experiments, the authors filtered samples based on quality scores and reported EERs on the remaining subsets. From their results, it is observable that all algorithms showed the intended behavior of assigning low quality values to samples which have a low biometric performance. The results also indicate that all algorithms might perform better on contact-based samples.

A preliminary evaluation of NFIQ 2.0 on contactless fingerprint and contact-based fingerprint databases was conducted by Priesnitz et al. [15]. The authors evaluated NFIQ 2.0 on publicly available data and reported its predictive power in terms of EDC curves. The study indicates that NFIQ 2.0 is, in general, suitable for the assessment of contactless samples, but a proper pre-processing is crucial for a high predictive power. However, the predictive performance of NFIQ 2.0 on contactless data is, in general, worse compared to the tested contact-based data. From that, we can conclude that NFIQ 2.0 in its current version is not suitable for contactless fingerprint quality assessment.

For contactless fingerprint recognition schemes, major challenges are a narrow depth of field, which may cause a de-focused fingerprint region of interest, low-quality camera setups and blur caused by finger movement during the capturing attempt. Kauba et al. [16] discussed a Canny filter-based quality assessment algorithm which analyzes the sharpness. Nevertheless, such sharpness-based methods have to be normalized in certain quality ranges and hardly generalize to new capturing schemes.

From the literature review, we can observe several weaknesses in the state-of-the-art. Many proposals of new algorithms were only evaluated on one database, which is not publicly available. Also, from the evaluation methodology used in the previous works, no clear conclusion regarding the predictive performance of the suggested quality assessment method can be extracted. Most studies only divide the tested database into subsets according to quality scores and report lower error rates on subsets of higher quality. Furthermore, many of the proposed algorithms do not consider any fingerprint-related features; instead, they focus on sharpness or contrast measures. Therefore, it is assumed that sharp images with a high fidelity result in high quality scores, which do not correspond to utility because fingerprint character is neglected. Moreover, it is observed that NFIQ 2 shows many advantages over other proposed algorithms, e.g., quality features that have proven beneficial for a robust quality assessment. Furthermore, the included random forest classifier can be re-trained for special characteristics, e.g., those caused by the capturing setup. For this reason, NFIQ 2 is considered the most promising framework on which to base a contactless fingerprint quality assessment algorithm.

SECTION III.

Biometric Sample Quality for Contactless Fingerprint Images

This section introduces prerequisites and requirements for utility-driven quality assessment. Finally, the general suitability of features included in NFIQ 2 is evaluated.

A. Prerequisites for Quality Assessment

State-of-the-art mobile contactless fingerprint recognition setups as proposed in [4], [16], [20] typically implement an automatic workflow which captures four inner-hand fingers. The capturing subsystem has to ensure that the captured image fulfills certain prerequisites to be suitable for further processing.

  • Finger separation: In a four finger capturing scenario, the finger ROIs have to be separated from each other. Here, the captured hand photo is separated into four individual finger images. This task can be done e.g., by deep learning-based methods [21] or an on-screen guidance [20].

  • Size of the fingerprint ROI: In mobile contactless capturing scenarios, the distance between the capturing device and the hand can be freely chosen. However, if the distance is too high, the resolution of the ROI is too low for an extraction of the ridge pattern. On-screen guidance and user feedback can effectively avoid this capturing failure.

  • Brightness and contrast of the ROI: Especially in capturing scenarios with unconstrained illumination, the capturing subsystem has to ensure that the captured ROI has a proper brightness and contrast which allows an extraction of the ridge pattern. Here, an additional light source, e.g., the flashlight of a smartphone, is considered to be beneficial.

  • Sharpness: Contactless capturing schemes are vulnerable to sharpness related issues caused by movement and lens properties. In capturing scenarios with low environmental light, the shutter speed is slow and therefore the captured image can contain motion blur. Furthermore, the aperture of the lens is very wide, which leads to a very narrow depth of field. This can lead to de-focused images with no clear ridge pattern.

  • Finger positioning: In unconstrained capturing setups, the fingers can be presented to the camera at various angles. Here, yaw and pitch angles can be easily corrected, whereas rotations around the roll axis are a major challenge. A strong deviation from a central finger perspective leads to shifted minutiae positions and hence degraded accuracy.

B. Applicability of NFIQ 2 for Contactless Fingerprints

NFIQ 2 is the de facto standard quality assessment algorithm for contact-based fingerprints at a resolution of 500 PPI. The random forest classifier in combination with hand-crafted quality components included in NFIQ 2 offers distinct advantages over deep-learning approaches, e.g., CNN-based methods:

  • Deep-learning-based methods do not provide an easy way to give actionable feedback to the user. Furthermore, it can be challenging to interpret the decision-making process. In contrast, with NFIQ 2, it is possible to make well-founded decisions based on a subset of quality components.

  • Deep-learning models typically generalize rather well, especially when trained on large, diverse and representative datasets. To the authors’ knowledge, suitable databases for the training of a deep-learning-based quality assessment algorithm do not exist. It is especially noteworthy that a candidate for a new standard algorithm should accurately predict the sample quality for all samples it is designed for.

The goal of NFIQ 2 is to provide an algorithm that is suitable for all recognition workflows for which it was designed. NFIQ 2 incorporates 74 unique hand-crafted quality components that have a high predictive performance and comprehensively cover important aspects of fingerprint images, such as minutiae count, contrast, sample clarity and size of the ROI. Furthermore, NFIQ 2 is able to provide actionable feedback based on individual features to the user, e.g., if a fingerprint sample is blurred or lacks contrast.

The NFIQ 2 feature vector also includes measures which are highly sensitive to the sharpness of a fingerprint sample. The local clarity score uses a linear regression function to determine a gray level threshold and classifies the pixels as ridge or valley. Afterward, the score is computed as the block-wise clarity based on those ridges and valleys. The orientation certainty level indicates whether a fingerprint sample contains a clear ridge-line structure or not. For this, the strength of the energy concentration along the ridge flow is analyzed. If a fingerprint sample is rather sharp, the orientation certainty level is higher, which subsequently indicates a higher utility. Also, the component ROI Orientation Map Coherence Sum computes the coherence map of the orientation field estimation as a feature by analyzing oriented patterns as described in [22]. Many of the NFIQ 2 features are directly comparable to features which have been considered for contactless fingerprint quality assessments in previous works, cf. Section II. The entire list of features can be seen in Table XII.
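To make the orientation certainty level more concrete, the following minimal sketch computes it for a single gray-level block. It assumes the common eigenvalue formulation, in which the value is one minus the ratio of the smaller to the larger eigenvalue of the gradient covariance matrix. The central-difference gradient and the function name `orientation_certainty_level` are simplifying assumptions and do not reproduce the NFIQ 2 implementation.

```python
import math
from typing import Sequence


def orientation_certainty_level(block: Sequence[Sequence[float]]) -> float:
    """Return 1 - lambda_min / lambda_max of the block's gradient covariance."""
    rows, cols = len(block), len(block[0])
    if rows < 3 or cols < 3:
        raise ValueError("block must be at least 3x3 pixels")
    a = b = c = 0.0
    count = 0
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Central differences (assumption; other gradient operators work too).
            gx = (block[y][x + 1] - block[y][x - 1]) / 2.0
            gy = (block[y + 1][x] - block[y - 1][x]) / 2.0
            a += gx * gx
            b += gy * gy
            c += gx * gy
            count += 1
    a, b, c = a / count, b / count, c / count
    # Closed-form eigenvalues of the symmetric 2x2 matrix [[a, c], [c, b]].
    root = math.sqrt((a - b) ** 2 + 4.0 * c * c)
    lam_max = (a + b + root) / 2.0
    lam_min = (a + b - root) / 2.0
    if lam_max <= 0.0:
        return 0.0  # flat block: no gradient energy, no orientation
    return 1.0 - lam_min / lam_max
```

A block with clean parallel ridges concentrates its gradient energy in one direction and yields a value close to 1, whereas a blurred or flat block yields a value close to 0.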

TABLE II Mobile Contactless Fingerprint Databases Considered for the Evaluation. It Should be Noted That the AIT Database Was Captured in Seven Sessions Under Different Environmental Conditions
TABLE III Overview on the Evaluation Process, Including Relevant Metrics
TABLE IV Overview on the Ten Most Important Features of MCLFIQ and the Original NFIQ 2.2 incl. Their Relative Importance
TABLE V EERs Obtained on the Considered Subsets Using Two Different Recognition Workflows. Note That the Labels of the Test Datasets are Introduced in Table II
TABLE VI EDC PAUC in Range [0, 0.2] Obtained From the Contactless Databases Using IDKit
TABLE VII EDC PAUC in Range [0, 0.2] Obtained From the Contactless Databases Using the Open-Source Method
TABLE VIII EDC PAUC in Range [0, 0.2] Obtained From the Contactless Databases Using the NBIS Algorithms
TABLE IX EDC PAUC in Range [0, 0.2] Obtained From the Contact-Based Databases Using IDKit
TABLE X EDC PAUC in Range [0, 0.2] Obtained From the Contact-Based Databases Using the Open-Source Method
TABLE XI EDC PAUC in Range [0, 0.2] Obtained From the Contact-Based Databases Using the NBIS Algorithms
TABLE XII Feature Importance of MCLFIQ and NFIQ 2.2

Since many features included in NFIQ 2 accurately assess quality measures of contactless fingerprints, NFIQ 2 is, in general, applicable for contactless fingerprints. However, its predictive performance is degraded compared to its performance on contact-based fingerprints [15]. This is because the random forest is not trained on contactless fingerprints.

As discussed, contactless fingerprints present different challenges compared to contact-based ones. For this reason, the importance of each individual feature has to be adjusted to improve the predictive power. This can be achieved by re-training the random forest classifier of NFIQ 2.

For training NFIQ 2, annotated data is required. Since the training uses a binary classification algorithm, the training data has to be labelled with labels for high and low fingerprint quality. Then, the training of the random forest consists of two steps: firstly, the feature extraction sub-system computes all features for the labelled training data and secondly, the random forest determines a configuration for every tree. From the final configuration, we can determine the importance of every feature, which indicates how much influence each single feature has on the final unified quality score.
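The two training steps and the notion of feature importance can be illustrated with a deliberately simplified stand-in: a forest of bootstrapped decision stumps instead of the full random forest used by NFIQ 2. The sketch assumes binary labels (0 = low, 1 = high quality), Gini impurity as the split criterion, importance as the normalized sum of impurity decreases, and a quality score equal to the share of trees voting "high" scaled to [0, 100]. All names are hypothetical, and the sketch does not reproduce the NFIQ 2 training code.

```python
import random
from typing import List, Optional, Sequence, Tuple

# (feature index, threshold, vote if value <= threshold, vote if value > threshold)
Stump = Tuple[int, float, int, int]


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of binary labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def majority(labels: Sequence[int]) -> int:
    """Majority label of one side of a split (ties count as high)."""
    return 1 if 2 * sum(labels) >= len(labels) else 0


def best_stump(X: Sequence[Sequence[float]], y: Sequence[int]) -> Optional[Tuple[Stump, float]]:
    """Find the single split with the largest impurity decrease."""
    parent = gini(y)
    best = None
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            gain = parent - weighted
            if best is None or gain > best[1]:
                best = ((f, t, majority(left), majority(right)), gain)
    return best


def train_forest(X, y, n_trees: int = 25, seed: int = 0):
    """Train stumps on bootstrap samples; return (forest, feature importance)."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    forest: List[Stump] = []
    importance = [0.0] * n_features
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        found = best_stump([X[i] for i in idx], [y[i] for i in idx])
        if found is None or found[1] <= 0.0:
            continue  # this bootstrap sample could not be split usefully
        stump, gain = found
        forest.append(stump)
        importance[stump[0]] += gain
    total = sum(importance)
    if total > 0.0:
        importance = [v / total for v in importance]
    return forest, importance


def quality_score(forest: Sequence[Stump], features: Sequence[float]) -> int:
    """Share of trees voting 'high quality', scaled to an integer in [0, 100]."""
    if not forest:
        raise ValueError("forest is empty")
    votes = sum(right if features[f] > t else left for f, t, left, right in forest)
    return round(100 * votes / len(forest))
```

Re-training on data from a different capture type changes which features the trees split on, which is what shifts the reported importance of each feature while the feature set and the score range stay the same.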

It should be noted that NFIQ 2 refers to the general redesign as documented in NISTIR 8382 [8], whereas NFIQ 2.2 refers to an individual release.4 NFIQ 2.2 was the latest version at the time the experiments were conducted.

Using NFIQ 2 as the basis for a contactless fingerprint quality assessment algorithm has several advantages. We can use a set of well-engineered, tested and ISO/IEC 29794–4 compliant features which have been precisely calibrated on fingerprint samples [7]. Thus, we are consistent with the vast majority of contactless fingerprint recognition schemes that use feature extraction algorithms from the contact-based domain.

SECTION IV.

The MCLFIQ Method

The NFIQ 2 framework is designed in a way that it is possible to adjust it to special characteristics of fingerprint images, e.g., samples captured with different sensor types. Specifically, the random forest parameters are trained on data captured by the target capturing device type. Here, the included quality features and the range of quality scores remain the same, which makes different models highly comparable to each other. For this reason, the NFIQ 2 re-training framework is highly capable of proposing a new model for mobile contactless fingerprint images.

A. Sample Pre-Processing

Contactless finger images which fulfil the prerequisites discussed in Section III-A are not automatically aligned to the requirements of quality assessment and recognition. For this reason, contactless fingerprint images need to be pre-processed in order to make them processable with established algorithms. A contactless fingerprint pre-processing pipeline is a set of algorithms which transfer the colored contactless finger image into a fingerprint sample. Many combinations of pre-processing algorithms show advantages and drawbacks on different databases. For this reason, we define framework requirements for a contactless fingerprint sample instead of specifying concrete algorithms:

  • Rotation: The fingerprint sample should be rotated to an upright position.

  • Cropping: The sample should only contain the fingerprint area. Also, the fingerprint image should be cropped approximately at the first finger knuckle.

  • Normalization: The sample should be scaled so that the ridge-line period, i.e., the distance between adjacent ridges, is approximately 8 – 12 pixels [7].

  • Background separation: The fingerprint sample should be precisely separated from the background so that the non-fingerprint area is white.

  • Gray scale conversion: The fingerprint sample should contain only grayscale values.

  • Emphasized ridge pattern: The ridge pattern should be emphasized to align with a contact-based fingerprint.
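Some of these requirements can be sketched without committing to concrete algorithms. The following minimal example covers gray scale conversion, background separation, cropping and the scale factor for normalization. It assumes that a binary finger mask is already available, uses ITU-R BT.601 luma weights for the conversion, and interprets the 8 to 12 pixel figure as a ridge-to-ridge distance with a hypothetical target of 10 pixels. Rotation and ridge enhancement are left out, and all function names are illustrative.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]


def to_gray_with_white_background(
    image: Sequence[Sequence[RGB]],
    finger_mask: Sequence[Sequence[bool]],
) -> List[List[int]]:
    """Convert an RGB finger image to 8-bit gray and paint non-finger pixels white."""
    gray = []
    for row, mask_row in zip(image, finger_mask):
        out_row = []
        for (r, g, b), is_finger in zip(row, mask_row):
            if is_finger:
                # ITU-R BT.601 luma weights (assumed conversion).
                out_row.append(int(round(0.299 * r + 0.587 * g + 0.114 * b)))
            else:
                out_row.append(255)  # background becomes white
        gray.append(out_row)
    return gray


def crop_to_mask(
    gray: Sequence[Sequence[int]],
    finger_mask: Sequence[Sequence[bool]],
) -> List[List[int]]:
    """Crop the gray image to the bounding box of the finger mask."""
    ys = [y for y, row in enumerate(finger_mask) if any(row)]
    xs = [x for row in finger_mask for x, inside in enumerate(row) if inside]
    if not ys:
        raise ValueError("finger mask is empty")
    return [list(row[min(xs): max(xs) + 1]) for row in gray[min(ys): max(ys) + 1]]


def ridge_period_scale(measured_period_px: float, target_period_px: float = 10.0) -> float:
    """Resize factor that maps the measured ridge-to-ridge distance to the target."""
    if measured_period_px <= 0.0:
        raise ValueError("measured period must be positive")
    return target_period_px / measured_period_px
```

For example, a sample whose ridges are 20 pixels apart would be resized by a factor of 0.5 under the assumed 10 pixel target before quality assessment.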

These requirements highly align with recommendations of many established tools from the contactless and contact-based domain. Also, many works propose pre-processing workflows which align to these requirements [23], [24], [25], [26], [27], [28]. From the cited literature, it is also observable that different capturing methods are used and that the pre-processing algorithms are optimized for the dedicated capturing setup.

B. Training Process

For the MCLFIQ training, we use the NFIQ 2 framework and benefit from its advantages of an established random forest framework and standardized quality features for fingerprint samples. Nevertheless, MCLFIQ is a unique model since we re-trained NFIQ 2 on an entirely different database, thereby changing its use case from contact-based to contactless fingerprints. An overview of the conducted training steps can be seen in Figure 2.

Fig. 2. - Overview of the MCLFIQ training workflow using the NFIQ 2 framework and important metrics for training and evaluation of the Random Forest.

The training process requires a data set of contactless fingerprint images which includes samples of high and low quality. To the authors' knowledge, there are only a few contactless fingerprint databases publicly available, and these are too small to train the NFIQ 2 random forest classifier. Also, the training and testing should be conducted on different databases to give a fair indication of predictive performance.

Moreover, labels which indicate whether a sample is of high or low quality are required. An algorithm for labeling biometric training data is proposed in [29]. However, this approach may be biased towards the comparison algorithm used and may not be robust against mislabeling, which negatively impacts the predictive performance of the proposed system. Alternatively, experts could label the fingerprints manually, which is time-consuming and requires in-depth domain knowledge. For this reason, it is impractical to manually annotate large datasets.

Especially in the context of limited real data, using synthetic data for training the random forest classifier is an appropriate alternative [30]. Another advantage is that all available real-world databases remain available for testing purposes. We use the method described in [31] for generating a mobile contactless fingerprint database for the training of MCLFIQ. SynCoLFinGer is a synthetic contactless fingerprint generator which aims to generate samples as captured by smartphones. SynCoLFinGer is based on a modelling approach which takes an SFinGe [32] ridge pattern and applies various filters to it, such as deformations, distortions and noise, which simulate a contactless capturing process, subject characteristics such as skin color, and environmental influences. For each filter, an intensity between 0 (low impact) and 100 (high impact) can be configured. The combination of all filter intensities then defines the utility of the generated sample. For this reason, SynCoLFinGer can be precisely adapted to generate a well-suited training database of heterogeneous quality.

SECTION V.

Experimental Setup

This section presents the experimental setup which is required to implement and evaluate MCLFIQ. First, the training and evaluation databases are introduced. Further, algorithms for pre-processing and recognition are described. An overview of the considered evaluation workflow can be seen in Figure 3.

Fig. 3. - Example images of the training database generated by SynCoLFinGer with corresponding pre-processed images and quality parameters: (a) high-quality preset and (b) low-quality preset.

A. Training Database

Due to the lack of publicly available databases, the contactless fingerprint training database considered in this work is generated synthetically by the SynCoLFinGer method [31]. SynCoLFinGer supports the generation of contactless finger images of distinct quality. For this, the generation can be configured with a single configuration parameter between 0 (low quality) and 100 (high quality). From this configuration parameter, the parameters for subject characteristics and environmental influences are derived. It should be noted that this configuration parameter is not correlated to NFIQ 2 scores.

From a set of ridge patterns, we generate two subsets: one of high quality and one of low quality. For this, the configuration parameters for the low-quality subset are set to a range between 0 and 33, whereas the high-quality subset is generated with a preset of values between 66 and 100. Figure 3 presents sample images of high and low quality, which are included in the training database. As can be seen from the picture, high-quality samples contain, e.g., fewer rotation distortions and a clearer ridge-line pattern, whereas low-quality samples are distorted more and show more noise and dirt artifacts. It should be noted that due to random variables which are incorporated in SynCoLFinGer, the training database also contains samples of moderate quality. This method automatically generates the ground truth by assigning high and low-quality labels to the samples.

B. Evaluation Databases

For our experiments, we employ three different contactless fingerprint evaluation databases. All databases are captured using smartphones in a mobile scenario. The databases used and their properties are listed in Table II and briefly summarized as follows:

AIT Database [16]: The dataset consists of 14 subjects whose four inner hand fingers were recorded in 7 different scenarios. The scenarios consist of two office-like environments, four open-air scenarios and one cellar scenario to simulate nighttime recording. The acquisition was carried out using an iPhone 11, which recorded videos. In total, 196 videos were recorded. Each video with a duration between 10 and 15 seconds was split in two parts of equal duration. For each video part, the fingerprint with the highest sharpness within the first five seconds was selected and extracted. As a result, the dataset is composed of 1,568 contactless fingerprint samples in total. The database also contains a subset of contact-based samples. Example images of the database are presented in Figure 4. Further details about the dataset can be found in [16].

Fig. 4. - Example images of the AIT mobile database captured in scenario 6 (lattice): (a) before pre-processing, (b) after pre-processing, (c) contact-based.

HDA Database [4]: The HDA database consists of contactless samples captured in two different setups. A box-setup simulates a constrained dark environment, whereas a tripod-setup simulates an unconstrained capturing scenario. For the capturing, we used two different smartphones: the Google Pixel 4 (constrained scenario) and the Huawei P20 Pro (unconstrained scenario). An application automatically captured the four inner-hand fingers and processed them to fingerprint samples. During the database acquisition, contact-based samples were also captured. Example images of the database are presented in Figure 5.

Fig. 5. - Example images of the HDA database taken from the unconstrained capturing scenario: (a) before pre-processing, (b) after pre-processing, (c) contact-based.

ISPFD [17]: the IIITD SmartPhone Fingerphoto Database v1 consists of contactless fingerprint images acquired using an Apple iPhone 5 smartphone. It includes finger images captured in indoor and outdoor scenarios with natural and white background. Figure 6 depicts example images of the ISPFD database.

Fig. 6. - Example images of the ISPFD database taken from the natural outdoor sub-database: (a) before pre-processing, (b) after pre-processing, (c) contact-based.

FVC2006 [33]: the database of the fourth international Fingerprint Verification Competition (FVC) contains four disjoint fingerprint subsets. The first three subsets are each collected with a different contact-based sensor, while the fourth database is synthetically generated. We only use the subsets DB2 and DB3 as the others are not considered useful for our experiments. Example images of the FVC2006 database are depicted in Figure 7.

Fig. 7. - Examples of the FVC2006 database: (a) DB2, (b) DB3.

It should be noted that all considered contactless databases fulfill the prerequisites discussed in Section III-A. However, they are rather small compared to real-world application scenarios and do not represent a typical population.

C. Database Pre-Processing

To extract features from contactless fingerprints with tools designed for the contact-based domain, a pre-processing step has to be applied which transforms a contactless finger image into a sample equivalent to a contact-based one. According to our suggestion in Section IV-A, we use the same pre-processing pipeline for all databases to achieve a consistent appearance across all samples.

Since the ISPFD database contains unsegmented and un-rotated finger photos, the fingerprint region of interest is segmented by a deep-learning-based semantic method [21]. The method uses a DeepLabv3+ model, which was fine-tuned for the segmentation of fingertips. The segmented finger image is then rotated to an upright position. All other databases provide already segmented and rotated fingerprint images, so this step is omitted.

The segmented data is then converted to gray scale and a Contrast Limited Adaptive Histogram Equalization (CLAHE) is applied to emphasize the ridge-line characteristics. The CLAHE algorithm is iteratively applied with decreasing size of tile grids. First, this process equalizes the brightness throughout the fingerprint region and second, emphasizes the ridge pattern. Next, the fingerprint samples are normalized to a fixed ridge-line frequency of approximately 9 pixels, which aligns to approximately 500 PPI live-scanned fingerprints and is favored by NFIQ 2 and the recognition algorithms. Here, the ridge-to-ridge distance is measured and the fingerprint image is re-sized accordingly. All images are converted into a uniform file format to fulfill the requirements of NFIQ 2 and the recognition workflows.
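The resizing step of the normalization can be illustrated with a small calculation. The following is a minimal sketch, assuming the ridge-to-ridge distance has already been measured in pixels; the function names are hypothetical and the measurement itself is not shown.

```python
from typing import Tuple


def resize_factor(measured_ridge_distance_px: float,
                  target_ridge_distance_px: float = 9.0) -> float:
    """Scale factor that maps the measured ridge-to-ridge distance onto the target.

    The default target of 9 pixels follows the text; everything else is illustrative.
    """
    if measured_ridge_distance_px <= 0:
        raise ValueError("ridge distance must be positive")
    return target_ridge_distance_px / measured_ridge_distance_px


def normalized_size(width: int, height: int,
                    measured_ridge_distance_px: float,
                    target_ridge_distance_px: float = 9.0) -> Tuple[int, int]:
    """Image size after resizing so that ridges are about `target` pixels apart."""
    scale = resize_factor(measured_ridge_distance_px, target_ridge_distance_px)
    return max(1, round(width * scale)), max(1, round(height * scale))


# A sample with ridges 12 pixels apart is shrunk to 75% of its size.
print(resize_factor(12.0))               # 0.75
print(normalized_size(800, 1200, 12.0))  # (600, 900)
```

The same factor is applied to both image dimensions, so the aspect ratio of the fingerprint is preserved.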

D. Training Process

The NFIQ 2 training framework is maintained by the International Organization for Standardization (ISO). The framework provides a process which consists of steps for data labelling, training the random forest parameters and evaluating the random forest.

The labeled training database of our final training attempt consists of 40,000 synthetic samples, 30,000 for training and 10,000 for evaluation. The database is generated to consist of 50% high-quality samples and 50% low-quality samples (c.f. Section V-A).
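The composition of such a database can be sketched as follows. This is an illustration only: it draws quality presets from the low band (0 to 33) and the high band (66 to 100) of Section V-A and splits them into training and evaluation parts. Integer presets, uniform sampling, the random split and the label encoding (1 for high quality, 0 for low quality) are assumptions, and the image generation by SynCoLFinGer is not part of the sketch.

```python
import random

LOW_BAND = (0, 33)     # preset range of the low-quality subset (Section V-A)
HIGH_BAND = (66, 100)  # preset range of the high-quality subset (Section V-A)


def build_labeled_presets(n_total, n_train, seed=0):
    """Return (train, evaluation) lists of (preset, label) pairs.

    Half of the entries are low quality (label 0) and half are high quality
    (label 1). The label encoding and the uniform sampling are assumptions.
    """
    rng = random.Random(seed)
    half = n_total // 2
    samples = [(rng.randint(*LOW_BAND), 0) for _ in range(half)]
    samples += [(rng.randint(*HIGH_BAND), 1) for _ in range(n_total - half)]
    rng.shuffle(samples)
    return samples[:n_train], samples[n_train:]


train, evaluation = build_labeled_presets(40_000, 30_000)
print(len(train), len(evaluation))  # 30000 10000
```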

During the training process, a new random forest is built based on the labeled training data. The training parameters (100 trees in the random forest, maximum depth of each tree of 25, 10 randomly sampled variables as split candidates, minimum sample count per leaf of 2 and tree pruning) are the same as in NFIQ 2. During the validation process, the automatic assignment of quality labels has proven to work accurately, since only six samples have been misclassified. These samples were generated with the high-quality preset, but validated by the NFIQ 2 framework as low quality. We manually re-labeled them and thus obtained a final training database of 19,994 high-quality samples and 20,006 low-quality samples.

The trained random forest outputs a class membership along with its probability. The final NFIQ 2 score is the probability that a given image belongs to class 1 multiplied by 100 and rounded to its closest integer.
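The mapping from class probability to score is simple enough to state in code. The sketch below assumes that ties at .5 are rounded up, which the text does not specify; the function name is hypothetical.

```python
import math


def quality_score(p_class1: float) -> int:
    """Map the probability of class 1 to an integer score in [0, 100].

    Rounds half up (an assumption); Python's built-in round() would round
    halves to the nearest even integer instead.
    """
    if not 0.0 <= p_class1 <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return int(math.floor(p_class1 * 100 + 0.5))


print(quality_score(0.734))  # 73
print(quality_score(1.0))    # 100
```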

E. Considered Baseline Algorithms

To evaluate the predictive performance of MCLFIQ in comparison to established quality assessment algorithms, we select NFIQ 2.2, a sharpness-based quality estimation algorithm introduced by the AIT [16], and BRISQUE [34].

As discussed in Section II, it is also possible to assess the quality of contactless fingerprints using NFIQ 2. Even though it is not designed for this use case, it includes many quality features which are also of high relevance for contactless samples. Additionally, the practical applicability on contactless fingerprints has been shown in [15].

As a second algorithm, we adapted a sharpness-based quality assessment algorithm introduced by Kauba et al. [16]. It works as follows: firstly, all fingerprints are scaled to the same image width in order to reduce the effect of the distance between fingertip and camera sensor on the sharpness calculation (Figure 8(a)). Secondly, an elliptical mask overlays the fingerprint image sample. The mask consists of two nested ellipses, and only the area between both ellipses is considered for calculating the sharpness (Figure 8(b)). Thirdly, the Canny edge detector [35] is applied for edge detection (Figure 8(c)). Finally, the sharpness value is the ratio of the number of edge pixels to the number of valid pixels as defined by the mask. We normalize the resulting floating-point sharpness value to an integer between 0 and 100 in order to integrate it into our workflow.
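The masking and ratio steps can be sketched as follows. This is not the implementation of Kauba et al.: the binary edge map is assumed to be given (e.g., produced by a Canny detector, which is not implemented here), and the ellipse scale factors `outer` and `inner` as well as the normalization constant `max_ratio` are hypothetical values chosen for illustration.

```python
def elliptical_ring_mask(height, width, outer=0.95, inner=0.5):
    """Boolean mask that is True between two nested, centered ellipses.

    `outer` and `inner` scale the ellipse inscribed in the image; both values
    are illustrative assumptions.
    """
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    ry, rx = height / 2.0, width / 2.0
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            d = ((y - cy) / ry) ** 2 + ((x - cx) / rx) ** 2
            row.append(inner ** 2 <= d <= outer ** 2)
        mask.append(row)
    return mask


def sharpness_ratio(edge_map, mask):
    """Number of edge pixels inside the mask divided by the number of mask pixels."""
    valid = sum(1 for row in mask for m in row if m)
    if valid == 0:
        raise ValueError("mask contains no valid pixels")
    edges = sum(1 for e_row, m_row in zip(edge_map, mask)
                for e, m in zip(e_row, m_row) if m and e)
    return edges / valid


def to_quality_score(ratio, max_ratio=0.25):
    """Map the ratio to an integer in [0, 100]; `max_ratio` is a hypothetical cap."""
    return max(0, min(100, int(round(100.0 * ratio / max_ratio))))


mask = elliptical_ring_mask(20, 20)
edges = [[(x % 4 == 0) for x in range(20)] for _ in range(20)]  # toy edge map
print(to_quality_score(sharpness_ratio(edges, mask)))
```

Restricting the count to the ring between the ellipses excludes both the finger contour and the central region from the sharpness estimate.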

Fig. 8. - Visualization of the sharpness-based quality estimation. (a) original fingerprint image (converted to gray scale), (b) superimposed elliptical mask, (c) computed Canny edges.

As a third quality assessment algorithm, we employ the blind/referenceless image spatial quality evaluator (BRISQUE) introduced by Mittal et al. [34]. BRISQUE is a no-reference image quality assessment algorithm designed to evaluate the naturalness and quality of images. The quality prediction is performed by a Support Vector Machine (SVM) Regressor (SVR). We re-trained BRISQUE using the same SynCoLFinGer database as for MCLFIQ. As for MCLFIQ, the quality annotations of the training data in the range [0, 100] are directly generated by SynCoLFinGer. It should be noted that no pre-processing, i.e., no gray-scale conversion, was conducted for the training data. Accordingly, for testing, the samples were only segmented, cropped and rotated.

F. Recognition Algorithms

For our experiments, we use three recognition algorithms: a Commercial Off-The-Shelf (COTS) system and two open-source fingerprint recognition systems.

1) Commercial Off-the-Shelf (COTS) System:

The fingerprint recognition system IDKit SDK from Innovatrics is used as COTS software. The system is originally designed for contact-based fingerprint samples, but has also proven to work robustly with pre-processed contactless samples [16].

2) Open-Source Fingerprint Recognition System:

The first considered open-source fingerprint recognition system is based on the FingerNet feature extractor of Tang et al. [36] and the minutiae pairing and scoring algorithm of the SourceAFIS system of Važan [37]. The original algorithm uses minutiae quadruplets, i.e., it additionally considers the minutiae type (ridge ending or bifurcation). Since the minutiae extractor used here outputs only minutiae triplets, which do not carry this information, the algorithm has been modified to ignore the minutiae type.

3) NIST NBIS Framework:

The second considered open-source fingerprint recognition system is the NBIS framework developed by NIST. We used the MINDTCT tool to extract minutiae information and BOZORTH3 to compute the template comparison scores. It should be noted that all experiments were conducted using the default parameters without any optimizations, like minutiae quality threshold adaptations.

G. Measuring Biometric Sample Quality

The aspects (c.f. Section I) of biometric quality have to be expressed in an objective manner to ensure that performance can be measured and compared between different systems. Tabassi et al. [19] proposed an approach for objective performance assessment based on a measure of the distance between the mated and non-mated comparison score distributions for a given sample. Well separated distributions imply that the likelihood of false accept or false reject is low, and that it increases with greater overlap between the distributions. This approach is generalized in ISO/IEC 29794-1:2009 [38] which requires that the quality score output of a biometric quality assessment algorithm conveys the predicted utility of the biometric sample.

For evaluating the predictive power of a quality assessment algorithm of a biometric recognition system, Grother and Tabassi [39] introduced the Error vs. Reject Curve (ERC). This method evaluates whether a rejection of low-quality samples results in a reduced False Non-Match Rate (FNMR). Each mated comparison is associated with a similarity score $s_{ii}$ and two quality scores $q_{i}^{(1)}$ and $q_{i}^{(2)}$. In order to aggregate the pair of quality scores from a pair of samples to be compared, the $\min$ function is chosen as combination function:
\begin{equation*} q_{i} = \min\left(q_{i}^{(1)}, q_{i}^{(2)}\right) \tag{1}\end{equation*}

Then a set $R(u)$ is formed containing the pairwise minima which are less than a fixed threshold of acceptable quality $u$:
\begin{equation*} R(u) = \left\{i: \min\left(q_{i}^{(1)}, q_{i}^{(2)}\right) < u\right\} \tag{2}\end{equation*}

Subsequently, $R(u)$ is used to exclude comparison scores and compute the FNMR on the rest. Starting with the lowest of the pairwise minima, comparisons are excluded up to a threshold $t$ obtained using the empirical cumulative distribution function of the comparison scores, corresponding to a FNMR of interest denoted by $f$:
\begin{equation*} t = M^{-1}(1-f) \tag{3}\end{equation*}
The ERC is then computed by iteratively excluding a portion of samples and recomputing the FNMR on the remaining comparison scores which are below the threshold:
\begin{equation*} \mathrm{FNMR}(t,u) = \frac{\left|\left\{s_{ii}: s_{ii} \le t,\ i \notin R(u)\right\}\right|}{\left|\left\{s_{ii}: s_{ii} \le \infty\right\}\right|} \tag{4}\end{equation*}

Since a fraction of low-quality samples is excluded in every iteration step, the FNMR should decrease steadily if the quality measure is a good predictor of the biometric performance. This method is widely adopted by the research community and is also known as the Error vs. Discard Characteristic (EDC) curve [18].
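Equations (1) to (4) can be traced on a toy example. The following is a minimal sketch, not the evaluation code used for the experiments. It assumes that the threshold $t$ is chosen as the empirical $f$-quantile of the mated comparison scores, so that the FNMR before discarding is about $f$; this reading of Eq. (3), the function names and the toy data are assumptions.

```python
def pair_quality(q1, q2):
    """Eq. (1): the quality of a comparison is the minimum of both sample qualities."""
    return min(q1, q2)


def threshold_for_fnmr(scores, f):
    """Score threshold t such that a share of about f of the mated scores is <= t."""
    ordered = sorted(scores)
    k = int(f * len(ordered))  # comparisons allowed to fall at or below t
    return ordered[k - 1] if k > 0 else float("-inf")


def fnmr(scores, qualities, t, u):
    """Eq. (4): false non-matches among comparisons not rejected by Eq. (2).

    The denominator counts all mated comparisons, as Eq. (4) is written;
    normalizing by the remaining comparisons would be a one-line change.
    """
    rejected = {i for i, (q1, q2) in enumerate(qualities)
                if pair_quality(q1, q2) < u}  # Eq. (2)
    misses = sum(1 for i, s in enumerate(scores) if s <= t and i not in rejected)
    return misses / len(scores)


def edc_curve(scores, qualities, f):
    """List of (discard fraction, FNMR) points for increasing quality thresholds u."""
    t = threshold_for_fnmr(scores, f)
    minima = sorted({pair_quality(q1, q2) for q1, q2 in qualities})
    curve = []
    for u in minima + [minima[-1] + 1]:
        discarded = sum(1 for q1, q2 in qualities if pair_quality(q1, q2) < u)
        curve.append((discarded / len(scores), fnmr(scores, qualities, t, u)))
    return curve


# Toy data: the two lowest comparison scores also have the lowest quality.
scores = [10, 12, 50, 55, 60, 62, 70, 75, 80, 90]
qualities = [(5, 40), (30, 10), (50, 60), (70, 55), (60, 90),
             (65, 80), (70, 70), (75, 99), (95, 80), (85, 88)]
print(edc_curve(scores, qualities, f=0.2)[:3])
# [(0.0, 0.2), (0.1, 0.1), (0.2, 0.0)]
```

In this toy case the quality measure is a perfect predictor: discarding the two lowest-quality comparisons removes both false non-matches.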

In order to compare different EDCs, the area under each curve is computed up to a pre-defined discard rate and denoted as Error vs. Discard Characteristic Partial Area Under Curve (EDC PAUC). Here, the threshold is set to $x = 0.2$ to only consider the most relevant part of the curve.
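A possible way to compute this partial area is sketched below. The curve is assumed to be given as a list of (discard fraction, FNMR) points sorted by discard fraction; trapezoidal integration with linear interpolation at the cut-off is an assumption, since the integration rule is not specified in the text, and the function name is hypothetical.

```python
def edc_pauc(curve, max_discard=0.2):
    """Area under an EDC curve between discard fractions 0 and `max_discard`.

    `curve` is a list of (discard_fraction, fnmr) points sorted by discard
    fraction. Trapezoidal integration is an illustrative choice.
    """
    area = 0.0
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if x0 >= max_discard:
            break
        if x1 > max_discard:
            # Cut the last segment at the discard limit by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_discard - x0) / (x1 - x0)
            x1 = max_discard
        area += 0.5 * (y0 + y1) * (x1 - x0)
    return area


curve = [(0.0, 0.2), (0.1, 0.1), (0.2, 0.0), (1.0, 0.0)]
print(round(edc_pauc(curve), 6))  # 0.02
```

A lower value indicates that discarding low-quality samples reduces the FNMR faster within the considered range.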

SECTION VI.

Results

In this section, we present the results of the MCLFIQ training and validation. Furthermore, we discuss the evaluation in terms of biometric performance and predictive power.

A. Training and Validation Results

Two important identifiers for the training accuracy are the training error rate and the out-of-bag error. The training error shows the number of samples which cannot be predicted correctly according to their ground truth labels. The out-of-bag error defines the mean prediction error averaged over each training sample, using only the trees that did not have the sample in their bootstrap.
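The out-of-bag error can be illustrated on a toy forest. The sketch below assumes binary labels, a majority vote with ties resolved towards class 0, and per-tree predictions that are already available; it does not reproduce the NFIQ 2 training framework, and all names are hypothetical.

```python
def out_of_bag_error(in_bag, predictions, labels):
    """Share of training samples misclassified by their out-of-bag trees.

    in_bag[t] is the set of sample indices in the bootstrap of tree t,
    predictions[t][i] is the class (0 or 1) that tree t predicts for sample i.
    Samples that were part of every bootstrap are skipped.
    """
    errors, counted = 0, 0
    for i, label in enumerate(labels):
        votes = [predictions[t][i] for t in range(len(in_bag))
                 if i not in in_bag[t]]
        if not votes:
            continue
        majority = 1 if 2 * sum(votes) > len(votes) else 0  # ties go to class 0
        errors += int(majority != label)
        counted += 1
    return errors / counted


in_bag = [{0, 1}, {1, 2}, {0, 3}]   # bootstrap membership of three trees
predictions = [[1, 0, 1, 1],
               [1, 0, 1, 1],
               [0, 0, 1, 0]]
labels = [1, 0, 1, 0]
print(out_of_bag_error(in_bag, predictions, labels))  # 0.25
```

Here only the last sample is misclassified by the trees that did not see it during training, which gives an out-of-bag error of 0.25.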

We aim to align our training and validation results to the results of the original NFIQ 2.2 training. For this reason, we designed our final training database to have zero training error and a low out-of-bag error of 0.0009. In comparison, NFIQ 2.2 has a training error of zero and an out-of-bag error of 0.24 [8].

The validation determines how the random forest generalizes to unseen data. Here, the validation error rate shows how many samples are mispredicted according to their labels. The validation error rate is 0 for both models. As discussed, we optimized the training database to achieve the same results as in the original NFIQ 2.2 model.

During the random forest training process, the importance of every individual feature is adjusted. This means that during the training process, it is evaluated which feature has a high share of the correct attribution of a quality score and which does not. The overview of the relative importance of all features included in the method indicates which features have the highest relevance for the quality assessment.

Table IV presents the 10 most important features of NFIQ 2.2 and MCLFIQ. We can observe that NFIQ 2.2 has a more uniformly distributed feature importance, whereas the MCLFIQ model relies mainly on a few features which have a high importance. In particular, the two features ROI Relative Orientation Map Coherence Sum and Orientation Certainty Level Mean combined share over 50% of the whole feature importance. Both features are mainly based on sharpness measures, which is considered the most crucial point for contactless fingerprint sample quality assessment.

In contrast, the 10 most important features of NFIQ 2.2 share only approx. 33% feature importance. The most important feature Frequency Domain Analysis Standard Deviation represents a one-dimensional signature of the ridge-valley structure. From Table IV we can also see that the NFIQ 2.2 model puts high importance on minutiae count and quality (e.g., FingerJet FX OSE COM Minutiae Count and FingerJet FX OSE OCL Minutiae Quality) as quality features. A possible cause for this is that NFIQ 2 was trained on a contact-based fingerprint database that contains a large portion of partial fingerprints, which include fewer minutiae.

From the obtained importance map, we can conclude that the most important features in MCLFIQ merely address the fidelity of a fingerprint sample, c.f. Section I, whereas the most important features of NFIQ 2.2 include both character and fidelity. This is plausible because the contactless capturing process poses significantly more challenges than the contact-based one, which directly addresses fidelity. In other words, it can be summarized that fidelity is a greater challenge for contactless fingerprints compared to character in most cases.

Furthermore, we compare the sizes of both models. MCLFIQ is only 295.6 KB, whereas the NFIQ 2.2 model has a size of 52.9 MB. This is mainly caused by the unbalanced feature importance. Many trees in the random forest are very shallow, which leads to a smaller model. It should be noted that also the time required to load the model is positively affected, which is especially beneficial for mobile and embedded devices.

B. Biometric Performance

First, we discuss the biometric performance of each database in combination with all recognition workflows. From Figure 9 and Table V we can observe two general trends: First, the contact-based databases in general have a lower EER compared to the contactless ones. Second, the considered recognition workflows perform differently on the tested databases. The IDKit algorithm works rather robustly on all the databases, whereas the open-source workflows fall behind. Here, FingerNet combined with SourceAFIS is still more accurate than NBIS.

Fig. 9. - DET curves obtained on the considered databases using all recognition workflows.

The DET plots in Figure 9 show the challenging characteristics of all the considered databases. Except for the ISPFD WO, LS sub-database and the FVC2006 DB2 which have a good performance, all databases show a fair performance. Especially, the databases that were captured in an indoor environment (ISPFD WI and both HDA databases) have a poor performance. It should also be noted that the COTS system achieves lower EERs compared to the open-source workflow, especially on challenging data. This challenging characteristic of the databases is highly suited for our experiments because the predictive power of the EDC method can be evaluated best on databases of heterogeneous quality. This means high-quality gains can be achieved by discarding samples of low quality.

C. Predictive Power

We evaluate the predictive power of each quality assessment algorithm in terms of EDC curves, as introduced in Section III. For the EDC computations, it is required to set an initial FNMR. Here, a good practice is to consider approximately the EER as initial FNMR. For this reason, we set the initial FNMR to 0.25% for all experiments. For better comparability, we also report the EDC PAUC, which refers to the area under the curve over the discard range [0, 0.2].

The EDC curves (c.f. Figure 10) indicate that all considered quality assessment algorithms yield reasonable results on some of the tested databases.

Fig. 10. - EDC curves obtained from the considered databases using the three quality assessment algorithms and the three recognition workflows. The EDC PAUC denotes the area which is considered during the EDC PAUC calculation. (OS: open-source recognition workflow, AIT: AIT sharpness metric).

However, the results indicate that the re-trained BRISQUE performs poorly on all contactless databases except HDA constrained. Also, the HDA unconstrained sub-database in combination with AIT sharpness and IDKit indicates poor performance. All other EDC curves decrease from the starting point, which indicates that the FNMR is lowered by discarding samples that were identified as low quality by the quality assessment method. Additionally, there is no large difference between the EDCs obtained by using the COTS algorithm and those obtained by both other methods. From this, we can summarize that the predictive power of the considered quality assessment algorithms is independent of the recognition workflow used.

In more detail, the EDC curves also show that MCLFIQ performs best if the average of every EDC PAUC is considered, c.f. Tables VI–VIII. Especially on the ISPFD database, NFIQ 2.2 has an inferior performance compared to both other methods. The AIT sharpness metric performs slightly better on the ISPFD NI sub-database, but worse on all other databases. In summary, MCLFIQ has the best overall performance on the ISPFD database.

Considering the HDA database, it is observable that the EDC curves are not decreasing as monotonically as the others. Also, on this database, the recognition workflow seems to have a major impact on the predictive power. These findings can be attributed to the small total number of samples in the database combined with its highly challenging characteristic. On the constrained subset, we can see that NFIQ 2.2 performs worst, whereas the AIT sharpness and MCLFIQ are very close. Most notably, the open-source workflow together with MCLFIQ shows a better predictive performance than all other combinations. On the unconstrained sub-database, the predictive power of every assessment algorithm is better in combination with the open-source workflow than with IDKit. Here, MCLFIQ and NFIQ 2.2 have a comparable EDC PAUC, whereas the AIT sharpness algorithm is worse.

The EDCs obtained on the AIT mobile database are very close together. It can be seen that the AIT sharpness metric performs slightly worse compared to MCLFIQ and NFIQ 2.2. Again, all quality assessment algorithms seem to have a better predictive power in combination with the open-source workflow.

For a comprehensive evaluation, we also conduct a counter-experiment by benchmarking the considered quality assessment algorithms on contact-based databases. Figure 11 presents the results in terms of EDC curves. For MCLFIQ and NFIQ 2.2, the obtained results are as expected: NFIQ 2.2 shows in general a lower EDC PAUC than MCLFIQ, c.f. Tables IX–XI. Notably, both the AIT sharpness metric and the BRISQUE algorithm show good results in some experiments. Most likely, this is caused by special database properties observed in the FVC2006 DB3 and should be further investigated.

Fig. 11. - EDC curves obtained from the considered databases using the three quality assessment algorithms and the three recognition workflows. The EDC PAUC denotes the area which is considered during the EDC PAUC calculation. It should be noted that for FVC2006 DB3 it is not possible to compute reasonable BRISQUE scores, which is why the curves are missing. (OS: open-source recognition workflow, AIT: AIT sharpness metric).

Figure 12 presents a visual representation of the average EDC PAUC, including the standard deviations obtained from the experiments. From the charts, we can see that the average EDC PAUC achieved by MCLFIQ is much lower, i.e., better, than that of NFIQ 2.2 and the AIT sharpness metric, whereas BRISQUE shows degraded performance. Also, the standard deviation of MCLFIQ is much lower than that of the AIT sharpness metric. NFIQ 2.2 combined with IDKit has a slightly lower standard deviation at a worse predictive power. The results quantitatively confirm that MCLFIQ works more accurately and more robustly compared to all baselines.

Fig. 12. - Overview of the average EDC PAUC incl. standard deviation obtained using the different quality assessment algorithms and recognition workflows.

SECTION VII.

Conclusion and Future Work

Contactless fingerprint recognition has gained a lot of attention in recent years. However, quality assessment of contactless fingerprints remains an insufficiently covered research topic.

In this work, we formulate the hypothesis that quality components for contact-based fingerprints, as defined in NFIQ 2, are also highly suitable for contactless fingerprints. Furthermore, the re-training of NFIQ 2 effectively optimizes the weights of the quality components for the distinct challenges of contactless fingerprints.

Our experimental results confirm this hypothesis: training a new random forest classifier based on NFIQ 2 is possible, and synthetic data is a viable alternative to real databases. Our training results in the MCLFIQ model, which significantly outperforms all other methods in terms of predictive performance. Moreover, the MCLFIQ model shows a significantly improved robustness across various databases and recognition workflows. Also, it is observable that sharpness is the most important quality measure for mobile contactless fingerprints. However, the number of suitable contactless fingerprint databases is limited, so that our method could only be tested on rather small databases.

Since the research area of contactless fingerprint recognition still lacks a standardized quality assessment algorithm, we suggest considering the proposed MCLFIQ method as a first baseline for research on a standardized quality assessment tool for contactless fingerprint samples. As stated, the MCLFIQ model is made publicly available to ensure the reproducibility of this work.

Further research should focus on the acquisition of a large contactless fingerprint database for training and testing quality assessment methods. It is assumed that our proposal works even better when the re-training is done on real data. Furthermore, new research directions for fingerprint quality assessment, like deep-learning-based methods, could be studied. Useful improvements have already been made for other biometric characteristics, such as face recognition, c.f. [40], which could also be applied to fingerprints. In addition, more advanced deep-learning-based techniques like attention-based NR-IQA methods [41] or perceptual image quality assessment algorithms [42] could be proper starting points for further research.

Appendix

See Table XII.
