
Open Access

Peer-reviewed

Research Article

A deep facial recognition system using computational intelligent algorithms

Diaa Salama AbdELminaam
Roles: Conceptualization, Data curation, Formal analysis, Methodology, Supervision, Writing – original draft, Writing – review & editing
* E-mail: [email protected]
Affiliations: Department of Information Systems, Faculty of Computers and Artificial Intelligence, Benha University, Benha City, Egypt; Department of Computer Science, Faculty of Computers and Informatics, Misr International University, Cairo, Egypt

Abdulrhman M. Almansori
Roles: Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Writing – original draft
Affiliation: Department of Information Systems, Faculty of Computers and Artificial Intelligence, Benha University, Benha City, Egypt

Mohamed Taha
Roles: Formal analysis, Investigation, Methodology, Software, Validation, Writing – review & editing
Affiliation: Department of Computer Science, Faculty of Computers and Artificial Intelligence, Benha University, Benha City, Egypt

Elsayed Badr
Roles: Conceptualization, Investigation, Project administration, Writing – original draft, Writing – review & editing
Affiliations: Department of Scientific Computing, Faculty of Computers and Artificial Intelligence, Benha University, Benha City, Egypt; Department of Computer Science, Higher Technological Institute, 10th of Ramadan City, Egypt


  • Published: December 3, 2020
  • https://doi.org/10.1371/journal.pone.0242269


The development of biometric applications, such as facial recognition (FR), has recently become important in smart cities. Many scientists and engineers around the world have focused on establishing increasingly robust and accurate algorithms and methods for these types of systems and their applications in everyday life. FR is a developing technology with multiple real-time applications. The goal of this paper is to develop a complete FR system using transfer learning in fog computing and cloud computing. The developed system uses deep convolutional neural networks (DCNN) because of their dominant representation ability, although conditions including occlusions, expressions, illuminations, and pose can affect deep FR performance. The DCNN is used to extract relevant facial features, which allow faces to be compared efficiently. The system can be trained to recognize a set of people and can learn online, integrating the new people it processes and improving its predictions on those it already knows. The proposed recognition method was tested with three different standard machine learning algorithms: decision tree (DT), K-nearest neighbors (KNN), and support vector machine (SVM). The proposed system was evaluated on three face image datasets (SDUMLA-HMT, 113, and CASIA) using the performance metrics of accuracy, precision, sensitivity, specificity, and time. The experimental results show that the proposed method achieves superiority over the other algorithms on all metrics. The suggested algorithm yields higher accuracy (99.06%), higher precision (99.12%), higher recall (99.07%), and higher specificity (99.10%) than the comparison algorithms.

Citation: Salama AbdELminaam D, Almansori AM, Taha M, Badr E (2020) A deep facial recognition system using computational intelligent algorithms. PLoS ONE 15(12): e0242269. https://doi.org/10.1371/journal.pone.0242269

Editor: Seyedali Mirjalili, Torrens University Australia, AUSTRALIA

Received: May 28, 2020; Accepted: October 25, 2020; Published: December 3, 2020

Copyright: © 2020 Salama AbdELminaam et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the manuscript.

Funding: This study was funded by a grant from DSA Lab, Faculty of Computers and Artificial Intelligence, Benha University to author DSA (28211231302952).

Competing interests: The authors have declared that no competing interests exist.

1. Introduction

The face is considered the most critical part of the human body. Research shows that even a face can speak, carrying different expressions for different emotions. It plays a crucial role in interacting with people in society. It conveys people's identity and thus can be used as a key for security solutions in many organizations. The facial recognition (FR) system is increasingly trending across the world as an extraordinarily safe and reliable security technology. It is gaining significant importance and attention from thousands of corporate and government organizations because of its high level of security and reliability [1–3].

Moreover, the FR system provides vast benefits compared to other biometric security solutions such as palmprints and fingerprints. The system captures the biometric measurements of a person from a specific distance without interacting with the person. In crime deterrent applications, this system can help many organizations identify a person who has any kind of criminal record or other legal issues. Thus, this technology is becoming essential for numerous residential buildings and corporate organizations. The technique is based on the ability to recognize a human face and then compare its different features with previously recorded faces; this capability also increases the importance of the system and enables it to be widely used across the world. The system is developed with user-friendly features and operations that rely on different nodal points of the face; there are approximately 80 to 90 unique nodal points on a face. From these nodal points, the FR system measures significant aspects including the distance between the eyes, the length of the jawline, the shape of the cheekbones, and the depth of the eyes. These measurements are encoded as a code called the faceprint, which represents the identity of the face in the computer database. With the introduction of the latest technology, systems that were based on 2D imaging have moved to 3D, which makes them more accurate and increases their reliability.

Biometrics is defined as the science and technology of measuring and statistically analyzing biological data. Biometrics are measurable behavioral and/or physiological characteristics that can be used to verify individual identity; for each individual, a unique biometric can be used for verification. Biometric systems are used in an increasing number of fields, such as prison security, secured access, and forensics. Biometric systems recognize individuals by utilizing different biological features such as the face, hand geometry, iris, retina, and fingerprints. The FR system is a more natural biometric information process with greater variation than any other method. Thus, FR has become a prominent topic in computer science related to biometrics and machine learning [4, 5]. Machine learning is a computer science field that gives computers the capability to learn without explicit programming. The main focus of machine learning is providing algorithms that are trained to perform a task; machine learning is related to the fields of computational statistics and mathematical optimization. Machine learning includes multiple methods such as reinforcement learning, supervised learning, semi-supervised learning, and unsupervised learning [6]. Machine learning can be applied to many tasks that people think only they can do, such as playing games, learning subjects, and recognition [6]. Most machine learning algorithms consume a massive amount of resources, so it is better to perform their tasks in a distributed environment such as cloud computing, fog computing, or edge computing.

Cloud computing is based on the sharing of many resources, including services, applications, storage, servers, and networks, to achieve economies of scale and consistency and thus maximize the efficiency of the shared resources. Fog computing provides many services at the network edge, such as data storage, computing, data provision, and application services for end users who can be added to the network edge [7]. These environments reduce the total amount of resource usage, speed up the completion time of tasks, and reduce costs via pay-per-use.

The main goal of this paper is to build a deep FR system using transfer learning in fog computing. The system is based on modern techniques of deep convolutional neural networks (DCNN) and machine learning. The proposed methods can capture the biometric measurements of a person from a specific distance, for crime deterrent purposes, without interacting with the person. Thus, the proposed methods can help many organizations identify a person with any kind of criminal record or other legal issues.

The remainder of the paper is organized as follows. Section 2 presents related work in FR techniques and applications. Section 3 presents the components of traditional FR: face processing, deep feature extraction and face matching by in-depth features, machine learning, K-nearest neighbors (KNN), support vector machines (SVM), DCNN, the computing framework, fog computing, and cloud computing. Section 4 explains the proposed FR system using transfer learning in fog computing. Section 5 presents the experimental results. Section 6 provides the conclusion with the outcomes of the proposed system.

2. Literature review

Due to the significant development of machine learning, computing environments, and recognition systems, many researchers have worked on pattern recognition and identification via different biometrics using various model-building and mining strategies. Some common recent works on FR systems are briefly surveyed here.

Singh, D et al. [8] proposed a COVID-19 disease classification model to classify infected patients from chest CT images. A convolutional neural network (CNN) is used to classify COVID-19-infected patients as infected (+ve) or not (−ve), and the initial parameters of the CNN are tuned using multi-objective differential evolution (MODE). The results show that the proposed CNN model outperforms competitive models, i.e., ANN, ANFIS, and CNN models, in terms of accuracy, F-measure, sensitivity, specificity, and Kappa statistics by 1.9789%, 2.0928%, 1.8262%, 1.6827%, and 1.9276%, respectively.

Schiller, D et al. [9] proposed a novel transfer learning approach for automatic emotion recognition (AER) across various modalities. The proposed model, used for facial expression recognition, utilizes saliency maps to transfer knowledge from an arbitrary source to a target network by largely "hiding" non-relevant information. The method is independent of the employed model, since the experience is transferred solely via augmentation of the input data. The evaluation showed that the new model adapted to the new domain faster when forced to focus on the parts of the input considered relevant.

Prakash, R et al. [10] proposed an automated face recognition method using a convolutional neural network (CNN) with a transfer learning approach, where the CNN is initialized with weights learned from the pre-trained VGG-16 model. The extracted features are fed to a fully connected layer with softmax activation for classification. Two publicly available face image databases, Yale and AT&T, were used to test the performance of the method. A face recognition accuracy of 100% was achieved on the AT&T face images and 96.5% on the Yale face images. The results show that face recognition using a CNN with transfer learning gives better classification accuracy than the PCA method.

Deng et al. [11] proposed an additive angular margin loss (ArcFace) for face recognition. ArcFace has a clear geometric interpretation because of its exact correspondence to geodesic distance on a hypersphere. They also presented an extensive experimental evaluation against state-of-the-art FR methods on ten FR datasets, showing that ArcFace consistently outperforms the state of the art and can be implemented easily with negligible computational overhead. The verification performance of open-sourced FR models on the LFW, CALFW, and CPLFW datasets reached 99.82%, 95.45%, and 92.08%, respectively [11].

Wang et al. [12] proposed a large margin cosine loss (LMCL), reformulating the SoftMax loss as a cosine loss by L2-normalizing both the features and the weight vectors to remove radial variations, and using a cosine margin term to maximize the decision margin in angular space. They achieved maximum interclass variance and minimum intraclass variance via cosine decision margin maximization and normalization. They referred to their model, trained with LMCL, as CosFace. They based their experiments on the Labeled Faces in the Wild (LFW), YouTube Faces (YTF), and MegaFace Challenge datasets, confirming the efficiency of their approach with 99.33%, 96.1%, 77.11%, and 89.88% accuracy on the LFW, YTF, MF1 Rank1, and MF1 Veri benchmarks, respectively [12].

Tran et al. [13] proposed a disentangled representation learning-generative adversarial network (DR-GAN) with three distinct developments. First, the encoder-decoder structure of the generator allows DR-GAN to learn a representation that is both discriminative and generative, including image synthesis. Second, the representation is disentangled from other face variations, for example through the pose code provided to the decoder and pose estimation in the discriminator. Third, DR-GAN can take one or multiple images as input and produce one unified representation along with an arbitrary number of synthesized images. They tested their network on the Multi-PIE database and compared their method with face recognition techniques on Multi-PIE, CFP, and IJB-A, reporting average face verification accuracy with its standard deviation. They achieved comparable performance on frontal-frontal verification and an approximately 1.4% improvement on frontal-profile verification [13].

Masi et al. [14] proposed to increase the training data size for face recognition systems through domain-specific data augmentation. They presented techniques to enrich realistic datasets with important facial variations by manipulating the faces in the datasets, while matching query images using standard convolutional neural networks. They tested their framework against the LFW and IJB-A benchmarks and Janus CS2 on a large number of downloaded images. Following the standard protocol for unrestricted, labeled outside data, they reported mean classification accuracy and 100% minus the equal error rate [14].

Ding and Tao [15] proposed a comprehensive framework based on convolutional neural networks (CNN) to overcome the difficulties faced in video-based face recognition (VFR). The CNN learns blur-robust features by training on data comprising artificially blurred images and still images. They proposed a trunk-branch ensemble CNN model (TBE-CNN) to make CNN features robust to pose variations and occlusions. TBE-CNN extracts information from the full face image and from patches cropped around facial components, sharing the low- and middle-level convolutional layers between the trunk and branch networks. They also proposed an improved triplet loss function to enhance the discriminative power of the representations learned by TBE-CNN. TBE-CNN was tested on three video face databases: YouTube Faces, COX Face, and PaSC [15].

Al-Waisy et al. [16] proposed a multimodal deep learning system based on local feature representation for face recognition. They combined the advantages of local handcrafted feature descriptors with a deep belief network (DBN) to address face recognition in unconstrained conditions. They proposed a multimodal local feature extraction approach based on combining the advantages of the fractal dimension with the curvelet transform, which they called the curvelet–fractal approach. The main motivation of this approach is that the curvelet transform can effectively represent the essential facial structure, while the fractal dimension represents the texture descriptors of face images. They further proposed a multimodal deep face recognition (MDFR) approach that enriches feature representation by training a DBN on local feature representations. They compared the outcomes of the proposed MDFR approach and the curvelet–fractal approach on four face datasets: the LFW, CAS-PEAL-R1, FERET, and SDUMLA-HMT databases. The results obtained from their proposed approaches outperformed other methodologies, including WPCA, DBN, and LBP, achieving new results on the four datasets [16].

Sivalingam et al. [17] proposed an efficient partial face detection method using an AlexNet CNN to detect emotions based on images of half-faces. They identified the key points and concentrated on textural features. They proposed an AlexNet CNN strategy to discriminatively match the two extracted local features, using both the textural and geometrical information of local features for matching. The similarity of two faces was computed from the distance between the aligned features. They tested their approach on four widely used face datasets and demonstrated the effectiveness and limitations of their proposed method [17].

Jonnathann et al. [18] presented a comparison between deep learning and conventional machine learning methods (for example, artificial neural networks, extreme learning machines, SVM, optimum-path forest, and KNN). For facial biometric recognition, they concentrated on CNNs. They used three datasets: AR Face, YALE, and SDUMLA-HMT [19]. Further research on FR can be found in [20–23].

3. Material and methods

  • Ethics Statement

All participants provided written informed consent and appropriate photographic release. The individuals shown in Fig 1 have given written informed consent (as outlined in the PLOS consent form) to publish their image.


https://doi.org/10.1371/journal.pone.0242269.g001

3.1 Traditional facial recognition components

The whole system comprises three modules, as shown in Fig 1 .

  • First, the face detector is applied to videos or images to detect faces.
  • Then, a facial landmark detector aligns each face so that it is normalized and can be recognized with the best match.
  • Finally, the aligned face images are fed into the FR module.

Before an image is input into the FR module, it is scanned using face anti-spoofing; recognition is then performed.

The FR module can be formulated as computing M[F(P(I_i)), F(P(I_j))], where:

  • M indicates the face matching algorithm, which is used to calculate the degree of similarity;
  • F refers to feature extraction, which encodes the identity information;
  • P is the face-processing stage, which handles occlusions, expressions, illuminations, and pose; and
  • I_i and I_j are the two face images being compared.

3.1.1 Face processing.

Deep learning approaches are commonly used because of their dominant representation ability; Ghazi and Ekenel [24] showed that conditions including occlusions, expressions, illuminations, and pose can affect deep FR performance. One of the main challenges in FR applications is handling such variation. In this paper, we summarize the deep face-processing methods for pose; similar techniques can address the other variations. Face-processing techniques are categorized as "one-to-many augmentation" and "many-to-one normalization" [24].

  • "One-to-many augmentation" : Create many images from a single image with the ability to change the situation, which helps increase the ability of deep networks to work and learn.
  • "Many-to-one normalization" : The canonical view of face images is recovered from nonfrontal-view images, after which FR is performed under controlled conditions.

3.1.2 Deep feature extraction: Network architecture.

The architectures can be categorized as backbone and assembled networks, as shown in Table 1, inspired by the success of ImageNet [25] and typical CNN architectures such as SENet, ResNet, GoogleNet, and VGGNet, which are also used as baseline models in FR as full or partial implementations [26–30].


https://doi.org/10.1371/journal.pone.0242269.t001

In addition to the mainstream architectures, FR still benefits from dedicated architecture design to improve efficiency. Additionally, with backbone networks as basic blocks, FR methods can be implemented as assembled networks, possibly with multiple tasks or multiple inputs, where each network corresponds to one type of input or one type of task. Higher performance is attained after the results of the assembled networks are combined [30].

Loss function. SoftMax loss is used as a supervision signal to encourage the separability of features. For FR, where intra-class variations may be larger than inter-class variations, SoftMax loss alone loses its effectiveness. Several families of loss functions address this:

  • Euclidean-distance-based loss: intra-class variance is compressed and inter-class variance is enlarged based on the Euclidean distance.
  • Angular/cosine-margin-based loss: discriminative learning of facial features is performed according to angular similarity, with prominent and potentially large angular/cosine separability between the learned features.
  • SoftMax loss and its variations: performance is enhanced by using SoftMax loss or a modification of it.

3.1.3 Face matching by deep features.

After training the deep networks on massive data with an appropriate loss function, a deep feature representation is obtained by passing each test image through the network. The L2 distance and cosine distance are most commonly used to compute feature similarity; for identification and verification tasks, nearest neighbor (NN) search and threshold comparison are used. Many other methods can process the deep features and compute facial matching with high accuracy, such as the sparse-representation-based classifier (SRC) and metric learning.
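As a concrete illustration, the following is a minimal MATLAB sketch of verification from two deep feature vectors; f1, f2, and the threshold value are hypothetical placeholders for network outputs and a value tuned on validation data.

    % Minimal sketch: decide whether two deep feature vectors (column
    % vectors f1, f2 from a deep layer) belong to the same person.
    f1 = f1 / norm(f1);            % L2-normalize both embeddings
    f2 = f2 / norm(f2);
    dL2  = norm(f1 - f2);          % Euclidean (L2) distance
    sCos = dot(f1, f2);            % cosine similarity after normalization
    tau  = 0.5;                    % illustrative threshold, tuned on validation data
    isSamePerson = (sCos >= tau);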

FR is a developed form of object classification; face-processing methods can also handle variations in pose, expression, and occlusion. There are many more complicated kinds of FR related to features present in the real world, such as cross-pose FR, cross-age FR, and video FR. Sometimes, more realistic datasets are constructed to simulate scenes from reality.

3.2 Machine learning

Machine learning is developed from computational learning theory and pattern recognition. A learning algorithm uses a set of samples called a training set as an input.

In general, there exist two main categories of learning: supervised and unsupervised. The objective of supervised learning is to learn to predict the proper output vector for any input vector. Classification tasks are those in which the target label is one of a finite number of discrete categories. Defining the unsupervised learning objective is more challenging; a primary objective is to find sensible clusters of similar samples within the input data, called clustering.

3.2.1 K-nearest neighbors.

KNN classifies a query sample by the majority label among its k nearest training samples, typically measured by the Euclidean distance in feature space. KNN must store the entire training set, and this storage requirement is one of the limitations that make KNN challenging to use with a large dataset.
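A minimal MATLAB sketch of fitting a KNN classifier to extracted features follows (Statistics and Machine Learning Toolbox); trainFeatures, trainLabels, and testFeatures are hypothetical variables, and k = 5 is an arbitrary illustrative choice.

    % KNN on deep feature vectors (one row per image).
    knnModel = fitcknn(trainFeatures, trainLabels, ...
        'NumNeighbors', 5, 'Distance', 'euclidean');
    predictedLabels = predict(knnModel, testFeatures);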

3.2.2 Support vector machine.

The soft-margin SVM finds a separating hyperplane by solving

minimize (1/2)||w||^2 + C Σ_{i=1}^{n} ξ_i, subject to y_i(w^T x_i + b) ≥ 1 − ξ_i and ξ_i ≥ 0.

Although we use the L1 norm for the penalty term Σ_{i=1}^{n} ξ_i, there exist other penalty terms such as the L2 norm, which should be chosen with respect to the needs of the application. Moreover, the parameter C is a hyperparameter that can be chosen via cross-validation or Bayesian optimization. An important property of SVMs is that the resulting classifier uses only a few training points, known as support vectors, to classify a new data point.

SVMs can perform nonlinear classification in addition to linear classification: the input variables are mapped to a high-dimensional feature space, in which a separating hyperplane corresponds to a nonlinear function of the original inputs. SVMs can also perform multiclass classification in addition to binary classification [34].

SVMs are among the best off-the-shelf supervised learning models that are capable of effectively working with high-dimensional datasets and are efficient regarding memory usage due to the employment of support vectors for prediction. SVMs are useful in several real-world systems including protein classification, image classification, and handwritten character recognition.
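A hedged MATLAB sketch of a multiclass SVM on deep features follows; it uses error-correcting output codes to reduce the multiclass problem to binary SVMs, and the variable names and the linear kernel are assumptions rather than the paper's exact configuration.

    % Multiclass SVM via ECOC; BoxConstraint plays the role of the
    % penalty parameter C discussed above.
    t = templateSVM('KernelFunction', 'linear', 'BoxConstraint', 1);
    svmModel = fitcecoc(trainFeatures, trainLabels, 'Learners', t);
    predictedLabels = predict(svmModel, testFeatures);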

3.3 Computing framework

The recognition system has different parts, and the computing framework is one of the essential parts for processing data. The best-known computing frameworks are cloud computing and fog computing. An FR application can choose a framework based on the processing location and the application's requirements. In some applications, data must be processed immediately after acquisition; in other applications, instant processing is not required. Fog computing is a network architecture that supports processing data instantly [35].

3.3.1 Fog computing.

Fog computing is engineered to relay and transmit information between the datacenter and the edge servers. The fog architecture uses edge servers to provide network, storage space, limited computing, logical-intelligence data filtering, and datacenter services close to the user. This structure is used in fields such as military and e-health applications [36, 37].

3.3.2 Cloud computing.

In cloud computing, data are sent to the datacenter for analysis and processing to make them accessible. A significant amount of time and effort is expended to transfer and process data in this type of architecture, so it alone is not sufficient for working with big data; big data processing increases the cloud server's CPU usage [38]. There are various types of cloud computing, such as Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS), and Mobile Backend as a Service (MBaaS) [39].

Big data applications such as FR require a method and design that distribute computing to process big data in a fast and repetitive way [40, 41]. Data are divided into packages, and each package is assigned to a different computer for processing. A move from the cloud to fog or distributed computing aims at 1) a reduction in network load, 2) an increase in data processing speed, 3) a decrease in CPU usage, 4) a decrease in energy consumption, and 5) higher data volume processing.

4. Proposed facial recognition system

4.1 Traditional deep convolutional neural networks


Krizhevsky et al. [28] developed AlexNet for the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC). The first layer of AlexNet filters the input image, which has a height (H), width (W), and depth (D) of 227×227×3; D = 3 accounts for the colors red, green, and blue. The first convolutional layer has 96 kernels (K) with an 11×11 filter (F) and a four-pixel stride (S). The stride is the distance between the receptive field centers of neighboring neurons in the kernel map. The formula ((W−F+2P)/S)+1 is employed to compute the output size of the convolutional layer, where P refers to the number of padded pixels, which can be as low as zero. The first convolutional layer's output volume size is therefore ((227−11+0)/4)+1 = 55. The input of the second convolutional layer has a size of 55×55×(number of filters), and the number of filters in this layer is 256. Since the work of these layers is distributed over 2 GPUs, the load of each layer is divided by 2. The convolutional layer is followed by a pooling layer, which reduces the dimensionality of each feature map while retaining the important features. The pooling method can be max, sum, average, etc.; AlexNet employs max pooling. A total of 256 filters are the input of this layer, each of size 5×5×256 with a stride of two pixels. With two GPUs, the work is divided into 55/2×55/2×256/2 ≈ 27×27×128 inputs per GPU. The normalized output of the second convolutional layer is connected to the third layer, which has 384 kernels of size 3×3. The fourth convolutional layer also has 384 kernels of size 3×3 divided over 2 GPUs, so the load of each GPU is 3×3×192. The fifth convolutional layer has 256 kernels of size 3×3 divided over 2 GPUs, so each GPU has a load of 3×3×128. The last three convolutional layers are created without pooling or normalization layers. Their output is delivered to two fully connected layers, each with 4096 neurons. Fig 2 illustrates the architecture used in AlexNet to classify different classes with ImageNet as the training dataset. DCNNs learn features hierarchically, and they increase image classification accuracy, especially on large datasets [42]. Since a DCNN requires a large number of images to attain high classification rates, an insufficient number of color images per subject creates an extra challenge for recognition systems [35, 36]. A DCNN consists of neural networks with convolutional layers that perform feature extraction and classification on images [37]. The difference between the test data and the original training data is minimized by using a training set with different sizes or scales but the same features; such features are extracted and classified well by a deep network [43]. Therefore, the DCNN is utilized here for the recognition and classification tasks. The AlexNet architecture is shown in Fig 2.
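The output-size formula can be checked numerically; the short MATLAB snippet below reproduces the conv1 calculation from the text.

    % Output size of a convolutional layer: ((W - F + 2P)/S) + 1
    W = 227; F = 11; P = 0; S = 4;            % conv1 settings from the text
    outSize = floor((W - F + 2*P)/S) + 1;     % = 55
    fprintf('conv1 output: %dx%dx%d\n', outSize, outSize, 96);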

thumbnail

https://doi.org/10.1371/journal.pone.0242269.g002

4.2 Fundamentals of transfer learning

The core idea of transfer learning (TL) appears in Fig 3. The approach utilizes a relatively complex and successful pre-trained model, trained on an enormous data source, e.g., ImageNet, a large visual database developed for visual object recognition research [41]. ImageNet contains over 14,000,000 manually annotated images, one million of which are furnished with bounding boxes, and covers more than 20,000 categories [11]. Ordinarily, pre-trained models are trained on a subset of ImageNet with 1,000 classes. The learned knowledge is then "transferred" to comparatively simplified tasks (e.g., classifying alcoholism versus non-alcoholism) that have only a limited quantity of private data. Two attributes are important to support the transfer [44]: (i) the success of the pre-trained model spares the user the exhausting hyperparameter tuning of new tasks; and (ii) the early layers of pre-trained models can serve as feature extractors that help extract low-level features, for example edges, tints, shades, and textures. Conventional TL retrains the new layers [13]: first the pre-trained model is loaded, and then the whole neural network is retrained. Critically, the global learning rate is kept small, the transferred layers are given a low learning-rate factor, and the newly added layers are given a high factor.
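In MATLAB's Deep Learning Toolbox, this learning-rate policy is expressed through per-layer rate factors; the sketch below shows the idiom under the assumption that numClasses holds the number of enrolled subjects.

    % New layers get a learning-rate factor several times that of the
    % transferred layers, which keep the small global rate.
    newFc = fullyConnectedLayer(numClasses, ...
        'WeightLearnRateFactor', 10, 'BiasLearnRateFactor', 10);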

thumbnail

https://doi.org/10.1371/journal.pone.0242269.g003

4.3 Adaptive deep convolutional neural networks (the proposed face recognition system)

The proposed system consists of three essential stages, including

  • preprocessing,
  • feature extraction
  • recognition, and identification.

In preprocessing, the system begins by capturing images, each of which must contain a human face as the subject of enrollment.

Each image is passed to the face detector module, which detects the human face and segments it as the region of interest (ROI). The obtained ROI then continues through the preprocessing steps and is resized to the predefined size for alignment purposes.

In feature extraction, the preprocessed ROI is processed to extract a feature vector using the modified version of AlexNet. The extracted vector represents the significant details of the associated image.

Finally, recognition and identification determine the enrolled subject in the system's database to whom a feature vector belongs. Each new feature vector represents either a new subject or an already registered subject: for the feature vector of an already registered subject, the system returns the associated ID; for the feature vector of a new subject, the system adds a new record to the connected database (a sketch of this decision is given below).
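A minimal MATLAB sketch of this identification decision, assuming a hypothetical gallery matrix galleryFeatures (one L2-normalized row per enrolled subject), a label vector galleryIDs, an L2-normalized query vector q, and an illustrative threshold:

    scores = galleryFeatures * q(:);            % cosine similarity to each subject
    [best, idx] = max(scores);
    tau = 0.5;                                  % illustrative acceptance threshold
    if best >= tau
        subjectID = galleryIDs(idx);            % already registered subject
    else                                        % new subject: enroll it
        galleryFeatures = [galleryFeatures; q(:)'];
        galleryIDs(end+1) = numel(galleryIDs) + 1;
        subjectID = galleryIDs(end);
    end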

Fig 4 illustrates the overall view of the proposed face recognition system.

thumbnail

https://doi.org/10.1371/journal.pone.0242269.g004

The system performs the following steps on the face images to obtain the distinctive features of each face:

All participants provided written informed consent and appropriate photographic release. The individuals shown in Fig 5 have given written informed consent (as outlined in the PLOS consent form) to publish their image.

thumbnail

https://doi.org/10.1371/journal.pone.0242269.g005

In the preprocessing step, as shown in Fig 5, the system begins by ensuring that the input image is an RGB image and aligning images to the same size. Then, the face detection step is performed. This step uses a well-known face detection mechanism, the Viola-Jones detection approach, whose popularity stems from its ability to work well in real time and to achieve high accuracy. To detect faces in a specific image, this face detector scans the input image with detection windows of different sizes.

In this phase, the decision of whether a window contains a face is made. Haar-like filters are used to derive simple local features from the face window candidates; with Haar-like filters, the feature values are obtained easily by computing the difference between the total light intensities of pixel regions. The region of interest is then segmented by cropping, and the face image is resized to 227×227, as shown in Fig 6 (a detection-and-resize sketch follows below).
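A sketch of this step with MATLAB's Computer Vision Toolbox cascade detector (a Viola-Jones implementation based on Haar-like features); the file name is hypothetical.

    detector = vision.CascadeObjectDetector();   % default frontal-face model
    img  = imread('subject01.bmp');              % hypothetical input image
    bbox = step(detector, img);                  % one [x y w h] row per detected face
    if ~isempty(bbox)
        face = imcrop(img, bbox(1, :));          % segment the region of interest
        face = imresize(face, [227 227]);        % AlexNet input size
    end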

thumbnail

https://doi.org/10.1371/journal.pone.0242269.g006

All participants provided written informed consent and appropriate photographic release. The individuals shown in Fig 6 have given written informed consent (as outlined in the PLOS consent form) to publish their image.

  • 2. Feature Extraction Using the Pre-trained AlexNet

The accessible dataset size is inadequate to train a new deep model from scratch, and in any case this is not feasible given the large number of training images required. To maintain objectivity in this test, we applied transfer learning to the pre-trained AlexNet architecture in three distinct ways. First, we needed to alter the structure. The last fully connected layer (FCL) was updated, since the original FCLs were created to perform 1,000-way classification. Twenty arbitrarily chosen classes illustrate this: the scale, barber chair, lorikeet, miniature poodle, Maltese dog, tabby cat, beer bottle, workstation, necktie, trombone, crash helmet, cucumber, letterbox, pomegranate, Appenzeller, muzzle, snow leopard, mountain bike, lock, and diamondback. We observed that none of them is related to face recognition; thus, we could not directly apply AlexNet as the feature extractor, and calibration was essential. Since the number of output neurons (1,000) in conventional AlexNet is not equal to the number of classes in our task, we needed to alter the corresponding softmax layer and classification layer, as indicated in Fig 7.

thumbnail

https://doi.org/10.1371/journal.pone.0242269.g007

In our transfer learning scheme, we utilized a new randomly initialized fully connected layer whose size equals the number of available subjects in the dataset(s) used, a softmax layer, and a new classification layer with the same number of candidates. Fig 8 shows various kinds of available activation functions; we used softmax, since the decision depends on the maximum score over multiple outputs. Next, we set the training options. Three properties were checked before training. First, the overall number of training epochs ought to be small for transfer learning; we initially set it to 6. Second, the global learning rate was set to a small value of 10−4 to slow learning down, since the early layers of this neural network were pre-trained. Third, the learning rate of the new layers was several times that of the transferred layers, since the transferred layers come with pre-trained weights while the new layers have randomly initialized weights. Finally, we varied the number of transferred layers and tried various settings. AlexNet comprises five convolutional layers (CL1, CL2, CL3, CL4, and CL5) and three fully connected layers (FCL6, FCL7, and FCL8).

thumbnail

https://doi.org/10.1371/journal.pone.0242269.g008

The pseudocode of the proposed algorithm is shown in Algorithm 1. It starts from the original AlexNet architecture and an image dataset of the subjects enrolled in the recognition system. For each image in the dataset, the subject's face is detected using Viola-Jones detection. The resulting face dataset is used for transfer learning, for which we adapt the architecture of AlexNet: the corresponding SoftMax layer and classification layer are replaced as indicated in Algorithm 1. Next, we train the altered architecture on the face dataset, and the trained model is used for feature extraction.

Algorithm 1: Transfer learning using the AlexNet model

Input  ← original AlexNet Net, ImageFaceSet imds

Output ← modified trained AlexNet FNet, features FSet

Begin
    // Preprocess the face image(s) in imds
    For i = 1 : length(imds)
        img  ← read(imds, i)
        face ← detectFace(img)            // Viola-Jones face detection
        img  ← resize(face, [227, 227])
        save(imds, i, img)
    End for
    // Adapt the AlexNet structure
    FLayers ← Net.Layers(1 : END−3)       // drop the last FC, softmax, and classification layers
    FLayers.append(new FullyConnected layer)
    FLayers.append(new SoftMax layer)
    FLayers.append(new Classification layer)
    // Train FNet using the options
    Options.set(SolverOptimizer ← stochastic gradient descent with momentum)
    Options.set(InitialLearnRate ← 1e-4)
    Options.set(LearnRateSchedule ← Piecewise)
    Options.set(MiniBatchSize ← 32)
    Options.set(MaxEpochs ← 6)
    FNet ← trainNetwork(FLayers, imds, Options)
    // Use FNet to extract features
    FSet ← empty
    For j = 1 : length(imds)
        img ← read(imds, j)
        F   ← extract(FNet, img, 'FC7')
        FSet ← FSet ∪ F
    End for
End
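Since the system was implemented in MATLAB R2018a, Algorithm 1 maps naturally onto the Deep Learning Toolbox. The following is a minimal sketch under the assumptions that the face images are already cropped to 227×227×3 and stored in one folder per subject; the folder name 'faces' is hypothetical.

    % Sketch of Algorithm 1: transfer learning on AlexNet.
    imds = imageDatastore('faces', 'IncludeSubfolders', true, ...
        'LabelSource', 'foldernames');
    net  = alexnet;                                    % pre-trained AlexNet
    layers = [net.Layers(1:end-3)                      % transferred layers
              fullyConnectedLayer(numel(categories(imds.Labels)))
              softmaxLayer
              classificationLayer];
    options = trainingOptions('sgdm', ...              % SGD with momentum
        'InitialLearnRate', 1e-4, ...
        'LearnRateSchedule', 'piecewise', ...
        'MiniBatchSize', 32, 'MaxEpochs', 6);
    fnet = trainNetwork(imds, layers, options);
    % Extract FC7 features for the downstream DT/KNN/SVM classifiers.
    fset = activations(fnet, imds, 'fc7', 'OutputAs', 'rows');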

  • 3. Face Recognition Phase Using Fog and Cloud Computing

Fig 9 shows the fog computing face recognition framework. Fog systems comprise client devices, fog nodes/servers, and a cloud computing environment. The general differences from the conventional cloud computing process are as follows:

  • A cloud computing center oversees and controls numerous fog nodes/servers.
  • Fog nodes/servers are situated at the edge of the system between the network center and the client; they have specific acquisition devices, can perform preprocessing and feature extraction tasks, and can communicate biometric data securely with the client devices and the cloud node.
  • User devices are heterogeneous and include advanced mobile phones, personal computers (PCs), hubs, and other networkable terminals.

thumbnail

https://doi.org/10.1371/journal.pone.0242269.g009

There are multiple reasons behind this communication design.

  • From the viewpoint of recognition efficiency, if all FR information is sent to a single node, the system communication cost will increase, since all information must be sent to and processed by the cloud server; the computational load on the cloud server will also increase.
  • From the viewpoint of recognition security, the cloud center, as the focal hub of the whole system, will become a target for attacks. If the focal hub is breached, information acquired from the fog nodes/servers becomes vulnerable.
  • Face recognition datasets are required for training if a neural network is utilized for recognition. Preparing datasets is normally time consuming and will greatly increase the training time if the training is carried out only by the fog nodes, risking the training quality.

Since the connection between a fog node and client devices is very inconsistent, we propose a general architecture for cloud-based face recognition frameworks. This design exploits the processing ability and storage capacity of both fog nodes/servers and cloud servers.

The design incorporates preprocessing, feature extraction, face recognition, and recognition-based security. The plan is partitioned into 6 layers according to the data flow of the fog architecture shown in Fig 10:

  • User equipment layer: The FC/MEC client devices are heterogeneous, including PCs and smart terminals. These devices may use various fog nodes/servers through various protocols.
  • Network layer: This layer connects services through various fog architecture protocols. It obtains information transmitted from the user equipment layer and compresses and transmits it.
  • Data processing layer: The essential task of this layer is to preprocess the image(s) sent from the client hardware, including information cleaning, filtering, and preprocessing. The task of this layer is performed on fog nodes.
  • Extraction layer: After the image(s) are preprocessed, the extraction layer utilizes the adapted AlexNet to extract the features.
  • Analysis layer: This layer communicates through the cloud. Its primary task is to cluster the extracted feature vectors found by the fog nodes/servers. It can match data among registered clients and produces responses to requests.
  • Management layer: The management in the cloud server is, for the most part, responsible for (1) the choices and responses of the face recognition framework and (2) the information and logs of the fog nodes/servers, which can be stored to facilitate recognition and authentication.

thumbnail

https://doi.org/10.1371/journal.pone.0242269.g010

All participants provided written informed consent and appropriate photographic release. The individuals shown in Figs 11 and 12 have given written informed consent (as outlined in the PLOS consent form) to publish their image.

thumbnail

https://doi.org/10.1371/journal.pone.0242269.g011

thumbnail

https://doi.org/10.1371/journal.pone.0242269.g012

As shown in Fig 11, the recognition classifier of the analysis layer is the most significant piece of the framework for data processing. It determines the subsequent cloud server response and thus guarantees the legitimacy of the framework. Relatedly, our work centres on recognition and authentication. Classifiers on fog nodes/servers can utilize their computation ability and storage capacity for recognition. However, much of the data cannot be processed or stored because of the limited computation and storage capacity of fog nodes/servers. Moreover, as mentioned, deploying classifiers only on fog nodes/servers cannot meet the needs of an individual system. The cloud server has a greater storage capacity than the fog nodes/servers; therefore, the cloud server can store many training sets and process them. It can send training sets to fog nodes/servers progressively for training, with the goal that different fog nodes/servers receive appropriate sets.

Fig 12 shows face images of SDUMLA-HMT subjects under different conditions as a dataset example.

5. Experimental results

In this section, we provide the results obtained in the experiments. Some of these results are presented as graphs showing the relation between performance and the parameters previously mentioned.

5.1 Runtime environment

The proposed recognition system was implemented and developed using MatlabR2018a on a PC with an Intel Core i7 CPU running at 2.2 GHz and Windows 10 Professional 64-bit edition. The proposed system is based on the dataset SDUMLA-HMT, which is available online for free.

5.2 Dataset(s)

SDUMLA-HMT is a publicly available database that has been used to evaluate the proposed system. The SDUMLA-HMT database was collected in 2010 by Shandong University, Jinan, China. It consists of five subdatabases—face, iris, finger vein, fingerprint, and gait—and contains 106 subjects (61 males and 45 females) with ages ranging between 17 and 31 years. In this work, we have used the face and iris databases only [ 19 ].

The face database was built using seven digital cameras. Each camera was used to capture the face of every subject with different poses (three images), different expressions (four images), different accessories (one image with a hat and one image with glasses), and under different illumination conditions (three images). The face database thus consists of 106×7×(3+4+2+3) = 8,904 images. All face images are 640×480 pixels and are stored in BMP format. Some face images of subject number 69 under different conditions are shown in Fig 12 [19].

5.3 Performance measure

Researchers have recently focused on enhancing face recognition systems with respect to accuracy metrics, regardless of the latest technologies and computing environments. Today, cloud computing and fog computing are available to enhance the performance of face recognition and decrease time complexity; the proposed framework handles and carefully considers these issues. The classifier performance evaluator carries out various performance measures, classifying each FR outcome as a true positive (TP), false negative (FN), false positive (FP), or true negative (TN). Precision is the most interesting and sensitive measure for a wide-range comparison of the essential individual classifiers and the proposed system.

The metrics are computed from these counts as follows (a MATLAB sketch follows the list below):

Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall (sensitivity) = TP / (TP + FN)
Specificity = TN / (TN + FP)

  • True Negative (TN): These are the negative tuples that were correctly labeled by the classifier.
  • True Positive (TP): These are the positive tuples that were correctly labeled by the classifier.
  • False Positive (FP): These are the negative tuples that were incorrectly labeled as positive.
  • False Negative (FN): These are the positive tuples that were mislabeled as negative.
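These counts translate directly into the metrics above; a short MATLAB sketch, assuming binary label vectors yTrue and yPred (for multiclass FR the same quantities are computed per class and averaged):

    C  = confusionmat(yTrue, yPred);       % rows: true class, columns: predicted
    TN = C(1,1); FP = C(1,2); FN = C(2,1); TP = C(2,2);   % for 0/1 labels
    accuracy    = (TP + TN) / (TP + TN + FP + FN);
    precision   = TP / (TP + FP);
    recall      = TP / (TP + FN);          % sensitivity
    specificity = TN / (TN + FP);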

5.4 Results & discussion

A set of experiments was performed to evaluate the proposed system in terms of the evaluation criteria. All experiments start by loading the color images from the data source and then passing them to the segmentation step. For the pre-trained AlexNet, the input image size cannot exceed 227×227, and the image depth limit is 3. Therefore, after segmentation, we performed a check step to guarantee the appropriateness of the image size: resizing to 227×227×3 (width, height, depth) is imperative if the image exceeds this limit. The main parameters and ratios are represented in Table 2.


https://doi.org/10.1371/journal.pone.0242269.t002

  • The experimental outcomes of the developed FR system and its comparison with various other techniques are presented in the following scenarios. The outcomes of the proposed algorithm outperformed most of its peers, especially in terms of precision.

5.4.1 Recognition time results

Fig 13 shows the comparison of the four algorithms: decision tree (DT), the KNN classifier, SVM, and the proposed DCNN powered by the pre-trained AlexNet. Two parameters are used for comparison: observations per second and recognition time in seconds per observation.


https://doi.org/10.1371/journal.pone.0242269.g013

  • The results show that the proposed DCNN has superiority over the other machine learning algorithms in terms of observations per second and recognition time.

5.4.2 Precision results.

Fig 14 shows the precision of the four algorithms using the three datasets SDUMLA-HMT, 113, and CASIA.


https://doi.org/10.1371/journal.pone.0242269.g014

  • The results show that the proposed DCNN has superiority over the other machine learning algorithms in terms of precision for the 2nd and 3rd datasets and, together with SVM, obtains the best results for the 1st dataset.

5.4.3 Recall results.

Fig 15 shows the recall of the four algorithms using the three datasets SDUMLA-HMT, 113, and CASIA.


https://doi.org/10.1371/journal.pone.0242269.g015

  • The results show that the proposed DCNN has superiority over the other machine learning algorithms in terms of recall.

5.4.4 Accuracy results

Fig 16 displays the accuracy of the four algorithms using the three datasets SDUMLA-HMT, 113, and CASIA.


https://doi.org/10.1371/journal.pone.0242269.g016

  • The results show that the proposed DCNN has superiority over the other machine learning algorithms in terms of accuracy.

5.4.5 Specificity results.

Fig 17 displays the specificity of our proposed system compared with the other algorithms using the three datasets SDUMLA-HMT, 113, and CASIA.


https://doi.org/10.1371/journal.pone.0242269.g017

Table 3 shows the average results for precision, recall, accuracy, and specificity of the four algorithms using the three datasets SDUMLA-HMT, 113, and CASIA.


https://doi.org/10.1371/journal.pone.0242269.t003

Fig 18 displays the data documented in Table 3, representing the average results for precision, recall, accuracy, and specificity of the four algorithms using the three datasets SDUMLA-HMT, 113, and CASIA.


https://doi.org/10.1371/journal.pone.0242269.g018

Table 4 shows the comparison of the three algorithms and the algorithms developed by Jonnathann et al. [18] using the same dataset. Table 4 compares the accuracy rates of the developed classifiers versus the same classifiers developed by Jonnathann et al. [18], without considering feature extraction methods.


https://doi.org/10.1371/journal.pone.0242269.t004

Fig 19 shows the data documented in Table 4. It is noticeable that the proposed approach achieves the highest accuracy using KNN, SVM, and DCNN.


https://doi.org/10.1371/journal.pone.0242269.g019

6. Conclusion

FR is a more natural biometric information process than other proposed systems, and it must address more variation than any other method. It is one of the most famous combinatorial optimization problems, and solving it in a reasonable time requires an efficient optimization method. FR may face many difficulties and challenges in the input image, such as different facial expressions, subjects wearing hats or glasses, and varying brightness levels. This study is based on an adaptive version of the most recent DCNN architecture, AlexNet. This paper proposed a deep FR learning method using TL in fog computing. The proposed DCNN algorithm is based on a set of steps that process the face images to obtain the distinctive features of each face. These steps are divided into preprocessing, face detection, and feature extraction. The proposed method improves the solution by adjusting the parameters to search for the final optimal solution. In this study, the proposed algorithm and other popular machine learning algorithms, including the DT, KNN, and SVM algorithms, were tested on three standard benchmark datasets to demonstrate the efficiency and effectiveness of the proposed DCNN in solving the FR problem. These datasets were characterized by various numbers of images, including males and females. The proposed algorithm and the other algorithms were tested on different images in the first dataset, and the results demonstrated the effectiveness of the DCNN algorithm in terms of achieving the optimal solution (i.e., the best accuracy) with reasonable accuracy, recall, precision, and specificity compared to the other algorithms. At the same time, the proposed DCNN achieved the best accuracy compared with Jonnathann et al. [18]: the accuracy of the proposed method reached 99.4%, compared with 97.26% for Jonnathann et al. [18]. The suggested algorithm yields higher accuracy (99.06%), higher precision (99.12%), higher recall (99.07%), and higher specificity (99.10%) than the comparison algorithms.

Based on the experimental results and performance analysis of various test images (i.e., 30 images), the results showed that the proposed algorithm could be used to effectively locate an optimal solution within a reasonable time compared with other popular algorithms. In the future, we plan to improve this algorithm in two ways. The first is by comparing the proposed algorithm with different recent metaheuristic algorithms and testing the methods with the remaining instances from each dataset. The second is by applying the proposed algorithm to real-life FR problems in a specific domain.

  • 7. Gamaleldin AM. An introduction to cloud computing concepts. Egypt: Software Engineering Competence Center; 2013.
  • 10. Prakash, R. Meena, N. Thenmoezhi, and M. Gayathri. "Face Recognition with Convolutional Neural Network and Transfer Learning." In 2019 International Conference on Smart Systems and Inventive Technology (ICSSIT), pp. 861–864. IEEE, 2019.
  • 11. Deng J, Guo J, Xue N, Zafeiriou S, ArcFace: Additive angular margin loss for deep face recognition. In: 2019 IEEE/CVF conference on computer vision and pattern recognition (CVPR). Long Beach, CA: IEEE; 2019. pp. 4685–4694.
  • 12. Wang H, Wang Y, Zhou Z, Ji X, Gong D, Zhou J, et al., CosFace: Large margin cosine loss for deep face recognition. In: 2018 IEEE/CVF conference on computer vision and pattern recognition. Salt Lake City, UT: IEEE; 2018. pp. 5265–5274.
  • 13. Tran L, Yin X, Liu X, Disentangled representation learning GAN for pose-invariant face recognition. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR). Honolulu, HI: IEEE; 2017. pp. 1415–1424.
  • 14. Masi I, Tran AT, Hassner T, Leksut JT, Medioni G. Do we really need to collect millions of faces for effective face recognition? In: Leibe B, Matas J, Sebe N, Welling M, editors. European conference on computer vision (ECCV). Cham, Switzerland: Springer; 2016. pp. 579–596.
  • 19. Yin Y, Liu L, Sun X, SDUMLA-HMT: A multimodal biometric database. In: Chinese conference on biometric recognition. Beijing, China: Springer; 2011. pp. 260–268.
  • 24. Ghazi MM, Ekenel HK, A comprehensive analysis of deep learning based representation for face recognition. In: 2016 IEEE conference on computer vision and pattern recognition workshops (CVPRW). Las Vegas, NV: IEEE; 2016. pp. 102–109.
  • 26. He K, Zhang X, Ren S, Sun J, Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR). Las Vegas, NV: IEEE; 2016. pp. 770–778.
  • 27. Hu J, Shen L, Sun G, Squeeze-and-excitation networks. In: 2018 IEEE/CVF conference on computer vision and pattern recognition. Salt Lake City, UT: IEEE; 2018. pp. 7132–7141.
  • 28. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. In: Pereira F, Burges CJC, Bottou L, Weinberger KQ, editors. Advances in neural information processing systems. Nevada, USA: Curran Associates Inc.; 2012. pp. 1097–1105.
  • 29. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:14091556. 2014.
  • 30. Szegedy C, Wei L, Yangqing J, Sermanet P, Reed S, Anguelov D, et al., Going deeper with convolutions. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR). Boston, MA: IEEE; 2015. pp. 1–9.
  • 32. Guyon I, Boser BE, Vapnik V. Automatic capacity tuning of very large VC-dimension classifiers. In: Hanson SJ, Cowan JD, Giles CL, editors. Advances in neural information processing systems. San Mateo, CA: Morgan Kaufmann Publishers Inc.; 1993. pp. 147–155.
  • 33. Schölkopf B, Smola AJ. Learning with kernels: Support vector machines, regularization, optimization, and beyond. Cambridge, MA: MIT Press; 2002.
  • 34. Cristianini N, Shawe-Taylor J. An introduction to support vector machines and other kernel-based learning methods. Cambridge, UK: Cambridge University Press; 2000.
  • 40. Nasr-Esfahani E, Samavi S, Karimi N, Soroushmehr SMR, Jafari MH, Ward K, et al., Melanoma detection by analysis of clinical images using convolutional neural network. In: 2016 38th annual international conference of the IEEE engineering in medicine and biology society (EMBC). Orlando, FL: IEEE; 2016. pp. 1373–1376.
  • 41. Pham TC, Luong CM, Visani M, Hoang VD. Deep CNN and data augmentation for skin lesion classification. In: Nguyen NT, Hoang DH, Hong TP, Pham H, Trawiński B, editors. Asian conference on intelligent information and database systems. Dong Hoi City, Vietnam: Springer; 2018. pp. 573–582.
  • 42. Deng J, Dong W, Socher R, Li L, Li K, Li FF, ImageNet: A large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition. Miami, FL: IEEE; 2009. pp. 248–255.
  • 44. AbdElminaam DS, Ibrahim SA. Building a robust heart diseases diagnosis intelligent model based on RST using LEM2 and MODLEM2. In: Proceedings of the 32nd International Business Information Management Association Conference (IBIMA 2018): Vision 2020. Seville, Spain; 2018. pp. 5733–5744.


Comput Intell Neurosci. 2022.

This article has been retracted.

Research on Face Recognition Algorithm Based on Image Processing

Zhenyun Ren, Wenxi Zheng

College of Information and Communication Engineering University, Harbin 150001, Heilongjiang, China

Associated Data

No data were used to support this study.

While network technology brings convenience to daily life, it also exposes endless problems, the most important of which is information security. To improve the security level of network information by identifying and detecting faces, the method used in this paper improves on the traditional AdaBoost method and the skin color method: skin color analysis first eliminates complex non-face backgrounds, and AdaBoost detection is then performed on the image, which reduces the probability of false detection. The experiments compare the results of the AdaBoost method, the skin color method, and the combined skin color + AdaBoost method. All operations in the KPCA (kernel principal component analysis) and KFDA (kernel Fisher discriminant analysis) algorithms are performed through an inner-product kernel function defined in the original space, so no explicit non-linear mapping function is involved. Combining kernel discriminant analysis with the null space method improves the ability of discriminant analysis to extract non-linear features. Through the secondary extraction of PCA features, a better recognition result than the plain PCA method is obtained. This paper also proposes a null-space-based Fisher discriminant analysis method; experiments show that it makes full use of the useful discriminant information in the null space of the intraclass scatter matrix, which improves the accuracy of face recognition to some extent. With a polynomial kernel function, KPCA has higher recognition ability at d = 0.8, while the recognition rates of KFDA and null-space-based KFDA peak at d = 2; for polynomial kernels, d = 2 is the general choice.

1. Introduction

In recent years, especially in areas plagued by terrorist attacks, information identification and detection have become critically important. Because such systems are so widely deployed, false detections and missed detections pose great threats to public safety, which makes reliable detection and recognition all the more important. After continuous exploration and practice, biometric-based face detection and recognition technology has attracted wide attention. Biological traits are not affected by external conditions: they are determined by an individual's genes, which are unique and cannot be forged, so such traits are well accepted. Whatever the biometric modality, it has its own uniqueness, formed innately and impossible to train or forge. Mature computer image processing technology can now detect and identify these biological traits.

Although humans can detect and identify a person from the face without difficulty, even under large changes in expression, age, or hairstyle, building a fully automated face recognition system is very difficult. It involves extensive knowledge of pattern recognition, image processing, computer vision, psychology, and related fields, and is closely related to identification methods based on other biometrics and to human-computer interaction. Against this background, computer vision technology entered people's field of vision as the world developed.

To improve the security level of network information, faces are identified and detected. This paper combines skin color with AdaBoost: as the earlier experiments and the analysis of skin color features show, the complex non-face background can be eliminated first, and AdaBoost detection is then performed on the remaining image regions. All operations in the KPCA and KFDA algorithms are performed through an inner-product kernel function defined in the original space, with no explicit non-linear mapping function involved; this is the core technique of kernel learning methods. The KPCA and KFDA algorithms can be described within the same framework: construct a corresponding linear feature space, project the image into this space, and use the resulting projection coefficients as the feature vector for identification.

The introduction explains the research background and significance of the article and the state of research at home and abroad. The methods section describes the concepts and algorithms of the face detection and recognition models. The experimental section describes the data sources and parameter settings, and the experimental data are analyzed in the results and discussion section.

2. Literature Review

Based on the uniqueness and advancement of face detection and recognition technology, many research teams at home and abroad have begun in-depth research. In [ 1 ], the authors propose a new face super-resolution (SR) method based on Tikhonov regularized neighborhood representation (TRNR), which overcomes the technical bottleneck of the patch representation scheme in traditional neighbor-embedding image SR methods. In [ 2 ], the authors evaluated automated face detection techniques for estimating site use by two camera-trapped chimpanzee communities, using traditional manual inspection of footage as a baseline to assess the performance and practical value of chimpanzee face detection software. In [ 3 ], the authors describe an automatic parking system with a camera mounted at the entrance/exit of the parking lot: the camera continuously captures frames, detects faces, and registers them in a database; when the driver leaves, the facial image is captured again at the exit and matched against the database to establish identity. In [ 4 ], the authors first used the recently introduced 300 VW benchmark to fully evaluate state-of-the-art deformable face tracking pipelines, then evaluated many different architectures focused on online deformable face tracking. In particular, they compared the following general strategies: (a) generic face detection plus generic facial landmark localization, (b) generic model-free tracking plus generic facial landmark localization, and (c) hybrid approaches combining state-of-the-art face detection, model-free tracking, and facial landmark localization. In [ 5 ], the authors propose a method for learning salient features that respond only in the face area; based on these features, they design a joint pipeline for detecting and recognizing faces as part of the human-robot interaction (HRI) system of SRU robots. Their experiments analyze the influence of the saliency term on face verification and the discriminative power of the salient features on LFW, and results on FDDB verify the effectiveness of the proposed method in face detection. In [ 6 ], the authors propose a new video steganography method based on Kanade-Lucas-Tomasi (KLT) tracking using Hamming codes (15, 11); experimental results show that the method achieves higher embedding capacity and better video quality, and, compared with prior methods, improves the security and robustness of the face detection method. In [ 7 ], the authors propose a face recognition system based on a low-power convolutional neural network (CNN) for user authentication in smart devices. The system consists of two chips: an always-on CMOS image sensor (CIS) for imaging and face detection (FD) and a low-power CNN processor (CNNP) for face verification (FV). The results show that the CIS integrated with the FD accelerator can realize event-driven chip-to-chip communication of the face image only when a face is present.

To study the characteristics and progress of computer vision technology in depth, many research teams at home and abroad have applied it in different fields. In [ 8 ], the authors apply computer vision to deep learning, briefly outlining some of the most important deep learning approaches used in computer vision problems: their history, structure, strengths, and limitations; their application to various computer vision tasks; and the future directions and challenges of designing deep learning solutions for computer vision. In [ 9 ], the authors apply computer vision to image classification, describing the steps involved in quantifying microscopic images and the different methods for each step. They used modern machine learning algorithms to classify, cluster, and visualize cells in high-content screening (HCS) experiments; beyond classification and clustering, machine learning algorithms that learn feature representations have recently advanced the state of the art in several benchmark tasks in the computer vision community. In [ 10 ], the authors apply computer vision to object recognition; research shows that X-ray testing is exploring new computer-vision-based methods to assist operators, and the article contributes to object recognition in X-ray testing by evaluating different computer vision strategies proposed in recent years, reporting results for each method on the same database. In [ 11 ], the authors apply computer vision to machine learning, particularly support vector machine training, using 26 of the most common tree species in Germany as test cases to classify specimen images, ideally at the species level. In [ 12 ], the authors apply computer vision to cell segmentation and feature extraction, outlining common computer vision and machine learning methods for generating and classifying phenotypic profiles; the need for effective computational strategies for analyzing large-scale image-based data is increasing, and computer vision methods have been developed to aid phenotypic classification and clustering of data acquired from biological images. In [ 13 ], the authors apply computer vision to visual inspection systems, introducing a vision system for automatic measurement and inspection of most types of threads and developing many image processing and computer vision algorithms to analyze the captured images. In [ 14 ], the authors apply computer vision to data annotation, describing the types of annotations computer vision researchers collect through crowdsourcing and how they ensure high data quality while minimizing annotation effort; finally, they summarize the future of crowdsourcing in computer vision.

In [ 15 ], related researchers proposed a powerful framework, named nuclear-norm-based adaptive occlusion dictionary learning (NNAODL), for face recognition under illumination changes and occlusions; experiments on multiple public datasets show that the NNAODL model achieves better results than classical methods in the presence of occlusion and illumination changes. In [ 16 ], related researchers propose a novel coupled similarity reference encoding model for age-invariant face recognition by combining non-negatively constrained reference encoding with a coupled similarity measure; experiments using deep features achieve high recognition rates, which shows that the model can be combined with deep networks for better results. In [ 17 ], the authors implement and compare classifiers and 2D subspace projection methods for face recognition; experimental results show that using these feature matrices with CMA, SVM, and CNN in classification problems is more beneficial than using raw pixel matrices in terms of processing time and memory requirements. In [ 18 ], the work consists of three parts: face representation, feature extraction, and classification. The face representation determines how the face is displayed and dictates the subsequent algorithms for detection and recognition; the authors evaluate face recognition that considers shape and texture data, representing images with local binary patterns for person-independent face recognition. In [ 19 ], the authors aim to construct facial patterns stored in a digital image database. The process runs from an input face image through edge detection and pattern construction until the similarity of face patterns can be determined, followed by recognition; a program was designed to test samples of face data stored in a digital image database, providing the similarity of observed face patterns and recognizing them using PCA.

3. Method

3.1. Face Detection and Recognition Model Based on DeepID

3.1.1. Network Structure

The network structure of the DeepID network is similar to that of a basic convolutional neural network. In the DeepID network, the main role of the convolutional network is to classify the trained faces. The convolutional network here consists of 4 convolutional layers and 3 pooling layers, and the features of the samples are represented by the last layer in the network. Taking a picture as the input of the DeepID network, the low-level features of the picture are extracted by the lower layers and computed layer by layer by convolution, so that the number of extracted features is gradually reduced, the globality of the representation is enhanced, and high-level features form in the top layers of the network. The DeepID network finally outputs a 160-dimensional high-level feature vector, which is highly compact, contains rich identity information, and can be used directly for recognition.

3.1.2. Calculation Process

As mentioned above, the network structure of DeepID includes 4 convolutional layers and 3 pooling layers: each of the first three convolutional layers is followed by a pooling layer, while the fourth convolutional layer connects directly to the fully connected layer, which forms the output layer and the features used for classification. The input picture is divided into patches by scale, channel, region, and so on, and the feature vector of each patch is trained relatively independently; finally, all vectors are concatenated to obtain the final feature vector.
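As a concrete illustration of this structure, here is a minimal PyTorch sketch of a DeepID-style network. The filter and channel counts (20/40/60/80) and the 39 × 31 patch size follow the original DeepID paper's published configuration, which this article summarizes but does not spell out, so treat them as assumptions.

```python
import torch
import torch.nn as nn

class DeepIDLike(nn.Module):
    """DeepID-style extractor: 4 conv layers, 3 max-pool layers, and a dense
    160-d identity ("DeepID") layer fed by both conv3 and conv4 outputs."""

    def __init__(self, num_classes):
        super().__init__()
        self.conv1 = nn.Sequential(nn.Conv2d(3, 20, 4), nn.ReLU(), nn.MaxPool2d(2))
        self.conv2 = nn.Sequential(nn.Conv2d(20, 40, 3), nn.ReLU(), nn.MaxPool2d(2))
        self.conv3 = nn.Sequential(nn.Conv2d(40, 60, 3), nn.ReLU(), nn.MaxPool2d(2))
        self.conv4 = nn.Sequential(nn.Conv2d(60, 80, 2), nn.ReLU())
        self.fc_deepid = nn.LazyLinear(160)         # infers input size on first call
        self.fc_out = nn.Linear(160, num_classes)   # softmax identity head

    def forward(self, x):
        h3 = self.conv3(self.conv2(self.conv1(x)))
        h4 = self.conv4(h3)
        feat = torch.relu(self.fc_deepid(
            torch.cat([h3.flatten(1), h4.flatten(1)], dim=1)))
        return self.fc_out(feat), feat  # logits for training, 160-d feature

model = DeepIDLike(num_classes=5436)              # one class per CelebFaces identity
logits, feat = model(torch.randn(1, 3, 39, 31))   # one 39x31 face patch
print(feat.shape)                                 # torch.Size([1, 160])
```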

3.1.3. Joint Bayesian Model

The joint Bayesian model is also widely used in the field of face recognition. Its idea comes from Bayesian face recognition and consists of decomposing a face into two parts: the variation between people and the variation of the individual itself, such as differences caused by external conditions like expression and angle. A face representation is thus modeled as

$$x = \mu + \varepsilon,$$

where \( \mu \sim N(0, S_\mu) \) represents the difference between people (external, inter-personal variation) and \( \varepsilon \sim N(0, S_\varepsilon) \) represents the difference of the individual itself due to other factors (internal, intra-personal variation); both parts are assumed to follow Gaussian distributions. Computing the covariance of a pair \((x_1, x_2)\) under the two hypotheses (same person, \(H_I\); different people, \(H_E\)) gives

$$\Sigma_I = \begin{bmatrix} S_\mu + S_\varepsilon & S_\mu \\ S_\mu & S_\mu + S_\varepsilon \end{bmatrix}, \qquad \Sigma_E = \begin{bmatrix} S_\mu + S_\varepsilon & 0 \\ 0 & S_\mu + S_\varepsilon \end{bmatrix}.$$

At this point, EM iteration over these covariances yields \(S_\mu\) and \(S_\varepsilon\), and the similarity between two faces is obtained as the log-likelihood ratio

$$r(x_1, x_2) = \log \frac{P(x_1, x_2 \mid H_I)}{P(x_1, x_2 \mid H_E)}.$$
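The pair-covariance construction behind these formulas can be made concrete in a few lines of NumPy/SciPy. This is a sketch of the standard joint Bayesian verification score, not code from this article, and the covariances S_mu and S_eps below are assumed to have already been estimated by EM.

```python
import numpy as np
from scipy.stats import multivariate_normal

def joint_bayesian_score(x1, x2, S_mu, S_eps):
    """Log-likelihood ratio: same-person vs. different-person hypotheses."""
    d = len(x1)
    pair = np.concatenate([x1, x2])
    diag = S_mu + S_eps
    # Covariance of the stacked pair [x1; x2] under each hypothesis.
    cov_same = np.block([[diag, S_mu], [S_mu, diag]])
    cov_diff = np.block([[diag, np.zeros((d, d))], [np.zeros((d, d)), diag]])
    zero = np.zeros(2 * d)
    return (multivariate_normal.logpdf(pair, mean=zero, cov=cov_same)
            - multivariate_normal.logpdf(pair, mean=zero, cov=cov_diff))

# Toy check with assumed EM estimates of the two covariances:
rng = np.random.default_rng(0)
S_mu, S_eps = 2.0 * np.eye(4), 0.5 * np.eye(4)
x = rng.normal(size=4)
print(joint_bayesian_score(x, x + 0.1, S_mu, S_eps))  # large for near-duplicates
```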

3.2. AdaBoost Face Recognition Algorithm Based on Skin Color Segmentation

3.2.1. Color Space

The RGB color space is a color space established by using three kinds of monochromatic light, red (700.0 nm), green (546.1 nm), and blue (435.8 nm), as a coordinate system. According to the trichromatic principle, any color light \(F\) in the RGB color space can be expressed as

$$F = r[R] + g[G] + b[B],$$

where \(r, g, b\) are the amounts of the three primaries. The RGB color space is based on a Cartesian coordinate system whose three axes correspond to the three primary colors. The origin corresponds to black, with all three components \(R\), \(G\), \(B\) equal to zero; the diagonally opposite vertex corresponds to white, with all three components at their maximum. Points on that diagonal have equal \(R\), \(G\), \(B\) components and correspond to gray pixels (the gray line). The remaining three vertices of the cube correspond to cyan at \(R = 0\), magenta at \(G = 0\), and yellow at \(B = 0\). In RGB space, if the values of two pixels \([R_1, G_1, B_1]\) and \([R_2, G_2, B_2]\) are proportional, i.e.,

$$\frac{R_1}{R_2} = \frac{G_1}{G_2} = \frac{B_1}{B_2},$$

then the two points have the same color but different brightness. Normalization removes the brightness component and yields the chromaticity space \([r, g, b]\):

$$r = \frac{R}{R+G+B}, \qquad g = \frac{G}{R+G+B}, \qquad b = \frac{B}{R+G+B}.$$

3.2.2. Skin Color Segmentation

According to the skin color samples in the YCrCb space, the mean vector \(m\) of the \((C_b, C_r)\) components and the covariance matrix \(C\) are calculated as

$$m = \frac{1}{N} \sum_{i=1}^{N} x_i, \qquad C = \frac{1}{N} \sum_{i=1}^{N} (x_i - m)(x_i - m)^T,$$

where \(N\) is the number of skin color pixels counted and \(x_i\) is the chrominance vector of the \(i\)-th pixel. The Gaussian skin color model is then defined by the elliptical Gaussian joint probability density function

$$P(x \mid \text{skin}) = \exp\left[-\frac{1}{2}(x - m)^T C^{-1} (x - m)\right],$$

where \(x\) is the chrominance vector and \(m\) and \(C\) are the mean vector and covariance matrix, respectively. \(P(x \mid \text{skin})\) measures the skin color similarity of each pixel and can be used to decide whether it is skin. Finally, an image of the skin color segmentation is obtained by setting a threshold based on the mean and variance of the similarity values.

In this paper, the threshold method is used because it is simple and fast to compute. It is a commonly used image binarization method whose essence is to use statistical information to determine a segmentation threshold; accurate segmentation of skin in the image is accomplished by a suitable threshold. Different human skin colors form clusters in the YCrCb space, which provides the basis for skin color segmentation. A skin color model is created in the YCrCb space, and segmentation uses only the two chrominance components \(C_b\) and \(C_r\). The YCrCb space is obtained from RGB space by a linear transformation, which is simple and fast; the luminance component \(Y\) is discarded, and with only two components the computation is also fast. Counting the range of skin pixels then separates skin from non-skin regions via fixed intervals on \(C_b\) and \(C_r\).
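A minimal OpenCV sketch of this segmentation step follows. The fixed Cr/Cb intervals used here are values commonly quoted in the skin-segmentation literature, not necessarily the thresholds this article derived, and the input path is a placeholder.

```python
import cv2
import numpy as np

def skin_mask(bgr_image):
    """Binary skin mask via fixed Cb/Cr thresholds in YCrCb space.

    Cr in [133, 173] and Cb in [77, 127] are commonly cited ranges;
    the article's own thresholds were given in a formula not reproduced here.
    """
    ycrcb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2YCrCb)  # linear map from RGB
    _, cr, cb = cv2.split(ycrcb)                          # Y is discarded
    mask = ((cr >= 133) & (cr <= 173) & (cb >= 77) & (cb <= 127)).astype(np.uint8)
    return mask * 255

# Typical use: restrict AdaBoost face detection to skin-colored regions.
img = cv2.imread("frame.jpg")                             # placeholder input
candidates = cv2.bitwise_and(img, img, mask=skin_mask(img))
```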

3.3. Face Recognition Algorithm Based on Linear Subspace

3.3.1. PCA Face Recognition Method

Assume there are a total of \(M\) images in the original image library as training samples; each normalized \(n \times n\) image is concatenated column by column into an \(n^2\)-dimensional column vector. The original face image vectors are denoted \(X_1, X_2, \ldots, X_M\), and the average of all face images is

$$\bar{X} = \frac{1}{M} \sum_{i=1}^{M} X_i.$$

The K-L transform is used to calculate the covariance matrix, also known as the overall scatter matrix:

$$C = \frac{1}{M} \sum_{i=1}^{M} (X_i - \bar{X})(X_i - \bar{X})^T = \frac{1}{M} A A^T, \qquad A = [X_1 - \bar{X}, \ldots, X_M - \bar{X}].$$

Computing the eigenvalues of the \(n^2 \times n^2\) matrix \(C\) and its orthonormal eigenvectors \(U\) directly is too expensive, so the singular value decomposition theorem (SVD theorem) is introduced to cope with the high dimensionality. The matrix \(R = A^T A\) (an \(M \times M\) matrix) is computed first, its orthonormal eigenvectors \(V\) are found, and \(U\) and \(V\) are related by

$$u_i = \frac{1}{\sqrt{\lambda_i}} A v_i,$$

where \(\lambda_i\) is the \(i\)-th eigenvalue of \(R\).
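The M × M trick above is easy to get wrong in code, so here is a minimal NumPy sketch of eigenface extraction under the stated definitions (X holds one flattened face per row; m is the number of retained components).

```python
import numpy as np

def eigenfaces(X, m):
    """PCA via the M x M trick. X is (M, n*n): one flattened face per row.

    Instead of eigendecomposing the huge n^2 x n^2 covariance C = A A^T / M,
    decompose R = A^T A (M x M) and map its eigenvectors back, as the SVD
    theorem in the text states: u_i = A v_i / ||A v_i||.
    """
    mean = X.mean(axis=0)
    A = (X - mean).T                   # columns are centered face vectors
    R = A.T @ A                        # small M x M matrix
    vals, V = np.linalg.eigh(R)        # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:m]   # top-m components
    U = A @ V[:, idx]
    U /= np.linalg.norm(U, axis=0)     # orthonormal eigenfaces (columns)
    return mean, U

# Projection used for recognition (nearest neighbour in coefficient space):
# coeffs = (faces - mean) @ U
```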

3.3.2. Fisher Discriminant Analysis Face Recognition Method

(1) LDA Algorithm . Assume there are a total of \(N\) images in the original image library as training samples; each normalized \(n \times n\) image is concatenated column by column into an \(n^2\)-dimensional column vector. The \(j\)-th original face image vector of the \(i\)-th person is denoted \(X_{ij}\), where \(N_i\) is the number of face images belonging to the \(i\)-th class and \(C\) is the number of sample classes. The average of each class of face images is

$$m_i = \frac{1}{N_i} \sum_{j=1}^{N_i} X_{ij}.$$

The Fisher criterion is defined as

$$J(W) = \frac{|W^T S_b W|}{|W^T S_w W|},$$

where \(S_b\) and \(S_w\) are the inter-class and intra-class scatter matrices. The optimal projection direction \(W\) is the value of \(W\) that maximizes this criterion, that is, \(W\) is the solution of

$$S_b W = \lambda S_w W,$$

i.e., the eigenvectors corresponding to the larger eigenvalues of the matrix \(S_w^{-1} S_b\). Note that this matrix has at most \(C - 1\) non-zero eigenvalues, where \(C\) is the number of categories.
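Equivalently, in code: a sketch using SciPy's generalized symmetric eigensolver. Note that it requires \(S_w\) to be nonsingular, which is exactly the small-sample limitation that motivates the null space method in the next subsection.

```python
import numpy as np
from scipy.linalg import eigh

def fisher_directions(Sb, Sw, c):
    """Solve Sb w = lambda Sw w and keep the top C-1 directions.

    eigh handles the generalized symmetric problem directly; this mirrors
    maximizing |W^T Sb W| / |W^T Sw W|, and fails if Sw is singular
    (the small-sample problem).
    """
    vals, vecs = eigh(Sb, Sw)           # ascending generalized eigenvalues
    return vecs[:, ::-1][:, : c - 1]    # at most C-1 nonzero eigenvalues
```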

3.3.3. Null Space Method

Direct linear discriminant analysis (D-LDA) first removes the null space of the inter-class scatter matrix \(S_b\) and then finds projection vectors that minimize the intra-class scatter. D-LDA appears to avoid losing the null space of \(S_w\); however, since the ranks satisfy \(\operatorname{rank}(S_b) \le C - 1 \le \operatorname{rank}(S_w) \le N - C\), removing the null space of \(S_b\) may discard part or all of the null space of \(S_w\) (possibly even making \(S_w\) full rank), so D-LDA indirectly loses the null space of \(S_w\). Null-space-based LDA instead first finds the null space of the intra-class scatter matrix \(S_w\) and then projects the original samples onto it so as to maximize the inter-class scatter \(S_b\). The optimal projection vector \(W\) should satisfy

$$W = \arg\max_{W^T S_w W = 0} \left| W^T S_b W \right|,$$

that is, the optimal discriminant vectors exist in the null space of \(S_w\).
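A minimal NumPy sketch of this procedure, with the numerical-zero tolerance as an assumed parameter:

```python
import numpy as np

def null_space_lda(Sb, Sw, m, tol=1e-10):
    """Null-space LDA: project onto null(Sw), then maximize Sb there.

    1. Eigendecompose Sw and keep eigenvectors with (numerically) zero
       eigenvalue -- a basis Q of the null space, where within-class
       scatter vanishes.
    2. Inside that space, take the leading eigenvectors of Q^T Sb Q.
    """
    vals, vecs = np.linalg.eigh(Sw)
    Q = vecs[:, vals < tol]                        # basis of null(Sw)
    vals_b, vecs_b = np.linalg.eigh(Q.T @ Sb @ Q)
    W = Q @ vecs_b[:, ::-1][:, :m]                 # back to input coordinates
    return W                                       # satisfies W^T Sw W = 0
```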

3.4. Face Recognition Method Based on Kernel Method

3.4.1. KPCA-Based Face Recognition Method

First, the \(N\) training samples \(X_1, X_2, X_3, \ldots, X_N\) in the original input space are non-linearly mapped into a high-dimensional space by the polynomial kernel function, giving the kernel matrix of the training set

$$K_{ij} = k(X_i, X_j) = (X_i \cdot X_j + 1)^d.$$

Then the normalized (centered) kernel matrix \(\bar{K}\) is computed. Finally, the eigenvalues and eigenvectors of \(\bar{K}\) are calculated, and the orthogonal eigenvectors corresponding to the largest \(m\) eigenvalues are \(u_1, u_2, u_3, \ldots, u_m\). A sample \(X\) is projected as

$$y_k = \sum_{i=1}^{N} u_{k,i}\, k(X_i, X), \qquad k = 1, \ldots, m.$$

This completes the KPCA feature extraction: the sample \(Y\) in the high-dimensional space after the non-linear projection of the face is obtained and fed into the nearest neighbor classifier for classification and recognition.
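In practice the whole KPCA-plus-nearest-neighbor pipeline can be assembled from scikit-learn components; a sketch follows. The component count is an arbitrary placeholder, and degree=2 matches the d = 2 setting the experiments later recommend for KFDA; whether the fractional d = 0.8 favored for plain KPCA is accepted by sklearn's polynomial kernel depends on the library version, so 2 is used here.

```python
from sklearn.decomposition import KernelPCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# KPCA with a polynomial kernel feeding a nearest-neighbour classifier,
# mirroring the KPCA recognition pipeline in the text.
kpca_nn = make_pipeline(
    KernelPCA(n_components=100, kernel="poly", degree=2),
    KNeighborsClassifier(n_neighbors=1),
)
# X_train: (N, n*n) flattened faces; y_train: identity labels (placeholders).
# kpca_nn.fit(X_train, y_train); y_pred = kpca_nn.predict(X_test)
```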

3.4.2. KFDA-Based Face Recognition Method

After KPCA feature extraction, \(Y\) is the non-linear mapping of the faces into samples in the high-dimensional space. A second feature extraction is performed on \(Y\) using the LDA algorithm: the optimal projection direction \(W\) is computed according to the Fisher criterion function above, and \(Y\) is projected onto the optimal LDA projection direction to obtain \(Z = W^T Y\).

This completes the KFDA feature extraction: features are extracted from the kernel matrix \(K\) by KPCA followed by LDA, and \(Z\) is fed into the nearest neighbor classifier for classification and identification.
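The two-stage KFDA described here maps naturally onto a scikit-learn pipeline (again with placeholder component counts):

```python
from sklearn.decomposition import KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# KFDA exactly as the text describes it: KPCA feature extraction first,
# then a second, discriminative extraction with LDA, and finally
# nearest-neighbour classification on the projected samples.
kfda = make_pipeline(
    KernelPCA(n_components=100, kernel="poly", degree=2),
    LinearDiscriminantAnalysis(),      # projects to at most C-1 dimensions
    KNeighborsClassifier(n_neighbors=1),
)
# kfda.fit(X_train, y_train); accuracy = kfda.score(X_test, y_test)
```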

3.4.3. Null-Space-Based KFDA Face Recognition Algorithm (KFDA-NULL)

In the high-dimensional feature space, let the inter-class and intra-class scatter matrices of the training samples be \(K_b\) and \(K_w\), respectively; the overall scatter matrix is

$$K_t = K_b + K_w.$$

The total scatter matrix \(K_t\) in the feature space is calculated from the input data and the kernel function; eigenvalue analysis is then carried out, and the transformation matrix \(P_k\) is built from the eigenvectors corresponding to all non-zero eigenvalues, giving the reduced intra-class and inter-class scatter matrices in the feature space

$$\tilde{K}_w = P_k^T K_w P_k, \qquad \tilde{K}_b = P_k^T K_b P_k.$$

Any vector \(k_j = [k_{1j}, k_{2j}, \ldots, k_{Nj}]\) of the input space is projected into the high-dimensional feature space in the same way.

For large sample problems (\(n < N\)), \(S_w\) is full rank and no null space can be extracted; that is, in the large-sample case, any null-space-based method fails. However, after kernel mapping, null-space-based LDA can work on the kernel sample set. Therefore, for large sample problems, the kernel mapping method is an extension of the null space method. Figure 1 is a flowchart of a face recognition system based on multi-feature fusion.

Figure 1. Flow chart of face recognition system based on multifeature fusion.

4. Experiment

4.1. Data Source

The experiments in this paper are mainly based on the public ORL face database and the Yale face database. The ORL face database is one of the most widely used face libraries. It consists of 40 people, each with 10 frontal face images of size 112 × 92, taken against a dark homogeneous background at different times and covering variations in expression and facial detail. Some images were taken at different times; the lighting conditions are almost unchanged, and most of the variation is in expression and pose, for example: laughing or not laughing, eyes open or closed, with or without glasses; pose changes with rotation of up to 20 degrees; and face size variation of up to 10%. The Yale face library consists of 15 people, each with 11 frontal face images of size 128 × 128, including different expressions, different lighting, eyes open or closed, and with and without glasses.

The training set used by DeepID is CelebFaces. During DeepID network training on this dataset, 80% of the data was used to train the neural network part of the DeepID network, while the subsequent Bayesian model was trained with the remaining 20%. CelebFaces is a large dataset with a total of 87,628 images of 5,436 celebrities. It is well suited for use as a training and test set for computer vision tasks, including face detection, facial feature extraction, and face recognition.

4.2. Experimental Parameter Setting

The AdaBoost training sample library created in this paper has 3,000 face samples of size 24 × 24, including nearly 700 multi-pose face samples with obvious deflection or tilt. Although more face samples would give better detection results, they would also increase the training burden, so 3,000 face samples were selected based on the previous detection system. The selection of 700 multi-pose faces gave good detection in experiments, though subsequent algorithm optimization still needs thorough study and comparison. Although skin color segmentation has already excluded a large number of non-face background areas, accurate face detection requires a large number of non-face samples for training, so the "bootstrap" method was used to obtain a large number of non-face samples from the 5,000 collected background images. The minimum detection rate \(d_{\min}\) of each strong classifier (cascade layer) is set to 0.999, the maximum false detection rate \(f_{\max}\) to 0.5, and the maximum number of training layers to 15.
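The article trains its own cascade, but the detection step itself looks the same as with OpenCV's stock Haar cascade, shown below as an illustration; the image path is a placeholder, and the pretrained model is OpenCV's rather than the article's.

```python
import cv2

# OpenCV ships a pretrained frontal-face Haar cascade with the same overall
# cascade structure as the 15-layer classifier trained in this paper.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("group_photo.jpg")                 # placeholder input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5,
                                 minSize=(24, 24))  # matches the training size
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
```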

5. Results and Discussions

5.1. Analysis of Single Face Recognition Results Based on Skin Color Segmentation

The test results of the single face test set are shown in Table 1:

Single face test set test results.

According to the experimental test results and the statistical analysis of the data in Table 1:

Combining the skin color feature with the AdaBoost algorithm eliminates the complex non-face background before gray-image detection, which effectively reduces the false detection rate. However, we also find that the detection rate does not improve after the combination and is even slightly reduced: inspection of the test results shows that some test images were severely affected by lighting and other factors, so faces that the AdaBoost method alone would have detected were missed.

As can be seen in Figure 2, introducing skin color features into the AdaBoost algorithm sharply limits the number of false detection windows. However, because skin color detection misses faces in weakly lit test images, it can exclude faces that the AdaBoost method would otherwise detect; the skin color + AdaBoost method therefore misses more faces than the AdaBoost method. In general, though, the introduction of skin color features greatly reduces the number of false detection windows, making the skin color + AdaBoost method perform better than the AdaBoost method on the ROC curve. On this basis, although the method of this paper also misses some weakly illuminated faces due to the introduction of skin color, the new sparse features used in place of Haar features improve multi-pose face detection over the AdaBoost method, so the overall detection efficiency of the algorithm is improved; for the same reason, the false detection windows of this method are slightly more numerous than those of the skin color + AdaBoost method.

Figure 2. Single face test set ROC curve.

5.2. Analysis of Multi-Face Recognition Results Based on Skin Color Segmentation

The test results of the multi-face test set are shown in Table 2:

Multi-face test set test results.

Observing Table 2 and the experimental data, we reach roughly the same conclusions as for Table 1; however, compared with Table 1, the false detection rate of all three methods increases and the detection rate decreases. This is because the backgrounds of multi-face images may be more complicated and the face poses more varied, even occluded and blurred, so detection is worse than for single faces. Overall, the skin color + AdaBoost method of this article is still better than the first two methods. Figure 3 shows the ROC curve for a self-built multi-face test set.

Figure 3. Self-built multi-face test set ROC curve.

From Figure 3, we can draw conclusions similar to those from Figure 2, but because the background areas in multi-face images are more complicated and the face poses more diverse, the test performance of all three methods drops overall. The method of this paper is still the best of the three: the skin color feature is very good at limiting the false detection rate, and the sparse feature is better for detecting multi-pose faces.

5.3. Analysis of Face Recognition Results Based on Linear Subspace

The ORL library consists of 40 people, each with 10 different face images. The first 5 images of each person are used as the training set and the last 5 as the test set. The PCA method is applied with different feature subspace dimensions and different numbers of training samples, and the resulting recognition rates are shown in Table 3.

Corresponding recognition rates for different feature subspace dimensions and numbers of training samples in PCA.

Here d is the dimension of the selected subspace and n is the number of training samples per person (n = 3 means 3 of a person's 10 images are used for training, n = 7 means 7 are used), with nearest neighbor classification. From the experimental data in the table, as the dimension of the subspace increases, the recognition rate also increases accordingly; when the subspace dimension is small, the increase in recognition rate is significant. At the same time, the number of training samples has a great influence on the relationship between subspace dimension and recognition rate: for the same subspace dimension, the more training samples, the higher the recognition rate. Thus, the more samples used for training, the more adequate the training and the better the recognition; of course, overfitting to the training set must be avoided.

When the number of training samples is fixed, the higher the subspace dimension, the higher the recognition rate. Observing a large amount of experimental data, the recognition rate peaks at about d = 71 when n = 7, at about d = 80 when n = 5, and at about d = 90 when n = 3; increasing the subspace dimension beyond these points does not raise the recognition rate further. Since the PCA method is based on gray-scale statistics, some feature vectors may add invalid information such as noise, causing the recognition rate to decrease. When the number of training samples per person is fixed at 5, the changes of threshold and recognition rate on the ORL face database as the subspace dimension increases are shown in Figure 4.

Figure 4. PCA recognition rate changes with threshold.

As the dimension of the feature subspace increases, the eigenvalue threshold increases and the recognition rate rises. At a threshold of 0.65 the corresponding dimension is 12 and the recognition rate levels off; at a threshold of 0.92 the dimension is 80 and the recognition rate is highest. In practical applications, PCA face recognition generally uses the subspace spanned by the eigenvectors whose eigenvalues account for 0.8 to 0.9 of the total eigenvalue energy.

5.4. Face Recognition Method Based on Kernel Method

There are 40 people in the ORL database; the first 5 images of each person are used for training and the last 5 for testing, with the nearest neighbor method as the classifier. Different kernel functions and corresponding parameters are selected; the recognition results are shown in Table 4, and Table 5 shows the recognition results of the different recognition methods on the three databases.

Identification results when selecting different kernel functions and corresponding parameters on the ORL face database.

Recognition results of different recognition methods in three databases.

With the polynomial kernel function, Figure 5 shows that KPCA has higher recognition ability at d = 0.8; thus, for KPCA, a polynomial kernel with a small exponent (between 0 and 1) achieves better recognition. For KFDA and null-space-based KFDA (KFDA + NULL), however, the recognition rate is highest at d = 2, where KPCA also recognizes well. As d goes from 0 to 2, the recognition rate of KPCA decreases. KFDA applies LDA for a second feature extraction on top of the KPCA features: when the KPCA recognition rate is already very high (equal or close to 100%), the secondary LDA extraction lowers the recognition rate; when the KPCA recognition rate is relatively low, that is, when KPCA cannot extract discriminative information well, the secondary LDA extraction recovers that information effectively, so the recognition rate increases. At d = 2, the recognition rates of KFDA and null-space-based KFDA are largest; for polynomial kernels, in general, d = 2.

Figure 5. Comparison of several face recognition methods on the ORL face database (using polynomial kernel function).

If the RBF kernel function is selected with σ² = 5 × 10⁶, the three kernel-based face recognition algorithms above (KPCA, KFDA, and null-space-based KFDA) all have high recognition ability. From the experimental data in Table 4, the kernel-based face recognition algorithms achieve good recognition performance when the RBF kernel function is selected with parameter σ² = 5 × 10⁶. Figure 6 compares four face recognition methods based on ICA.

Figure 6. Comparison of four face recognition methods based on ICA.

In summary: skin color detection can exclude weakly lit faces that the AdaBoost method would detect, so the skin color + AdaBoost method misses more faces than the AdaBoost method alone; nevertheless, the introduction of skin color features greatly reduces the number of false detection windows, making the ROC performance of the skin color + AdaBoost method better than that of the AdaBoost method. As the feature subspace dimension increases, the eigenvalue threshold and the recognition rate rise; at a threshold of 0.65 (dimension 12) the recognition rate levels off, and at a threshold of 0.92 (dimension 80) it is highest, so practical PCA face recognition usually uses the subspace of eigenvectors accounting for 0.8 to 0.9 of the total eigenvalues. The three kernel-based face recognition algorithms, KPCA, KFDA, and null-space-based KFDA, all have high recognition ability, and with the RBF kernel and σ² = 5 × 10⁶ they achieve good recognition performance.

6. Conclusions

In order to identify and detect human faces through computer vision technology, this paper studies the algorithm and draws the following conclusions:

  • In general, the method used in this paper improves on the traditional AdaBoost method and the skin color + AdaBoost method. As the earlier experiments and analysis show, skin color features can better eliminate complex non-face backgrounds, and performing AdaBoost detection on the filtered regions rather than directly on grayscale images reduces the probability of false detection. In addition, new sparse features replace the Haar features of the traditional AdaBoost algorithm, so the system copes better than the traditional AdaBoost method with multi-pose faces such as deflected or tilted ones, effectively reducing missed detections and improving the detection rate; the skin color features and the sparse features thus improve the performance of the system at the same time.
  • Because the self-built face training samples include faces collected by the laboratory itself, a high detection rate is achieved in the tests. The experiments compare the results of the AdaBoost method, the skin color method, and the skin color + AdaBoost method; the combined method is better than the AdaBoost method in both detection rate and false detection rate. At the same time, the addition of skin color features may exclude faces that AdaBoost would detect under poor illumination, while the AdaBoost method with the new sparse features, which detects faces in more pose modes, is more likely to produce false detections on complex backgrounds; because skin color features are used, however, these false detections are limited to skin-colored background areas (such as hands).
  • All operations in the KPCA and KFDA algorithms are performed through the inner-product kernel function defined in the original space, with no explicit non-linear mapping function involved; this is the core technique of kernel learning methods. Null-space-based KFDA overcomes the effects of illumination and is robust to expression and pose changes. The null space method overcomes the small-sample problem in discriminant analysis by finding the best discriminant information in the null space of the intra-class scatter matrix. Combining the null space method with kernel discriminant analysis not only improves the ability of discriminant analysis to extract non-linear features but also overcomes the small-sample problem in discriminant analysis.
  • Through the secondary extraction of PCA features, a better recognition result than the plain PCA method is obtained. The two classical algorithms can be described within the same framework: first construct the corresponding linear feature space, then project the image into that space, and use the obtained projection coefficients as the feature vector for identification; the only difference between the two methods is the choice of feature space. Aiming at the small-sample problem of the two linear subspace methods, PCA and LDA, this paper also proposes a null-space-based Fisher discriminant analysis method. Experiments show that the null-space-based method makes full use of the useful discriminant information in the null space of the intra-class scatter matrix, which improves the accuracy of face recognition to some extent.

Acknowledgments

This paper was funded by the National Natural Science Foundation of China under Grant no. 51679057.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

Original Research Article

Research on Face Recognition and Privacy in China—Based on Social Cognition and Cultural Psychology

Tao Liu*

  • Department of Sociology, Hangzhou Dianzi University, Hangzhou, China

With the development of big data technology, the privacy concerns raised by face recognition have become a critical social issue in the era of information sharing. Based on perceived ease of use, perceived usefulness, social cognition, and cross-cultural aspects, this study analyses the privacy of face recognition and its influencing factors. The study collected 518 questionnaires through the Internet; SPSS 25.0 was used to analyze the questionnaire data and evaluate its reliability, with Cronbach's alpha (α coefficient) used to measure the data. Our findings demonstrate that when users perceive the risk of their private information being disclosed through face recognition, they have greater privacy concerns. However, most users will still choose to provide personal information in exchange for the services and applications they need. Trust in technology and platforms can reduce users' guardedness toward them: users who believe that face recognition platforms can create secure conditions for the use of face recognition technology exhibit a higher tendency to use it. Although perceived ease of use has no significant positive impact on the actual use of face recognition, owing to other external factors such as accuracy and technological maturity, perceived usefulness still has a significant positive impact on actual use. These results enrich the literature on face recognition application behavior and can help individuals make better use of face recognition, so that it facilitates daily life without disclosing private personal information.

Introduction

Face recognition is a biometric recognition technology that uses pattern matching to recognize individual identities based on facial feature data. Compared to traditional non-biological recognition and physiological feature recognition technologies, face recognition has specific technical advantages ( Jiang, 2019 ). Nowadays, relying on ubiquitous mobile camera devices, face recognition technology is widely used in various fields, including face attendance, face payment, smart campuses, access control systems, and security systems, and it has dramatically improved the intelligence level of business systems in these fields. The human face is rich in features; in the acquaintance societies of the past, the face was the foundation of emotional communication and social relations with others.

Technology has been one of the most important factors that changed the way of life and commercial activities of human society. With continuous innovation and the development of technology, human society is changing rapidly. Technological innovation has changed people’s lifestyles in the spheres of shopping, education, medical services, business organizations, and so on. “Technology is not only an essential tool for finding out new ways to join different actors in service innovation processes, but also as an element able to foster the emergence of new and ongoing innovations” ( Ciasullo et al., 2017 ). For example, in the healthcare service ecosystem, health care providers adapt to the innovative medical service ecosystem so that patients can obtain better medical services. Medical service innovation has had a great impact on the continuous reconstruction of the service ecosystem ( Ciasullo et al., 2017 ). Technology forces the market to change constantly, and the changing market leads business organizations to innovate. “The contemporary world is characterized by a fast changing environment. Business organizations are faced with the challenge of keeping pace with developments in the field of technology, markets, cultural and socio-economic structures” ( Kaur et al., 2019 ). In the era of big data and information, business organizations must “to explore how cognitive computing technology can act as potential enabler of knowledge integration-based collaborations with global strategic partnerships as a special case” ( Kaur et al., 2019 ).

At present, innovations in network technology provide the greatest convenience and advantages for organizations dealing with such networks. “Small and medium-sized enterprises (SMEs) have been considered the most innovative oriented businesses in developed countries even in emerging markets acting as pioneer in the digital transformational word.” Meanwhile, it is important for technology upgrading, knowledge spillover, and technology transfer to explore SMEs’ competitiveness ( Del Giudice et al., 2019 ).

Knowledge and technology transfer is a “pathway” for accelerating economic system growth and advancement. Technology transfer can be explored from theory to practice for knowledge and technology. From the users’ perspective, technology transfer affects their sense of use and experience ( Elias et al., 2017 ). Big data analytics capabilities (BDAC) represent critical tools for business competitiveness in highly dynamic markets. BDAC has both direct and indirect positive effects on business model innovation (BMI), and they influence strategic company logics and objectives ( Ciampi et al., 2021 ). “In the world of Big Data, innovation, technology transfer, collaborative approaches, and the contribution of human resources have a direct impact on a company’s economic performance.” Therefore, big data companies should make corresponding changes in management and strategy. Moreover, skilled human resources have a positive contribution to the company’s economic performance. “Information and knowledge are the foundation on which act for aligning company’s strategies to market expectations and needs” ( Caputo et al., 2020 ).

With the arrival of the era of artificial intelligence, intelligent social life has become a reality, and artificial intelligence has become a new engine for China's economic and social development. According to the latest data released by the China Internet Network Information Center, the number of artificial intelligence enterprises in China ranks second in the world ( CNNIC, 2020 ). As a new technology and a typical application of artificial intelligence, face recognition has risen with the construction of smart cities. According to the statistics presented in the Report on In-depth Market Research and Future Development Trend of China's Face Recognition Industry (2018-2024) released by the Intelligence Research Group, the face recognition industry in China is estimated to reach 5.316 billion Yuan by 2021 ( Biometrics Identity Standardization [BIS], 2020 ). As the gateway connecting humans and intelligence, face recognition has excellent development potential.

Given that the modern era emphasizes looks, the face remains socially functional, but technology has given it new meaning and a mission. The attributes and features of a facial image are enough to convey a person’s identity. When our face is tied to our personal information and even used as a password substitute, it is no longer the traditional concept of face. Face recognition technology can extract personally identifiable information, such as age, gender, and race, from images. To some extent, in the Internet age, almost everyone’s personal information is displayed without any protection.

With the technical support of big data, user portraits based on facial recognition and a variety of personal data have increasingly become identification for individuals in this day and age ( Guo, 2020 ). From face-swapping apps, access by face recognition to Hangzhou Safari Park, the application of face recognition in subway security checks, to the formulation of the Personal Information Protection Law of the People’s Republic of China (PRC), a series of public opinions have brought face recognition to the forefront. On the other hand, Internet privacy, which has been neglected so far, is increasingly taken seriously by the public.

The issues of face recognition and privacy have been studied extensively by experts and scholars in their respective fields, but there are few empirical studies on the combination of the use of face recognition and personal privacy security. At present, most scholars’ research on face recognition focuses on face recognition algorithms, recognition systems, legal supervision and security, users’ willingness to accept face payments, and the application of face recognition in the library. No quantitative research has been conducted on the relationship between the use of face recognition technology and people’s attitudes toward privacy issues. Therefore, based on the two main determinants of the technology acceptance model (TAM) and according to public attitudes toward privacy and the specific context of the use of face recognition in the current networked environment, variables such as privacy concerns, risk perception, and trust are introduced in this study to build the hypothesis model of the actual use of face recognition. The concept of privacy concerns is applied to the research on personal information security behavior of facial recognition users, which further expands the practical scope of the privacy theory and provides suggestions to promote the development of facial recognition applications.

This research makes two contributions. First, it demonstrates the impact of privacy concerns, perceived risk, trust, social cognition, and cross-cultural aspects on facial recognition. This result enriches face recognition literature, and a hypothesis model based on perceived ease of use and perceived usefulness—the two determinants of user behavior—is created. Second, this research confirms that the privacy paradox still exists. In the digital information age, most users will still choose to provide personal information in exchange for the services and applications they need. Trust, social cognition, and culture play a vital role in intelligent societies and virtual interactions. Meanwhile, when technology applications can provide users with diversified and user-friendly functions, their perceived usefulness is significantly improved.

The structure of the article is as follows. In section “Theoretical Basis and Research Hypothesis,” we examine the theoretical basis and research hypothesis. Section “Variable Measurement and Data Collection” describes variable measurement and data collection, including questionnaire design and data collection. Section “Data Analysis” presents the results of the data analysis. Section “Conclusion” discusses the key findings of the research along with the final remarks.

Theoretical Basis and Research Hypothesis

In the era of mobile data services based on big data, “the nature of economic exchange is more inclined to exchange personal information for personalized services. Privacy violations may occur in the acquisition, storage, use and transaction of personal information, thus giving rise to problems in information privacy” ( Chen and Cliquet, 2020 ). Moreover, in the Internet environment, information privacy security in the intelligent society is increasingly threatened. Since facial recognition rests on the acquisition of facial image information, and face information is inherently private, information security becomes the public’s focus when choosing whether to use facial recognition technology. On the one hand, human faces are rich in features that provide powerful biometrics for identifying individuals; a third party can thus identify individuals through face positioning, so the malicious collection and abuse of such information must be prevented. On the other hand, through image storage and feature extraction, a variety of demographic and private information can be obtained, such as age, health status, and even family relationships, which leads to unnecessary privacy invasion ( Zahid and Ajita, 2017 ). Therefore, in view of the uniqueness of the human face and of information privacy, the focus of this paper is whether the public’s actual use of face recognition is affected by their attitudes toward personal privacy and the perceived risk to personal data.

Privacy Concerns

Privacy concerns are widely used to explain users’ behavioral intentions ( Zhang and Li, 2018 ). In the Internet field, users’ privacy concerns include people’s perceptions of and worries about improper access and the illegal acquisition, analysis, monitoring, transmission, storage, and use of private information ( Wang et al., 1998 ). Users do not have full control over the use of their personal information. Thus, users become concerned about privacy when it may be violated due to security loopholes or inappropriate use, or when they perceive a risk of privacy infringement.

Personal privacy in the age of mobile data services involves both online and offline domains. The extensive use of applications based on personal biological information poses new challenges to personal privacy security. Specifically, with progress in computer algorithms, the Internet of Things, and other technologies, the threshold for information collection keeps falling, and computerized information can easily be copied and shared, resulting in problems such as secondary data mining and inadequate privacy ( Qi and Li, 2018 ). In the existing research on privacy concerns, Cha found a negative correlation between users’ concerns regarding the information privacy of a technology-driven platform and the frequency with which users use the medium ( Cha, 2010 ). McKnight et al. studied Facebook and found that the greater the privacy concern about a medium, the less willing people are to continue using it for fear of their personal information being abused ( McKnight et al., 2011 ). In the context of big data, the privacy concerns of face recognition users originate from the risk of facial image information being collected and used without personal knowledge or consent, or of personal biometrics being transmitted or leaked. In other words, the cautious choice of a face recognition application is influenced by the extent of individuals’ privacy concerns. Considering these notions, the following hypothesis is proposed:

Hypothesis 1: Privacy concerns have a negative impact on the actual use of face recognition.

Perceived Risk

Perceived risk is an individual’s perception of the risk of information breach arising from the virtual and uncertain nature of a network. The perceived risk of facial recognition may arise from the disclosure or improper use of face information. In an empirical study, Chen found that the degree of individuals’ concern for information security is affected by perceived network risk ( Chen, 2013 ). Norberg et al. (2007) showed that the negative effect of perceived disclosure is affected by perceived risk. In other words, the more users perceive that disclosing personal information will lead to illegal breaches of privacy and other adverse effects, the more concerned they are about the security of their personal privacy. Not only is the degree of privacy concern positively affected by perceived risk, but studies have also shown that perceived risk affects actual use behavior ( Zhang and Li, 2018 ). Hichang’s (2010) results show that the severity of privacy risks perceived by users is positively correlated with the degree of their self-protection behaviors: when people realize that their personal information is at risk, they take active preventive actions. Therefore, regarding the intention to use facial recognition, this paper holds that the higher the risk users perceive, the more they will attend to breaches of personal privacy, thus affecting the actual use of facial recognition. In this vein, the following hypotheses are proposed:

Hypothesis 2: Perceived risk has a positive effect on privacy concerns.

Hypothesis 3: Perceived risk has a negative influence on the actual use of face recognition.

Trust Theory

Simmel (2002) pioneered the sociological study of trust, holding that trust is an essential, comprehensive social force. Putnam (2001) regarded trust as essential social capital that can improve social efficiency through actions that promote coordination and communication. In an intelligent social environment, social transactions cannot occur without trust; hence, trust has also become an essential factor in the study of privacy issues. In the context of face recognition, trust is defined as users’ belief in the ability of face recognition technology and application platforms to protect their personal information. Joinson et al. (2010) found in their study that users’ perceived risk to personal privacy is affected by their degree of trust. Moreover, through research on the behavioral intention to use intelligent media, some scholars suggest that trust directly affects use intention and that there is a significant correlation between trust and users’ use intention. Therefore, the following hypotheses are proposed:

Hypothesis 4: Trust negatively affects the perceived risk of users with face recognition.

Hypothesis 5: Trust positively affects the actual use of face recognition.

Technology Acceptance Model

The TAM is widely used to explain users’ acceptance of new technologies and products, and it is the most influential and commonly used theory for describing individuals’ degree of acceptance of information systems ( Lee et al., 2003 ). The TAM has been applied in many fields: education ( Scherer et al., 2019 ), hospitals and healthcare ( Nasir and Yurder, 2015 ; Fletcher-Brown et al., 2020 ; Hsieh and Lai, 2020 ; Papa et al., 2020 ), sports and fitness ( Lunney et al., 2016 ; Lee and Lee, 2018 ; Reyes-Mercado, 2018 ), fashion ( Turhan, 2013 ; Chuah et al., 2016 ), consumer behavior ( Wang and Sun, 2016 ; Yang et al., 2016 ; Kalantari and Rauschnabel, 2018 ), gender and knowledge sharing ( Nguyen and Malik, 2021 ), wearable devices ( Magni et al., 2021 ), human resource management ( Del Giudice et al., 2021 ), the Internet of Things ( Caputo et al., 2018 ), and the influence of technophobia and emotional intelligence on technology acceptance ( Khasawneh, 2018 ).

In this study, a hypothesis model is developed based on perceived ease of use and perceived usefulness, two determinants of user behavior.

Perceived usefulness refers to the extent to which users believe that using a specific system will improve their job performance. Perceived ease of use refers to how easy users think a particular system is to use, which also affects their perceived usefulness of the technology ( Davis, 1989 ). The easier face recognition is to use, the more useful it is considered to be. For the purposes of this study, face recognition aims to realize multiple functions, such as providing efficient and convenient services. Therefore, the definition of perceived usefulness is extended here to the degree to which users think face recognition improves convenience and service, and perceived ease of use refers to users’ perceived ease of using a face recognition application. Previously, Davis (1989) conducted an empirical study on an e-mail system and concluded that perceived ease of use has a positive impact on the use of applications. In a study on the adoption and use of information systems in the workplace, Venkatesh and Davis (2000) demonstrated that perceived usefulness has a positive impact on people’s usage behavior. Given the extensive application of the TAM to information systems, and since the face recognition technology studied in this paper is also a form of intelligent media, perceived usefulness is an important variable that affects the use of face recognition. Thus, the following hypotheses are proposed:

Hypothesis 6: Perceived ease of use has a positive impact on perceived usefulness.

Hypothesis 7: Perceived ease of use has a positive influence on the actual use of face recognition.

Hypothesis 8: Perceived usefulness has a positive impact on the actual use of face recognition.

The research model of this paper is shown in Figure 1 .


Figure 1. Structural equation model.

Variable Measurement and Data Collection

Questionnaire Design

To ensure the scientific rigor and credibility of the measured variables, this study adapted mature scales from previous studies, combined them with current users’ concerns about information in the use of face recognition, and developed a questionnaire. The questionnaire consists of two parts. The first part investigates the demographic characteristics of users, such as gender and age. The second part is measured on a Likert scale, where the options for each measurement item are “Strongly disagree,” “Disagree,” “Neither agree nor disagree,” “Agree,” and “Strongly agree.” The survey included seven latent variables and 21 measured variables; the latent variables included perceived ease of use, perceived usefulness, privacy concerns, risk perception, trust, and actual use. The contents of the scale are shown in Table 1 .


Table 1. Design of measurement items for variables studied.

Data Collection

In this study, the questionnaire was designed on an online survey platform 1 and distributed as links through WeChat, QQ, and other channels. The survey was conducted from May 26 to June 10, 2020, and a total of 635 questionnaires were collected. The respondents were users of face recognition technology. After a second round of screening, in which incomplete questionnaires and questionnaires with identical answers to every item were eliminated, 518 valid questionnaires remained. The specific statistics are shown in Table 2 .


Table 2. Statistical analysis of demographic characteristics ( N = 518).

The reported statistics show that the gender ratio of the sample is balanced. The respondents are mainly between 18 and 35 years old, so the sample skews young overall, consistent with the age profile of the main user group of facial recognition. The respondents are mostly highly educated, with a bachelor’s degree or above. In terms of urban distribution, 58.7% of respondents came from first-tier and new first-tier cities. The sample coverage is therefore reasonable and representative. As for privacy, 86.1% of respondents believe that face information is private. Consequently, the sample data collected in this questionnaire are suitable for research on the privacy problems of face recognition users.

Data Analysis

Reliability and Validity Analysis

For this study, SPSS 25.0 was used to analyze the collected data and evaluate its reliability. Cronbach’s alpha (the α coefficient) was used to measure internal consistency, with 0.7 as the critical value: it is generally believed that when Cronbach’s α is greater than 0.7, a scale has adequate reliability. Based on the test results, the overall Cronbach’s α coefficients of privacy concerns, perceived risk, perceived ease of use, perceived usefulness, trust, and actual use are between 0.876 and 0.907, all greater than 0.7. This indicates that the measurement of each latent variable shows excellent internal consistency and that the questionnaire is reliable as a whole.
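For readers who want to reproduce this step outside SPSS, Cronbach’s α follows directly from the item variances. A minimal Python sketch is given below; the data frame and column names are hypothetical placeholders, not the study’s actual item labels.

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha: (k/(k-1)) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# hypothetical Likert items (1-5) for the "privacy concerns" scale
# alpha = cronbach_alpha(df[["pc1", "pc2", "pc3"]])  # accept the scale if alpha > 0.7
```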

Structural validity refers to the correspondence between measurement dimensions and measurement items and is often used to analyze questionnaire items. According to the confirmatory factor analysis results from AMOS 24.0, the fit index χ2/df = 2.722, which is less than 3, indicating a good fit. RMSEA = 0.058, which is less than 0.08, indicating that the model is acceptable. It is generally believed that a model fits well when the NFI, IFI, and CFI indices exceed 0.9; here, NFI = 0.938, RFI = 0.925, IFI = 0.960, TLI = 0.951, and CFI = 0.959. Therefore, the fit indices of this model conform to the common standards, and the fit of the model is adequate.
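The authors ran this analysis in AMOS 24.0. As a rough open-source analogue, the sketch below uses the Python package semopy, whose calc_stats reports χ2, RMSEA, NFI, TLI, CFI, GFI, and related indices; the item names (pc1, pr1, ...) are assumptions for illustration only.

```python
import pandas as pd
from semopy import Model, calc_stats

# measurement model: each latent variable is indicated by its questionnaire items
MEASUREMENT = """
PrivacyConcerns =~ pc1 + pc2 + pc3
PerceivedRisk   =~ pr1 + pr2 + pr3
EaseOfUse       =~ peou1 + peou2 + peou3
Usefulness      =~ pu1 + pu2 + pu3
Trust           =~ tr1 + tr2 + tr3
ActualUse       =~ au1 + au2 + au3
"""

df = pd.read_csv("questionnaire.csv")   # hypothetical file of Likert responses
cfa = Model(MEASUREMENT)
cfa.fit(df)
print(calc_stats(cfa).T)                # chi2, DoF, RMSEA, CFI, NFI, TLI, GFI, ...
```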

Exploratory factor analysis is used to determine whether each measurement item converges on its corresponding factor; the number of factors retained is the number whose eigenvalue exceeds 1. If a factor loading is greater than 0.6, each latent variable is generally considered to have representative items ( Gerbing and Anderson, 1988 ; Gefen and Straub, 2005 ).
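A sketch of this procedure with the Python factor_analyzer package, again with hypothetical data frame and column names: factors are retained while their eigenvalues exceed 1, and loadings above 0.6 are read as items converging on their factor.

```python
from factor_analyzer import FactorAnalyzer

# step 1: unrotated solution, count eigenvalues > 1 (Kaiser criterion)
fa = FactorAnalyzer(rotation=None)
fa.fit(df[item_columns])                  # df and item_columns are hypothetical
eigenvalues, _ = fa.get_eigenvalues()
n_factors = int((eigenvalues > 1).sum())

# step 2: rotated solution; loadings > 0.6 mark representative items
fa = FactorAnalyzer(n_factors=n_factors, rotation="varimax")
fa.fit(df[item_columns])
print(fa.loadings_)
```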

As shown in Table 3 , the factor loadings of the latent variables, including privacy concerns, perceived risk, perceived ease of use, perceived usefulness, trust, and actual use, were all greater than 0.7, which shows that the items are highly representative of their corresponding latent variables.


Table 3. Factor loadings and composite reliability of the variables.

Composite reliability (CR) and average variance extracted (AVE) were used for the convergent validity analysis. The recommended threshold for CR is 0.8 or higher ( Werts et al., 1974 ; Nunnally and Bernstein, 1994 ), and AVE is recommended to be above 0.5 ( Fornell and Larcker, 1981 ). As shown in Table 3 , the AVE of each latent variable was greater than 0.5, and each CR was greater than 0.8, indicating ideal convergent validity.
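Both statistics follow directly from the standardized factor loadings, so they are easy to verify by hand. A minimal sketch (the loadings below are made-up numbers for illustration, not values from Table 3):

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    s = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)
    return s ** 2 / (s ** 2 + error)

def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

trust_loadings = [0.82, 0.85, 0.79]   # hypothetical standardized loadings
print(composite_reliability(trust_loadings))        # should exceed 0.8
print(average_variance_extracted(trust_loadings))   # should exceed 0.5
```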

According to the results in Table 4 , actual use was significantly correlated with privacy concerns, perceived risk, perceived ease of use, perceived usefulness, and trust ( p < 0.001). In addition, the absolute value of each correlation coefficient was less than 0.5 and below the square root of the corresponding AVE, indicating that the latent variables are related yet clearly distinguishable; the scale therefore has an ideal level of discriminant validity.


Table 4. Correlation coefficient and AVE square root.

Correlation Analysis

Correlation analysis studies whether variables are related and uses the correlation coefficient to measure how closely they are related. The three common correlation coefficients are the Pearson, Spearman, and Kendall coefficients, of which the Pearson coefficient is typically used in questionnaire and scale studies ( Qi and Li, 2018 ). In this study, SPSS 25.0 and Pearson correlation analysis were used to examine whether privacy concerns, perceived risk, perceived ease of use, perceived usefulness, trust, and actual use are significantly correlated in the hypothesized model, as a first test of the research hypotheses.
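The same matrix and significance tests can be reproduced with pandas and SciPy; the scale-score column names below are assumptions for illustration.

```python
import pandas as pd
from scipy import stats

scores = df[["privacy_concerns", "perceived_risk", "ease_of_use",
             "usefulness", "trust", "actual_use"]]   # hypothetical scale means
print(scores.corr(method="pearson"))                 # full correlation matrix

# significance test for one pair of variables
r, p = stats.pearsonr(scores["trust"], scores["actual_use"])
print(f"trust vs. actual use: r = {r:.3f}, p = {p:.4f}")
```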

Table 5 shows the means and standard deviations of privacy concerns, perceived risk, perceived ease of use, perceived usefulness, trust, and actual use, together with the Pearson correlation coefficients between the variables. Judging from the means, users had a higher perceived risk and a lower degree of trust. The correlation matrix shows that perceived risk and privacy concerns are significantly and positively correlated, so H2 was initially verified; privacy concerns, perceived risk, and actual use were negatively correlated ( r = –0.158, p < 0.01), though weakly, preliminarily supporting H1 and H3. There was a positive correlation between perceived ease of use, perceived usefulness, and actual use ( p < 0.01). Among these, perceived ease of use had a weak correlation with actual use ( r = 0.292) and perceived usefulness a moderate one ( r = 0.494); thus, H6, H7, and H8 were preliminarily verified. There was a significantly strong correlation between trust and actual use ( p < 0.01, r = 0.608), so H5 was preliminarily verified. In addition, trust was negatively correlated with perceived risk, so H4 was preliminarily verified.


Table 5. Correlation coefficient matrix and mean and standard deviation of variables.

Path Analysis and Hypothesis Testing

The correlation analysis showed that the variables are correlated, so the hypotheses were preliminarily supported. Nevertheless, correlation alone cannot adequately explain the systematic relationships between variables. Thus, AMOS 24.0 and a structural equation model were further employed to explore these systematic relationships, as shown in Figure 2 .


Figure 2. Path analysis diagram of the structural equation model.
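As a hedged illustration of what the AMOS path model looks like in code, the structural relations corresponding to H1-H8 can be written in semopy's lavaan-like syntax and appended to the measurement model sketched earlier; the variable names remain assumptions.

```python
from semopy import Model, calc_stats

STRUCTURE = """
PrivacyConcerns ~ PerceivedRisk    # H2
PerceivedRisk   ~ Trust            # H4
Usefulness      ~ EaseOfUse        # H6
ActualUse       ~ PrivacyConcerns + PerceivedRisk + Trust + EaseOfUse + Usefulness  # H1, H3, H5, H7, H8
"""

sem = Model(MEASUREMENT + STRUCTURE)   # MEASUREMENT: the CFA model defined above
sem.fit(df)
print(sem.inspect())                   # path coefficients and p-values
print(calc_stats(sem).T)               # chi2/df, RMSEA, CFI, NFI, TLI, GFI, ...
```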

As can be seen from Table 6 , the ratio of chi-square to degrees of freedom in the structural equation was less than 5, which is within the acceptable range. The RFI, CFI, NFI, TLI, IFI, and GFI indices were all significantly greater than 0.9, and the root mean square error of approximation (RMSEA) was less than 0.08. This shows that the structural equation model fits well.


Table 6. Fitting of the structural equation model ( N = 518).

According to Table 7 , hypotheses H2, H4, H5, H6, and H8 were verified, which shows that trust and perceived usefulness both positively influence the actual use of face recognition and that perceived risk has a significant positive impact on privacy concerns: the more risk the public perceives, the greater its privacy concerns become. However, H1 and H3 were not accepted. The test results show that privacy concerns and perceived risk had a negative influence on the actual use of face recognition, but the influence was not significant. In addition, H7 was not supported, indicating that perceived ease of use had no significant influence on the actual use of face recognition.


Table 7. Results of the hypothesis test.

Hypotheses H1, H3, and H7 were not supported for the following reasons:

1. H1 and H3 were not supported: perceived risk and privacy concerns had no significant adverse effect on the actual use of face recognition. This shows that the public chooses to use face recognition despite its concerns about, and perception of, privacy risks. Some scholars have called this contradictory phenomenon the privacy paradox ( Xue et al., 2016 ). In other words, although users worry that face recognition may lead to improper use or disclosure of personal information, they still choose to use it in the mobile network domain. An important reason is that facial recognition, as an intelligent media technology, is becoming increasingly prevalent and is reflected in all aspects of daily life. Especially in the field of public services, face scanning on digital platforms has improved effectiveness and efficiency.

2. H7 was not supported: the positive influence of perceived ease of use on the actual use of face recognition was not significant. This conclusion is inconsistent with previous research, but to some extent it confirms the link between perceived ease of use and the use of information systems. Since ease of use involves self-efficacy cognition, technology anxiety can make users perceive the system as difficult to operate and lower their evaluation of its ease of use, further affecting their use of face recognition technology ( Bhattacherjee, 2001 ). Affected by external factors such as lighting and image clarity, face recognition technology is not yet fully mature and its algorithms are not always accurate, which weakens the public’s perceived ease of use. This also reflects that, for face recognition technology, perceived usefulness has the more substantial impact on actual use, and that users value the functional benefits brought by face recognition applications.

Robustness Test of the Model

In this paper, the gender, age, educational background, and city of the respondents were introduced into the model as control variables to test the robustness of the hypothesis model. The test results are shown in Figure 3 .

As can be seen from Figure 3 , despite the introduction of control variables such as gender, age, educational background, and city, the relationships and significance levels of the model’s factors were consistent with the hypothesis test results above. Meanwhile, none of the control variables had a significant effect on the actual use of face recognition, indicating that the model passed the robustness test.


Figure 3. Robustness test.

In this study, taking users of face recognition as the research objects, the TAM was integrated with variables such as privacy concerns, perceived risk, and trust to analyze how these factors affect the actual use of face recognition and to explain the determinants of the public’s use of facial recognition. The results showed that the model fits well and that most of the hypotheses were supported.

Based on the results of the model analysis, this paper draws the following conclusions:

1. In the context of big data, the concept of information privacy has been continuously expanded. When users perceive the risk of their private information being disclosed through face recognition, they will have greater privacy concerns. However, although users’ privacy concerns are deep, the privacy paradox still exists. In the digital information age, most users will still choose to provide personal information in exchange for the services and applications they need.

2. Trust plays a vital role in intelligent societies and virtual interactions. In this paper, users’ trust in face recognition applications includes trust in the technology application platforms and trust in the face recognition technology itself. This study shows that trust in the technology and its platforms reduces users’ intention to guard against them: users who believe that face recognition platforms provide secure conditions for using the technology show a higher tendency to use it. Likewise, as users’ trust in face recognition technology grows, their perceived risk of privacy leakage falls significantly. In the information age, then, users are willing to disclose personal information largely out of trust in face recognition technology and the related platforms.

3. For face recognition as an emerging technology, the TAM still has excellent explanatory power. Although perceived ease of use has no significant positive impact on the actual use of face recognition, due to external factors such as accuracy and technological maturity, perceived usefulness still has a significantly positive impact on actual use. To an extent, when technology applications provide users with diversified and user-friendly functions, their perceived usefulness improves significantly.

4. A final consideration is government regulation of use and the technical ethics of enterprises. When developing face recognition, enterprises must pay attention to technical ethics as well as privacy, to safeguard personal privacy and protect against the leakage of biometric information. Governments must also strengthen large-scale management of face recognition technology to prevent enterprises and individuals from using it in ways that harm social security and personal privacy.

Limitations

There are some limitations to this study. First, the sample data are mostly from a young group. Future research could collect survey data from other age groups to examine whether the privacy concerns of users of different ages affect their use of facial recognition. Second, this study focuses on the influence of privacy concerns, perceived risk, perceived ease of use, perceived usefulness, and trust on the actual use of face recognition, but has not assessed whether other factors, such as user experience and usage habits, also matter. In addition, this study analyzes only the direct impact of the research variables on actual use and does not account for the impact of mediating or moderating variables.

Future Research Directions

Although this research provides some interesting insights, it has some significant limitations. First, future research should study different age groups to examine the acceptance of face recognition and attention to privacy at different ages.

Second, privacy is one of the most critical ethical issues in the era of mobile data services. In the current age dominated by big data, privacy issues have become more prominent due to over-identification, technical flaws, and lagging legal construction. The connected nature of the Internet poses a particularly acute threat to information privacy, and the proliferation of databases and records keeps expanding the privacy boundary. How do we balance technological enablement with privacy protection? What should users do about the privacy paradox? The different social cultures and psychologies of China and the West also lead people to use face recognition differently.

Western culture’s influence on face recognition use runs through its attention to privacy and freedom; politics and social culture shape how the technology is received. Errors and discrimination in face recognition algorithms can cause serious psychological harm and, compounded by social and cultural factors, lead to social contradictions. For example, when MIT researchers tested the face recognition systems of Microsoft, Facebook, IBM, and other companies, they found error rates for darker-skinned women as high as 35%, far above those for lighter-skinned men, and the algorithms were suspected of gender and racial discrimination. Algorithms are designed by people, and developers may embed their values in them, introducing human bias that can fuel social contradictions. Politics, society, and culture have therefore shaped the West’s attitude toward governance. Against this social background, religious and ethnic contradictions in Western society have intensified, ethnic minorities have long faced discrimination, and the West is highly sensitive to prejudice arising from differences in religious belief, ethnicity, and gender. Culturally and psychologically, the West attaches great importance to personal privacy and absolute freedom: Europeans regard privacy as dignity, and Americans regard privacy as freedom. These are some of the new problems we should now focus on resolving.

The core element of cognitive science is cognition, also known as information processing. Cognitive science and artificial intelligence are closely linked: the American philosopher J. R. Searle observed that computers are key to the history of cognitive science, and that without digital computers there would be no cognitive science ( Baumgartner and Payr, 1995 ). This is particularly relevant to research on face recognition and cognition. Whether people use face recognition is closely related to their cognition, consciousness, psychology, and culture. The global workspace theory of the psychologist Baars posits that the brain is a modular information-processing device composed of many neurons, in which processing is distributed across specialized modules with different divisions of labor and functions. Rapidly changing neuronal activity constructs, at any given time, a virtual space called the global workspace through competition and cooperation between the modules, and conscious and unconscious states are generated through competition in this workspace. Consciousness arises when all specialized modules in the brain respond simultaneously to new stimuli and analyze and integrate the stimulus information in the global workspace, through competition and cooperation, until the best matching effect is achieved in the information processing between modules ( Baars, 1988 ). Andrejevic and Volcic (2019) argue that exposing one’s face to the machine serves the interest of “efficiency” in this new world situation, creating contradictions with religious and cultural traditions. The meaning of face recognition largely depends on the exact meaning given to it by a wide range of actors, such as governments, businesses, and civil society organizations ( Norval and Prasopoulou, 2017 ).

Finally, there is the consideration of face recognition and privacy management. Governments and enterprises should strengthen the management and design of face recognition technology. The technology itself is neutral, and intelligent measures that extend from online to offline, as in the case of facial recognition, aim at efficient, convenient, and humanized services. Thus, the public must be willing to disclose some personal information to fully experience the benefits of intelligent media. As scholars have declared, “It is the default transaction rule in the data age to give up part of privacy for the fast operation” ( Mao, 2019 ). Therefore, with face recognition technology at a crossroads, one cannot abandon the application of the technology because of privacy and security concerns alone. Instead, we should rely on smart hardware systems to empower cities and daily life with innovative technologies.

On the other hand, we cannot abuse facial recognition technology after viewing only its bright prospects. Data security is always a crucial factor. Therefore, we believe that face recognition technology must balance security, convenience, and privacy; that research on privacy issues in big data networks must be strengthened; that attention must be paid to the data flows behind the technology; that the technology should be constrained by other evolving technologies; and that the privacy literacy of the public should be cultivated.

Data Availability Statement

The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author/s.

Ethics Statement

The studies involving human participants were reviewed and approved by the Secretariat of Academic Committee, Hangzhou Dianzi University. The participants provided their written informed consent to participate in this study.

Author Contributions

TL and BY: conceptualization, software, and formal analysis. TL, SD, and YG: methodology and validation. TL and SD: investigation, resources, and data curation. TL, YG, and BY: writing—original draft preparation and visualization. TL, BY, and SD: writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

We are grateful for the financial support from the Zhejiang Social Science Planning “Zhijiang Youth Project” Academic Research and Exchange Project: Social Science Research in the Era of AI (22ZJQN06YB), the Special Fund of Fundamental Research Funds for Universities Directly Under the Zhejiang Provincial Government (GK199900299012-207), and the Excellent Backbone Teacher Support Program of Hangzhou Dianzi University (YXGGJS).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Acknowledgments

We are highly appreciative of the invaluable comments and advice from the editor and the reviewers.

Footnotes

1. ^ www.wjx.cn

References

Andrejevic, M., and Volcic, Z. (2019). “Smart” cameras and the operational enclosure. Telev. New Media 22, 1–17.


Baars, B. J. (1988). A Cognitive Theory of Consciousness. Cambridge: Cambridge University Press.

Baumgartner, P., and Payr, S. (1995). Speaking Minds: interviews with Twenty Eminent Cognitive Scientists. New Jersey: Princeton University Press. 204.

Bhattacherjee, A. (2001). Understanding information systems continuance: an expectation-confirmation model. Mis Q. 25, 351–370. doi: 10.2307/3250921


Biometrics Identity Standardization [BIS] (2020). 2020 Face Recognition Industry Research Report. Available online at: http://sc37.cesinet.com/view-0852f50939dd442daa42f566c950e336-fe654ac1ec464ae7b780f9fd78553c79.html [Accessed December 25, 2020]

Caputo, F., Mazzoleni, A., Pellicellic, A. C., and Muller, J. (2020). Over the mask of innovation management in the world of Big Data. J. Bus. Res. 119, 330–338. doi: 10.1016/j.jbusres.2019.03.040

Caputo, F., Scuotto, V., Carayannis, E., and Cillo, V. (2018). Intertwining the internet of things and consumers’ behaviour science: future promises for businesses. Technol. Forecast. Soc. Change 136, 277–284. doi: 10.1016/j.techfore.2018.03.019

Cha, J. (2010). Factors affecting the frequency and amount of social networking site use: motivations, perceptions, and privacy concerns. First Monday 15, 12–16. doi: 10.5210/fm.v15i12.2889

Chen, R. (2013). Living a private life in public social networks: an exploration of member self-disclosure. Decis. Supp. Syst. 55, 661–668. doi: 10.1016/j.dss.2012.12.003

Chen, X. Y., and Cliquet, G. (2020). The blocking effect of privacy concerns in the “Quantified Self” movement–a case study of the adoption behavior of smart bracelet users. Enterpr. Econ. 4:109.

Chuah, S. H. W., Rauschnabel, P. A., Krey, N., Nguyen, B., Ramayah, T., and Lade, S. (2016). Wearable technologies: the role of usefulness and visibility in smartwatch adoption. Comp. Hum. Behav. 65, 276–284. doi: 10.1016/j.chb.2016.07.047

Ciampi, F., Demi, S., Magrini, A., Marzi, G., and Papa, A. (2021). Exploring the impact of big data analytics capabilities on business model innovation: the mediating role of entrepreneurial orientation. J. Bus. Res. 123, 1–13. doi: 10.1016/j.jbusres.2020.09.023

Ciasullo, M. V., Cosimato, S., and Pellicano, M. (2017). Service Innovations in the Healthcare Service Ecosystem: a Case Study. Systems 5, 2–19.

CNNIC (2020). The 45th China Statistical Report on Internet Development. Available online at: http://www.gov.cn/xinwen/2020-04/28/content_5506903.htm . [Accessed April 28, 2020]

Davis, F. D. (1989). Perceived usefulness, perceived ease of use, and user acceptance of information technology. Mis. Q. 13, 319–340. doi: 10.2307/249008

Del Giudice, M., Scuottoc, V., Garcia-Perezd, A., and Messeni Petruzzellie, A. (2019). Shifting wealth II in Chinese economy. the effect of the horizontal technology spillover for SEMs for international growth. Technol. Forecast. Soc. Change 145, 307–316. doi: 10.1016/j.techfore.2018.03.013

Del Giudice, M., Scuottoc, V., Orlando, B., and Mustilli, M. (2021). Toward the human-Centered approach. A revised model of individual acceptance of AI. Hum. Resour. Manag. Rev. 100856. doi: 10.1016/j.hrmr.2021.100856

Elias, C., Francesco, C., and Del Giudice, M. (2017). “Technology transfer as driver of smart growth: a quadruple/quintuple innovation framework approach,” Proceedings of the 10th Annual Conference of the EuroMed Academy of Business (Cyprus: EuroMed Press) 313–333.

Fletcher-Brown, J., Carter, D., Pereira, V., and Chandwani, R. (2020). Mobile technology to give a resource-based knowledge management advantage to community health nurses in an emerging economies context. J. Knowledge Manag. 25, 525–544. doi: 10.1108/jkm-01-2020-0018

Fornell, C., and Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. J. Mark. Res. 18, 39–50. doi: 10.2307/3151312

Gefen, D., and Straub, D. (2005). A practical guide to factorial validity using PLS-Graph: tutorial and annotated example. Commun. Assoc. Inform. Syst. 16, 91–109.

Gerbing, D. W., and Anderson, J. C. (1988). An updated paradigm for scale development incorporating unidimensionality and its assessment. J. Market. Res. 25, 186–192. doi: 10.1177/002224378802500207

Guo, R. (2020). Face recognition, equal protection and contract society. Ningbo Econ. 02:42.

He, J. P., and Huang, X. X. (2020). The smartphone use and eudaimonic well-being of urban elderly: based on intergenerational support and TAM. J. Int. Commun. 03, 49–73.

Hichang, C. (2010). Determinants of behavioral responses to online privacy: the effects of concern, risk beliefs, self-efficacy, and communication sources on self-protection strategies. J. Inform. Privacy Secur. 1, 3–27. doi: 10.1080/15536548.2010.10855879

Hsieh, P. J., and Lai, H. M. (2020). Exploring people’s intentions to use the health passbook in self-management: an extension of the technology acceptance and health behavior theoretical perspectives in health literacy. Technol. Forecast. Soc. Change 161:120328. doi: 10.1016/j.techfore.2020.120328

Jiang, J. (2019). Infringement risks and control strategies on the application of face recognition technology. Library Inform. 5:59.

Joinson, A. N., Reips, U. D., Buchanan, T., and Schofield, C. B. P. (2010). Privacy, trust, and self-disclosure online. Hum. Comp. Interact. 25, 1–24. doi: 10.1080/07370020903586662

Kalantari, M., and Rauschnabel, P. (2018). “Exploring the Early Adopters of Augmented Reality Smart Glasses: the Case of Microsoft Hololens” in Augmented Reality and Virtual Reality. Ed T. Jung and M. Tom Dieck (Germany: Springer). 229–245. doi: 10.1007/978-3-319-64027-3_16

Kaur, S., Gupta, S., Singh, S. K., and Perano, M. (2019). Organizational ambidexterity through global strategic partnerships: a cognitive computing perspective. Technol. Forecast. Soc. Change 145, 43–54. doi: 10.1016/j.techfore.2019.04.027

Khasawneh, O. Y. (2018). Technophobia without boarders: the influence of technophobia and emotional intelligence on technology acceptance and the moderating influence of organizational climate. Comp. Hum. Behav. 88, 210–218. doi: 10.1016/j.chb.2018.07.007

Lee, S. Y., and Lee, K. (2018). Factors that influence an individual’s intention to adopt a wearable healthcare device: the case of a wearable fitness tracker. Technol. Forecast. Soc. Change 129, 154–163. doi: 10.1016/j.techfore.2018.01.002

Lee, Y., Kozar, K. A., and Larsen, K. R. T. (2003). The technology acceptance model: past, present, and future. Commun. Assoc. Inform. Syst. 12, 752–780.

Liu, W. W. (2013). Research on the Influence of Privacy Concerns on Users’ Intention to Use Mobile Payment. Beijing: Beijing University of Posts and Telecommunications.

Lunney, A., Cunningham, N. R., and Eastin, M. S. (2016). Wearable fitness technology: a structural investigation into acceptance and perceived fitness outcomes. Comp. Hum. Behav. 65, 114–120. doi: 10.1016/j.chb.2016.08.007

Magni, D., Scuotto, V., Pezzi, A., and Del Giudice, M. (2021). Employees’ acceptance of wearable devices: Towards a predictive model. Technol. Forecast. Soc. Change 172:121022. doi: 10.1016/j.techfore.2021.121022

Mao, Y. N. (2019). The first case of face recognition: what is the complaint? Fangyuan Mag. 24, 14–17.

McKnight, D. H., Lankton, N., and Tripp, J. (2011). “Social Networking Information Disclosure and Continuance Intention: a Disconnect” in 2011 44th Hawaii International Conference on System Sciences (HICSS 2011). (United States: IEEE).

Nasir, S., and Yurder, Y. (2015). Consumers’ and physicians’ perceptions about high tech wearable health products. Proc. Soc. Behav. Sci. 195, 1261–1267.

Nguyen, T.-M., and Malik, A. (2021). Employee acceptance of online platforms for knowledge sharing: exploring differences in usage behavior. J. Knowledge Manag. Epub online ahead of print. doi: 10.1108/JKM-06-2021-0420

Norberg, P. A., Horne, D. R., and Horne, D. A. (2007). The Privacy Paradox: personal Information Disclosure Intentions versus Behaviors. J. Consum. Affairs 41, 100–126. doi: 10.1111/j.1745-6606.2006.00070.x

Norval, A., and Prasopoulou, E. (2017). Public faces? A critical exploration of the diffusion of face recognition technologies in online social network. New Media Soc. 4, 637–654. doi: 10.1177/1461444816688896

Nunnally, J. C., and Bernstein, I. H. (1994). Psychometric Theory. New York: McGraw-Hill.

Papa, A., Mital, M., Pisano, P., and Del Giudice, M. (2020). E-health and wellbeing monitoring using smart healthcare devices: an empirical investigation. Technol. Forecast. Soc. Change 153:119226. doi: 10.1016/j.techfore.2018.02.018

Putnam, R. D. (2001). Making Democracy Work: civic Traditions in Modern Italy (trans. by Wang L & Lai H R). Nanchang: Jiangxi People’s Publishing House. 195.

Qi, K. P., and Li, Z. Z. (2018). A Study on Privacy Concerns of Chinese Public and Its Influencing Factors. Sci. Soc. 2, 36–58.

Reyes-Mercado, P. (2018). Adoption of fitness wearables: insights from Partial Least Squares and Qualitative Comparative Analysis. J. Syst. Inform. Technol. 20, 103–127. doi: 10.1108/jsit-04-2017-0025

Scherer, R., Siddiq, F., and Tondeur, J. (2019). The technology acceptance model (TAM): a meta-analytic structural equation modeling approach to explaining teachers’ adoption of digital technology in education. Comp. Educ. 128, 13–35. doi: 10.1016/j.compedu.2018.09.009

Simmel, G. (2002). Sociology: investigations on the Forms of Sociation (trans. by Lin R Y). Beijing: Huaxia Publishing House. 244–275.

Turhan, G. (2013). An assessment towards the acceptance of wearable technology to consumers in Turkey: the application to smart bra and t-shirt products. J. Textile Inst. 104, 375–395. doi: 10.1080/00405000.2012.736191

Venkatesh, V., and Davis, F. D. (2000). A theoretical extension of the technology acceptance model: four longitudinal field studies. Manag. Sci. 46, 186–204. doi: 10.1287/mnsc.46.2.186.11926


Wang, H., Lee, M. K. O., and Wang, C. (1998). Consumer privacy concerns about Internet marketing. Commun. ACM 41, 63–70. doi: 10.1145/272287.272299

Wang, Q., and Sun, X. (2016). Investigating gameplay intention of the elderly using an extended technology acceptance model (ETAM). Technol. Forecast. Soc. Change 107, 59–68. doi: 10.1016/j.techfore.2015.10.024

Werts, C. E., Linn, R. L., and Jöreskog, K. G. (1974). Intraclass reliability estimates: testing structural assumptions. Educ. Psychol. Measur. 34, 25–33. doi: 10.1177/001316447403400104

Xue, K., He, J., and Yu, M. Y. (2016). Research on Influencing Factors of Privacy Paradox in Social Media. Contempor. Commun. 1:5.

Yang, H., Yu, J., Zo, H., and Choi, M. (2016). User acceptance of wearable devices: an extended perspective of perceived value. Elemat. Inform. 33, 256–269.

Yu, J. (2018). Research on the Use Intention of VR Glasses Based on the Technology Acceptance Model. Shenzhen: Shenzhen University.

Zahid, A., and Ajita, R. (2017). A Face in any Form: new Challenges and Opportunities for Face Recognition Technology. IEEE Comp. 50, 80–90. doi: 10.1109/mc.2017.119

Zhang, Q. J., and Gong, H. S. (2018). An Empirical Study on Users Behavioral Intention of Face Identification Mobile Payment. Theor. Pract. Fin. Econom. 5, 109–115.

Zhang, X. J., and Li, Z. Z. (2018). Research on the Influence of Privacy Concern on Smartphone Users’ Behavior Intention in Information Security. Inform. Stud. Theor. Appl. 2, 77–78.

Keywords : face recognition, technology acceptance model, social cognitive, cross-culture, privacy concerns psychology, perceived risk, trust, cultural psychology

Citation: Liu T, Yang B, Geng Y and Du S (2021) Research on Face Recognition and Privacy in China—Based on Social Cognition and Cultural Psychology. Front. Psychol. 12:809736. doi: 10.3389/fpsyg.2021.809736

Received: 05 November 2021; Accepted: 06 December 2021; Published: 24 December 2021.


Copyright © 2021 Liu, Yang, Geng and Du. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Tao Liu, [email protected]



Proceedings of the Computational Methods in Systems and Software (CoMeSySo 2023): Data Analytics in System Engineering, pp. 493–502

Research of the Correlation Between the Results of Detection the Liveliness of a Face and Its Identification by Facial Recognition Systems

  • Aleksandr A. Shnyrev   ORCID: orcid.org/0009-0008-0194-1861 12 ,
  • Ramil Zainulin 12 ,
  • Daniil Solovyev   ORCID: orcid.org/0009-0007-1791-4553 12 ,
  • Maxim S. Isaev   ORCID: orcid.org/0009-0006-7876-0095 12 ,
  • Timur V. Shipunov   ORCID: orcid.org/0009-0006-3397-4380 12 ,
  • Timur R. Abdullin   ORCID: orcid.org/0009-0001-1891-894X 11 ,
  • Sergei A. Kesel   ORCID: orcid.org/0000-0003-2917-1287 11 ,
  • Denis A. Konstantinov   ORCID: orcid.org/0009-0005-1855-4049 11 &
  • Ilya V. Ovsyannikov   ORCID: orcid.org/0009-0001-5234-1080 11  
  • Conference paper
  • First Online: 24 February 2024

Part of the Lecture Notes in Networks and Systems book series (LNNS, volume 935)

In this paper, we investigate the hypothesis that a system capable of solving the face anti-spoofing problem in biometric authentication can partially solve the recognition problem without additional recognition modules, by finding and excluding faces that have a low probability of being successfully recognized. To this end, the paper examines the architecture of a basic facial recognition system, highlighting the role of the anti-spoofing module. Other approaches to face recognition and to filtering out images that do not contain faces are also reviewed. The problem under study is formalized and presented in mathematical form for further experiments. In a series of experiments on selected datasets, results were obtained and visualized demonstrating the absence of a relationship between the operation of the anti-spoofing module and the facial recognition module. In conclusion, plans for further work in this direction are presented.

  • machine learning
  • facial recognition
  • biometric authentication
  • anti-spoofing



Author information

Authors and Affiliations

Moscow Polytechnic University, 107023, Bolshaya Semyonovskaya Street, 38, Moscow, Russia

Timur R. Abdullin, Sergei A. Kesel, Denis A. Konstantinov & Ilya V. Ovsyannikov

JSC Social Card, Republic of Tatarstan, 420124, Meridiannaya Street, 4, Room 1, Kazan, Russia

Aleksandr A. Shnyrev, Ramil Zainulin, Daniil Solovyev, Maxim S. Isaev & Timur V. Shipunov


Corresponding author

Correspondence to Sergei A. Kesel .

Editor information

Editors and Affiliations

Faculty of Applied Informatics, Tomas Bata University in Zlin, Zlin, Czech Republic

Radek Silhavy

Petr Silhavy


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper.

Shnyrev, A.A. et al. (2024). Research of the Correlation Between the Results of Detection the Liveliness of a Face and Its Identification by Facial Recognition Systems. In: Silhavy, R., Silhavy, P. (eds) Data Analytics in System Engineering. CoMeSySo 2023. Lecture Notes in Networks and Systems, vol 935. Springer, Cham. https://doi.org/10.1007/978-3-031-54820-8_40


DOI : https://doi.org/10.1007/978-3-031-54820-8_40

Published : 24 February 2024

Publisher Name : Springer, Cham

Print ISBN : 978-3-031-54819-2

Online ISBN : 978-3-031-54820-8

eBook Packages: Intelligent Technologies and Robotics (R0)


  • Today, we’re publicly releasing the Video Joint Embedding Predictive Architecture (V-JEPA) model, a crucial step in advancing machine intelligence with a more grounded understanding of the world.
  • This early example of a physical world model excels at detecting and understanding highly detailed interactions between objects.
  • In the spirit of responsible open science, we’re releasing this model under a Creative Commons NonCommercial license for researchers to further explore.

As humans, much of what we learn about the world around us—particularly in our early stages of life—is gleaned through observation. Take gravity: Even an infant (or a cat) can intuit, after knocking several items off a table and observing the results, that what goes up must come down. You don’t need hours of instruction or to read thousands of books to arrive at that result. Your internal world model—a contextual understanding based on a mental model of the world—predicts these consequences for you, and it’s highly efficient.

“V-JEPA is a step toward a more grounded understanding of the world so machines can achieve more generalized reasoning and planning,” says Meta’s VP & Chief AI Scientist Yann LeCun, who proposed the original Joint Embedding Predictive Architectures (JEPA) in 2022. “Our goal is to build advanced machine intelligence that can learn more like humans do, forming internal models of the world around them to learn, adapt, and forge plans efficiently in the service of completing complex tasks.”

Video JEPA in focus

V-JEPA is a non-generative model that learns by predicting missing or masked parts of a video in an abstract representation space. This is similar to how our Image Joint Embedding Predictive Architecture (I-JEPA) compares abstract representations of images (rather than comparing the pixels themselves). Unlike generative approaches that try to fill in every missing pixel, V-JEPA has the flexibility to discard unpredictable information, which leads to improved training and sample efficiency by a factor between 1.5x and 6x.

Because it takes a self-supervised learning approach, V-JEPA is pre-trained entirely with unlabeled data. Labels are only used to adapt the model to a particular task after pre-training. This type of architecture proves more efficient than previous models, both in terms of the number of labeled examples needed and the total amount of effort put into learning even the unlabeled data. With V-JEPA, we’ve seen efficiency boosts on both of these fronts.

With V-JEPA, we mask out a large portion of a video so the model is only shown a little bit of the context. We then ask the predictor to fill in the blanks of what’s missing—not in terms of the actual pixels, but rather as a more abstract description in this representation space.
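
For readers who want to see the shape of that objective, here is a minimal sketch of a JEPA-style masked-prediction step. The toy linear modules stand in for the real ViT context encoder, EMA target encoder, and transformer predictor; all names, dimensions, and the exact loss are illustrative assumptions, not the published training recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DummyEncoder(nn.Module):
    """Stand-in for a ViT-style video encoder (illustrative only)."""
    def __init__(self, d_in=256, d=128):
        super().__init__()
        self.proj = nn.Linear(d_in, d)
    def forward(self, tokens):                 # tokens: (B, N, d_in)
        return self.proj(tokens)               # (B, N, d)

class DummyPredictor(nn.Module):
    """Stand-in predictor that maps context features to predictions."""
    def __init__(self, d=128):
        super().__init__()
        self.net = nn.Linear(d, d)
    def forward(self, context):                # context: (B, N, d)
        return self.net(context)

def jepa_loss(context_enc, target_enc, predictor, tokens, mask):
    with torch.no_grad():                      # target encoder gets no grads
        targets = target_enc(tokens)           # features of the full clip
    visible = tokens * (~mask).unsqueeze(-1).float()  # hide masked patches
    preds = predictor(context_enc(visible))
    # Regress onto target features at the masked positions only --
    # the loss lives entirely in representation space, never in pixels.
    return F.l1_loss(preds[mask], targets[mask])

B, N, D_in = 2, 196, 256
tokens = torch.randn(B, N, D_in)               # fake patchified video clip
mask = torch.rand(B, N) > 0.5                  # True = hidden from context
ctx, tgt, pred = DummyEncoder(), DummyEncoder(), DummyPredictor()
loss = jepa_loss(ctx, tgt, pred, tokens, mask)
loss.backward()
```

The property the sketch preserves is the one the post emphasizes: the loss compares feature vectors at the masked positions, so the model is free to discard unpredictable pixel-level detail.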


Masking methodology

V-JEPA wasn’t trained to understand one specific type of action. Instead it used self-supervised training on a range of videos and learned a number of things about how the world works. The team also carefully considered the masking strategy—if you don’t block out large regions of the video and instead randomly sample patches here and there, it makes the task too easy and your model doesn’t learn anything particularly complicated about the world.

It’s also important to note that, in most videos, things evolve somewhat slowly over time. If you mask a portion of the video but only for a specific instant in time and the model can see what came immediately before and/or immediately after, it also makes things too easy and the model almost certainly won’t learn anything interesting. As such, the team used an approach where it masked portions of the video in both space and time, which forces the model to learn and develop an understanding of the scene.

Efficient predictions

Making these predictions in the abstract representation space is important because it allows the model to focus on the higher-level conceptual information of what the video contains without worrying about the kind of details that are most often unimportant for downstream tasks. After all, if a video shows a tree, you’re likely not concerned about the minute movements of each individual leaf.

One of the reasons why we’re excited about this direction is that V-JEPA is the first model for video that’s good at “frozen evaluations,” which means we do all of our self-supervised pre-training on the encoder and the predictor, and then we don’t touch those parts of the model anymore. When we want to adapt them to learn a new skill, we just train a small lightweight specialized layer or a small network on top of that, which is very efficient and quick.
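
As a rough sketch of what frozen evaluation looks like in practice — with a stand-in backbone, an illustrative 1024-dimensional feature space, and a made-up class count, none of which come from the post — only the small head receives gradient updates:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for the frozen, pre-trained V-JEPA encoder (illustrative only).
encoder = nn.Sequential(nn.Flatten(1), nn.Linear(16 * 224, 1024))
for p in encoder.parameters():
    p.requires_grad = False          # freeze: pre-trained weights never change
encoder.eval()

num_classes = 174                    # e.g., an action-recognition label set
head = nn.Sequential(nn.LayerNorm(1024), nn.Linear(1024, num_classes))
opt = torch.optim.AdamW(head.parameters(), lr=1e-3)

# One training step on a dummy labeled batch; only `head` is updated.
clips = torch.randn(8, 16, 224)      # fake per-frame features for 8 clips
labels = torch.randint(0, num_classes, (8,))
with torch.no_grad():
    feats = encoder(clips)           # frozen backbone produces features
loss = F.cross_entropy(head(feats), labels)
opt.zero_grad(); loss.backward(); opt.step()
```

Swapping in a different head (say, for activity localization) reuses the same frozen features, which is exactly why this setup is cheap compared to full fine-tuning.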


Previous work relied on full fine-tuning: after pre-training, making the model good at a task like fine-grained action recognition meant updating the parameters or weights of the entire model. The model then becomes specialized for that one task and is no longer good for anything else; teaching it a different task requires different data and specializing the whole model again. With V-JEPA, as we’ve demonstrated in this work, we can pre-train the model once without any labeled data, fix it, and then reuse those same parts of the model for several different tasks, like action classification, recognition of fine-grained object interactions, and activity localization.


Avenues for future research...

While the “V” in V-JEPA stands for “video,” it only accounts for the visual content of videos thus far. A more multimodal approach is an obvious next step, so we’re thinking carefully about incorporating audio along with the visuals.

As a proof of concept, the current V-JEPA model excels at fine-grained object interactions and distinguishing detailed object-to-object interactions that happen over time. For example, if the model needs to be able to distinguish between someone putting down a pen, picking up a pen, and pretending to put down a pen but not actually doing it, V-JEPA is quite good compared to previous methods for that high-grade action recognition task. However, those things work on relatively short time scales. If you show V-JEPA a video clip of a few seconds, maybe up to 10 seconds, it’s great for that. So another important step for us is thinking about planning and the model’s ability to make predictions over a longer time horizon.

...and the path toward AMI

To date, our work with V-JEPA has been primarily about perception—understanding the contents of various video streams in order to obtain some context about the world immediately surrounding us. The predictor in this Joint Embedding Predictive Architecture serves as an early physical world model: You don’t have to see everything that’s happening in the frame, and it can tell you conceptually what’s happening there. As a next step, we want to show how we can use this kind of a predictor or world model for planning or sequential decision-making.

We know that it’s possible to train JEPA models on video data without requiring strong supervision and that they can watch videos in the way an infant might—just observing the world passively, learning a lot of interesting things about how to understand the context of those videos in such a way that, with a small amount of labeled data, you can quickly acquire a new task and ability to recognize different actions.

V-JEPA is a research model, and we’re exploring a number of future applications. For example, we expect that the context V-JEPA provides could be useful for our embodied AI work as well as our work to build a contextual AI assistant for future AR glasses. We firmly believe in the value of responsible open science, and that’s why we’re releasing the V-JEPA model under the CC BY-NC license so other researchers can extend this work.



Computer Science > Computer Vision and Pattern Recognition

Title: Magic-Me: Identity-Specific Video Customized Diffusion

Abstract: Creating content for a specific identity (ID) has attracted significant interest in the field of generative models. In text-to-image generation (T2I), subject-driven content generation has made great progress, with the ID in the generated images controllable. However, extending it to video generation is not well explored. In this work, we propose a simple yet effective subject-identity-controllable video generation framework, termed Video Custom Diffusion (VCD). With a specified subject ID defined by a few images, VCD reinforces the identity-information extraction and injects frame-wise correlation at the initialization stage for stable video outputs that preserve identity to a large extent. To achieve this, we propose three novel components that are essential for high-quality ID preservation: 1) an ID module trained with the cropped identity by prompt-to-segmentation to disentangle the ID information from the background noise for more accurate ID token learning; 2) a text-to-video (T2V) VCD module with a 3D Gaussian Noise Prior for better inter-frame consistency; and 3) video-to-video (V2V) Face VCD and Tiled VCD modules to deblur the face and upscale the video to higher resolution. Despite its simplicity, we conducted extensive experiments to verify that VCD generates stable, high-quality videos with better identity preservation than the selected strong baselines. Moreover, thanks to the transferability of the ID module, VCD also works well with publicly available fine-tuned text-to-image models, further improving its usability. The codes are available at this https URL .
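
The abstract does not spell out the 3D Gaussian Noise Prior, but a common way to inject frame-wise correlation at the initialization stage is to mix one clip-wide base noise with independent per-frame noise. The sketch below illustrates that general idea under that assumption; the function name, the `alpha` parameter, and the latent shape are all hypothetical, not taken from the paper.

```python
import torch

def correlated_frame_noise(n_frames, latent_shape, alpha=0.5):
    """Illustrative shared-noise prior for video diffusion: every frame's
    initial latent mixes one clip-wide base noise with its own independent
    noise, so adjacent frames start out correlated. The alpha / sqrt(1 -
    alpha^2) mixing keeps unit variance, which diffusion samplers expect.
    A hypothetical sketch, not the paper's exact formulation."""
    base = torch.randn(1, *latent_shape)              # shared across frames
    per_frame = torch.randn(n_frames, *latent_shape)  # independent per frame
    return alpha * base + (1.0 - alpha**2) ** 0.5 * per_frame

# Example: 16 frames of 4x64x64 latents; correlation grows with alpha.
latents = correlated_frame_noise(16, (4, 64, 64), alpha=0.5)
print(latents.shape, latents.std().item())
```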



RELATED PAPERS

  1. A Review of Face Recognition Technology

    The paper introduces face recognition research from different perspectives: the development stages and related technologies of face recognition, studies of face recognition under real-world conditions, and the general evaluation standards and databases of the field.

  2. (PDF) Face Recognition: A Literature Review

    The task of face recognition has been actively researched in recent years. This paper provides an up-to-date review of major human face recognition research. We first present an...

  3. A review on face recognition systems: recent approaches and ...

    Face recognition (FR) has in recent years been an active research area due to the many applications in which it can be used, such as border security, surveillance, law enforcement and access control.

  4. Face Recognition by Humans and Machines: Three Fundamental Advances

    First, deep networks trained for face identification generate a representation that retains structured information about the face (e.g., identity, demographics, appearance, social traits, expression) and the input image (e.g., viewpoint, illumination).

  5. Past, Present, and Future of Face Recognition: A Review

    Face recognition is one of the most active research fields of computer vision and pattern recognition, with many practical and commercial applications including identification, access control, forensics, and human-computer interactions.

  6. A comprehensive study on face recognition: methods and challenges

    Parekh Payal & Mahesh M. Goyani, pages 114-127. Received 01 Feb 2019, accepted 27 Feb 2020, published online 27 Mar 2020. https://doi.org/10.1080/13682199.2020.1738741

  7. Face Detection Research Paper

    Face detectors are trained with 2,500 photos of left or right eyes and snapshots of negative (non-eye) sets. Overall, 94 percent true positives and 13 percent false positives are detected in face detection. Eyes are detected at a rate of 88 percent with only a 1 percent false-positive result.

  8. Face Recognition: Recent Advancements and Research Challenges

    A Review of Face Recognition Technology: In the previous few decades, face recognition has become a popular field in computer-based application development. This is due to the fact that it is employed in so many different sectors. Face identification via database photographs, real data, captured images, and sensor images is also a difficult task due to the huge variety of faces. The fields of ...

  9. [2103.14983] Going Deeper Into Face Detection: A Survey

    Going Deeper Into Face Detection: A Survey. Face detection is a crucial first step in many facial recognition and face analysis systems. Early approaches for face detection were mainly based on classifiers built on top of hand-crafted features extracted from local image regions, such as Haar Cascades and Histogram of Oriented Gradients. However ...

  10. Human face recognition based on convolutional neural network and

    Peng Lu, Baoye Song & Lin Xu, pages 29-37. Received 10 Jul 2020, accepted 10 Oct 2020, published online 27 Oct 2020. https://doi.org/10.1080/21642583.2020.1836526

  11. (PDF) Face detection and Recognition: A review

    ... Facial recognition is defined as the process of identifying or verifying a person from a digital image, this process takes place by comparing the captured facial images against the images...

  12. Research on face recognition based on deep learning

    As a powerful technology to realize artificial intelligence, deep learning has been widely used in handwriting digital recognition, dimension simplification, speech recognition, image comprehension, machine translation, protein structure prediction and emotion recognition. In this paper, we focus on the research hotspots of face recognition ...

  13. Design and Evaluation of a Real-Time Face Recognition System using

    In this paper, design and evaluation of a real-time face recognition system using Convolutional Neural Network (CNN) is proposed. The initial evaluation of the proposed design is carried out using standard AT&T datasets and the same is later extended towards the design of a real-time system.

  14. (PDF) A Review of Face Recognition Technology

    The paper introduces the related researches of face recognition from different perspectives. The paper describes the development stages and the related technologies of face recognition....

  15. A deep facial recognition system using computational intelligent ...

    The development of biometric applications, such as facial recognition (FR), has recently become important in smart cities.

  16. Research on Face Image Digital Processing and Recognition Based on Data

    Because face recognition is greatly affected by external environmental factors and the partial lack of face information challenges the robustness of face recognition algorithm, while the existing methods have poor robustness and low accuracy in face image recognition, this paper proposes a face image digital processing and recognition based on data dimensionality reduction algorithm.

  17. Research on Face Recognition Algorithm Based on Image Processing

    The three kernel-based face recognition algorithms, KPCA, KFDA, and null-space KFDA, have high recognition ability. When the RBF kernel function is selected and the parameter is set to σ² = 5 × 10⁶, the kernel-based face recognition algorithm achieves good recognition performance.

  18. [2212.13038] A Survey of Face Recognition

    A Survey of Face Recognition. Xinyi Wang, Jianteng Peng, Sufang Zhang, Bihui Chen, Yi Wang, Yandong Guo. Recent years witnessed the breakthrough of face recognition with deep convolutional neural networks. Dozens of papers in the field of FR are published every year. Some of them were applied in the industrial community and played an important ...

  19. Research on Face Recognition and Privacy in China—Based on Social

    In this paper, the ease of using a face recognition application refers to users' perceived ease of use of the technology. Previously, ... In this study, taking the users of face recognition as the research objects, the TAM was integrated, and variables such as privacy concerns, perceived risk, and trust were added to the model to analyze the ...

  20. A Comprehensive Review on Attendance System Using Face Recognition

    A comprehensive review on attendance systems using face recognition technology (2023). International Research Journal of Modernization in Engineering ...

  21. Research of the Correlation Between the Results of Detection ...

    The basic facial recognition system is arranged as follows: an image from some source (camera, Internet resource, ...). Shnyrev, A.A. et al. (2024). In: Silhavy, R., Silhavy, P. (eds) Data ...

  22. Research on human behaviour recognition method of sports images based

    In order to improve the efficiency of human behaviour recognition in sports images, a human behaviour recognition method based on machine learning is proposed. ... Face recognition is a non-linear problem, which increases the difficulty of recognition. ... This paper presents a tree-based least-squares twin support vector machine ...

  23. (PDF) Face Detection and Recognition Using OpenCV

    ... The aim is to select the appropriate approach to face recognition so that we can achieve a working system. T.H. Hasan et al. [9] proposed a system that can efficiently detect faces & objects in...

  24. The role of the differential outcomes procedure and schizotypy in the

    Emotional facial expression recognition is a key ability for adequate social functioning. The current study aims to test if the differential outcomes procedure (DOP) may improve the recognition of dynamic facial expressions of emotions and to further explore whether schizotypal personality traits may have any effect on performance. 183 undergraduate students completed a task where a face ...

  25. [2201.02991] A Survey on Face Recognition Systems

    Face recognition has proven to be one of the most successful technologies and has impacted heterogeneous domains. Deep learning has proven to be the most successful at computer vision tasks because of its convolution-based architecture. Since the advent of deep learning, face recognition technology has had a substantial increase in its accuracy. In this paper, some of the most impactful face ...

  26. Image-based Face Detection and Recognition: "State of the Art"

    Face recognition from images or video is a popular topic in biometrics research. Many public places have surveillance cameras for video capture, and these cameras have significant value for security purposes.

  27. Face Detection and Recognition Using Machine Learning

    In this paper, for face detection we use a HOG (Histogram of Oriented Gradients) based face detector, which gives more accurate results than other machine learning algorithms like...

  28. V-JEPA: The next step toward advanced machine intelligence

    Meta AI's announcement of V-JEPA, the non-generative video model discussed above, which learns by predicting missing or masked parts of a video in an abstract representation space rather than comparing pixels.

  29. Magic-Me: Identity-Specific Video Customized Diffusion

    arXiv:2402.09368 (cs), by Ze Ma and 8 other authors, including Zhen Dong, Kurt Keutzer, and Jiashi Feng. ... (V2V) Face VCD and Tiled VCD modules to deblur the face and upscale the video for ...