Mobile face recognition technology and its ways of implementation

Earlier this year, Google introduced a new product: the Firebase Machine Learning Kit. ML Kit lets you use machine learning in Android and iOS applications, including for mobile face recognition. In this post, we will talk about how to use it to create an Android application for real-time face detection.

Face detection is just one of the computer vision capabilities that Firebase ML Kit offers (or rather, makes easier to use). It can be useful in many applications: recognizing people in photos, working with selfies, adding emoji and other effects while taking a photo, taking pictures only when everyone is smiling with their eyes open, and so on. The possibilities are endless.

Physiognomy in a scientific way

Digital face recognition, the identification or verification of a person by their face using neural networks, is increasingly becoming part of everyday life. Smartphones have long been able to find faces in photographs, social networks offer to tag friends in photos, and cameras on the streets and in public transport pick criminals out of the crowd.

OpenCV is an open-source library for computer vision, image processing, and numerical algorithms with a fairly wide range of functionality. We have actively used it in our products, for example in the OCR Solutions system, a service for recognizing registration certificates, international passports, and other documents. Overall, OpenCV is an integral part of our AI and computer vision development stack. The library is implemented in C/C++ and supports Python, Java, Ruby, Matlab, Lua, and other languages.

Possibilities of OpenCV in working with mobile face recognition

Cropping is one of the basic features that we often use. OpenCV offers a fairly simple and straightforward API for it: if we want to cut out a part of the original image, we specify the coordinates, crop, and continue working.

The first experiments in the field of machine face recognition were presented in the 1960s by Woody Bledsoe, a professor at the University of Texas at Austin and an artificial intelligence researcher.

His working group created a database of 800 pictures of people taken from different angles for a biometric photo recognition system. Next, the scientists marked the faces with 46 coordinate points using a prototype of a modern tablet. Using a special algorithm, the system rotated the faces to different angles and zoomed in and out. In the second step, the algorithm used 22 measurements and applied Bayesian decision theory so that the overall conclusion was as accurate as possible. As a result, the system developed by Bledsoe performed 100 times faster than a human.

Important stages in the development of face recognition:

In 1988, Michael Kirby and Lawrence Sirovich of Brown University applied the Eigenface approach using linear algebra to analyze images. They used fewer than 100 different values to mark up faces.
In 1991, Alex Pentland and Matthew Turk of MIT refined the Eigenfaces technology by harnessing environmental factors. They managed to automate the recognition process.
In the late 1990s, the Defense Advanced Research Projects Agency (DARPA) and the National Institute of Standards and Technology released the FERET program with the largest database of faces — more than 14 thousand images. It was originally used to find and recognize criminals around the world, but then it was released to the public.
In 2010, Facebook started using facial recognition to find users in posted photos and to suggest tagging them.

OpenCV has quite powerful functionality for detecting objects in an image. This is made possible by pre-trained models, which include classic cascade classifiers as well as neural networks. In the example that we will consider below, a trained model was used to find a frontal view of a face in a photo (head rotation of up to about 30 degrees is allowed). OpenCV finds faces very quickly. Note that we do not establish the person’s identity; we can only assume, with a certain degree of confidence, that a face is located in a particular part of the frame. How does this happen?
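Before looking at the mechanism, here is a rough sketch of what calling such a pre-trained frontal-face detector can look like from Python. It assumes the opencv-python package, which bundles the haarcascade_frontalface_default.xml model; the function names detect_faces and largest_box, as well as the parameter values passed to detectMultiScale, are our own illustrative choices rather than settings taken from a specific project.

```python
def largest_box(boxes):
    """Return the (x, y, w, h) box with the largest area, or None if there are none."""
    boxes = [tuple(int(v) for v in box) for box in boxes]
    if not boxes:
        return None
    return max(boxes, key=lambda b: b[2] * b[3])


def detect_faces(image_path):
    """Return a list of (x, y, w, h) boxes for frontal faces found in the image."""
    # Imported lazily so that largest_box() stays usable without OpenCV installed.
    import cv2

    # Assumption: the frontal-face Haar cascade shipped with opencv-python.
    cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    detector = cv2.CascadeClassifier(cascade_path)
    if detector.empty():
        raise RuntimeError("could not load cascade: " + cascade_path)

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # the cascade works on grayscale

    # Illustrative parameter values; tune them for your own images.
    boxes = detector.detectMultiScale(
        gray, scaleFactor=1.1, minNeighbors=5, minSize=(30, 30)
    )
    return [tuple(int(v) for v in box) for box in boxes]
```

Each returned box only says where a face probably is, not whose face it is. With the coordinates in hand, the region can be cropped by plain array slicing, for example image[y:y + h, x:x + w].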

Several words as a conclusion

Faces are searched for using the “sliding window” principle (the Viola-Jones method). For each area of the image that the window passes over, Haar features are calculated. Whether an object is present in the window is determined by comparing the feature value against a threshold learned during training. The window slides over the entire image. After each pass over the image, the window is enlarged in order to find larger faces.
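The scan itself can be sketched in a few lines of plain Python. This is a simplified illustration, not OpenCV’s implementation: the names sliding_windows and scan, the starting size of 24 pixels, the growth factor, and the step are all assumptions made for the example, and classify stands in for the trained check that compares feature values against their thresholds.

```python
def sliding_windows(img_w, img_h, min_size=24, scale=1.25, step_frac=0.5):
    """Yield (x, y, size) for square windows; the window grows after each full pass."""
    size = min_size
    while size <= min(img_w, img_h):
        step = max(1, int(size * step_frac))
        # One full pass over the image with the current window size.
        for y in range(0, img_h - size + 1, step):
            for x in range(0, img_w - size + 1, step):
                yield x, y, size
        # Enlarge the window for the next pass (always by at least one pixel).
        size = max(size + 1, int(size * scale))


def scan(img_w, img_h, classify, **kwargs):
    """Collect every window for which the hypothetical classify(x, y, size) is true."""
    return [w for w in sliding_windows(img_w, img_h, **kwargs) if classify(*w)]
```

For a 48 by 48 image with min_size=24, scale=2.0 and step_frac=0.5, this yields nine 24-pixel windows followed by a single 48-pixel window.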

A Haar feature consists of adjacent rectangular regions. They are placed on the image, the pixel intensities within each region are summed, and the difference between the sums is calculated. This difference is the value of a feature of a given size at a given position in the image. What all facial images have in common is that the area around the eyes is darker than the area around the cheeks. A common Haar feature for faces is therefore two adjacent rectangular regions covering the eyes and the cheeks.
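A two-rectangle feature of this kind can be computed as follows. The sketch uses an integral image (summed-area table), the standard Viola-Jones trick for getting the sum of any rectangle from four lookups; the text above does not mention it, so treat it as an added assumption, along with the helper names and the choice to subtract the upper (eye) sum from the lower (cheek) sum.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column.

    ii[y][x] holds the sum of all pixels above and to the left of (x, y).
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixel intensities in the w-by-h rectangle whose top-left corner is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def eyes_cheeks_feature(ii, x, y, w, h):
    """Two stacked w-by-h rectangles: lower (cheeks) sum minus upper (eyes) sum."""
    upper = rect_sum(ii, x, y, w, h)
    lower = rect_sum(ii, x, y + h, w, h)
    return lower - upper


# Toy 4x4 "face": a dark band (eyes) above a bright band (cheeks).
toy = [[10] * 4, [10] * 4, [200] * 4, [200] * 4]
toy_value = eyes_cheeks_feature(integral_image(toy), 0, 0, 4, 2)  # 1600 - 80 = 1520
```

A large positive value means the upper rectangle is much darker than the lower one, which is the pattern expected for eyes above cheeks; on a uniformly lit patch the value is 0.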