Earlier this year, Google introduced a new product: the Firebase Machine Learning Kit. ML Kit lets you use machine learning in Android and iOS applications, including on-device face detection. In this post, we will talk about how to use it to create an Android application for real-time face detection.
Face detection is just one of the computer vision capabilities that Firebase ML Kit offers (or rather, makes easier to use). It is useful in many applications: recognizing people in photos, working with selfies, adding emoji and other effects while taking a photo, taking a picture only when everyone is smiling with their eyes open, and so on.
Digital face recognition, the identification or verification of a person by their face using neural networks, is increasingly becoming part of everyday life. Smartphones have long been able to find faces in photographs, social networks offer to tag friends in photos, and cameras on the streets and in public transport pick criminals out of the crowd.
OpenCV is an open-source library for computer vision, image processing, and general-purpose numerical algorithms, with a fairly wide range of functionality. We have actively used it in our products, for example in the OCR Solutions system, a service for recognizing registration certificates, international passports, and other documents. Overall, OpenCV is an integral part of our AI and computer vision development stack. The library is implemented in C/C++, and bindings are available for Python, Java, Ruby, Matlab, Lua, and other languages.
One of the basic features that we often use is cropping. OpenCV has a fairly simple and straightforward API: if we want to cut out a part of the original image, we specify the coordinates, crop, and continue working.
The first experiments in the field of machine face recognition were presented in the 1960s by Woody Bledsoe, a professor at the University of Texas at Austin and an artificial intelligence researcher.
His working group created a database of 800 photographs of people taken from different angles. Next, the scientists marked each face with 46 coordinate points using a precursor of the modern tablet. Using a special algorithm, the system rotated the faces to different angles and zoomed in and out. In the second step, the algorithm applied Bayesian decision theory to 22 measurements so that the overall conclusion was as accurate as possible. As a result, the system developed by Bledsoe performed 100 times faster than a human.
Important stages of mobile face recognition development:
OpenCV has quite powerful functionality for detecting (locating) objects in an image. This is made possible by pre-trained models, including cascade classifiers and neural networks. In the example that we will consider below, a trained model was used to find the frontal image of a face in a photo (head rotation up to about 30 degrees is allowed). OpenCV allows you to find faces very quickly. At the same time, we do not determine who the person is; we only assume, with a certain degree of confidence, that a face is located in this particular part of the frame. How does this happen?
Faces are searched for using the "sliding window" principle (the Viola-Jones method). For each area of the image that the window passes over, Haar features are calculated. The presence or absence of an object in the window is determined by comparing the feature value with a threshold learned during training. The window slides over the entire image. After each pass over the image, the window is enlarged to find larger faces.
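The scan described above can be sketched in a few lines of plain Python. This is only an illustration, not OpenCV's implementation: the function names, the 24-pixel starting window, the 1.25 scale factor, and the `feature_fn` callback are all assumptions made for the example.

```python
from typing import Callable, Iterator, List, Tuple

Window = Tuple[int, int, int]  # (x, y, size) of a square window


def sliding_windows(width: int, height: int, min_size: int = 24,
                    scale: float = 1.25, step_ratio: float = 0.1) -> Iterator[Window]:
    """Yield square windows covering the image, growing the window after each pass."""
    size = min_size
    while size <= min(width, height):
        step = max(1, int(size * step_ratio))  # move in proportion to window size
        for y in range(0, height - size + 1, step):
            for x in range(0, width - size + 1, step):
                yield (x, y, size)
        # Enlarge the window for the next pass; always grow by at least 1 pixel.
        size = max(size + 1, int(size * scale))


def detect(width: int, height: int,
           feature_fn: Callable[[Window], float],
           threshold: float) -> List[Window]:
    """Keep the windows whose feature value exceeds the (assumed) learned threshold."""
    return [w for w in sliding_windows(width, height) if feature_fn(w) > threshold]
```

A real detector evaluates many features per window and rejects most windows early; here a single hypothetical `feature_fn` stands in for that whole cascade.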
A Haar feature consists of adjacent rectangular regions. They are positioned on the image, the pixel intensities within each region are summed, and the difference between the sums is calculated. This difference is the value of a particular feature, of a particular size, positioned in a particular way on the image. What all facial images have in common is that the area around the eyes is darker than the area around the cheeks. Therefore, a common Haar feature for faces is two adjacent rectangular regions lying over the eyes and the cheeks.
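As a rough illustration of the two-rectangle feature described above, the sketch below computes "cheeks minus eyes" on a tiny grayscale image stored as a list of rows. The integral image (summed-area table) is not mentioned in the text; it is the usual Viola-Jones trick for getting any rectangle sum in constant time, and it is included here as an assumption about how one might implement this. All function names are made up for the example.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel intensities


def integral_image(img: Image) -> Image:
    """Summed-area table with an extra zero row and column at the top and left."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: Image, x: int, y: int, w: int, h: int) -> int:
    """Sum of pixel intensities in the rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii: Image, x: int, y: int, w: int, h: int) -> int:
    """Lower half (cheeks) minus upper half (eyes); large when the top is darker."""
    half = h // 2
    top = rect_sum(ii, x, y, w, half)
    bottom = rect_sum(ii, x, y + half, w, half)
    return bottom - top


# Toy 4x4 patch: dark "eye" rows on top, bright "cheek" rows below.
patch = [[10] * 4, [10] * 4, [200] * 4, [200] * 4]
value = two_rect_feature(integral_image(patch), 0, 0, 4, 4)  # 1600 - 80 = 1520
```

A uniformly lit patch gives a value of zero, so comparing this number against a threshold is what separates face-like windows from flat background.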