Face Recognition Product Technology White Paper
Author: huifan Time: 2021-08-18
Part I application scenario classification
Classified by application scenario, face access control systems fall into three categories: the strong-cooperation class, the semi-cooperative class and the natural-passage class.
Strong-cooperation class applications
Strong-cooperation applications are found mainly in traditional near-infrared face access control, smart locks, cabinets and similar devices. They serve individuals, families and small and medium-sized enterprises, where the face library is small, equipment cost is low, and the device sometimes has to run on battery power in a low-power mode.
In strong-cooperation applications, the typical face library holds between 50 and 1,000 people. During recognition the face must be within 0.5 meters of the device, and the angle between the face and the device must be within 15 degrees, that is, almost fully frontal. Face recognition in this class has limited accuracy requirements, low equipment cost and low power consumption, and is generally suited to small-scale sites.
Face library capacity: less than 1,000 people
Face recognition distance: within 1 meter
Face detection and matching search time: <3 seconds
Face recognition angle: within 15 degrees, requiring special cooperation
Semi-cooperative class applications
The semi-cooperative class is used mainly for floor-level access control in buildings. In scenarios such as corporate floors, office doors, elevators and residential unit entrances, it is usually the enterprise or community that replaces traditional IC-card or key access control with semi-cooperative face access control.
The face library in semi-cooperative face access control is within 10,000 people. In use, the person is 0.5 to 1.5 meters from the device, and recognition tolerates some angle between the face and the device, for example within 30 degrees. Under ideal conditions the door opens as the person arrives, so they pass without stopping.
Face library capacity: less than 10,000 people
Face comparison distance: 0.5-1.5 meters
Face detection and matching search time: <1 second
Face recognition angle: within 30 degrees, semi-cooperative state
Natural-passage class applications
The natural-passage class covers applications in which people do not need to cooperate with the system at all, such as indoor and outdoor entrances and exits of campus buildings and walk-through face sign-in. Residential communities, entrances and exits of large buildings, and large performances or exhibitions usually need natural-passage face access control products. The face library in this class often reaches about 10,000 to 100,000 people, the distance between the face and the device can be about 1 to 3 meters, and the system must cope with varied indoor and outdoor lighting. People pass without stopping at all while face comparison and analysis are completed.
Face library capacity: 10,000-100,000 people
Face recognition distance: 1-3 meters
Face detection and matching search time: <0.5 seconds
Face recognition angle: within 45 degrees, consistent with the passage route, does not affect the natural passage state
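The three specification lists above can be collected into a small lookup structure. The following is a minimal sketch, not a product definition: the class and function names are hypothetical, the limits are copied from the lists (with the natural-passage library bound taken from the "about 10,000 to 100,000 people" figure in the prose), and treating each limit as inclusive is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScenarioClass:
    """One application class with the limits quoted in the text (hypothetical type)."""
    name: str
    max_library_size: int     # people
    max_distance_m: float     # metres between face and device
    max_search_time_s: float  # detection plus matching time
    max_angle_deg: float      # angle between face and device


# Ordered from the most restrictive class to the most permissive one.
SCENARIO_CLASSES = [
    ScenarioClass("strong-cooperation", 1_000, 1.0, 3.0, 15.0),
    ScenarioClass("semi-cooperative", 10_000, 1.5, 1.0, 30.0),
    ScenarioClass("natural-passage", 100_000, 3.0, 0.5, 45.0),
]


def pick_class(library_size: int, distance_m: float, angle_deg: float) -> Optional[str]:
    """Return the first class whose limits cover the requirement, or None."""
    for c in SCENARIO_CLASSES:
        if (library_size <= c.max_library_size
                and distance_m <= c.max_distance_m
                and angle_deg <= c.max_angle_deg):
            return c.name
    return None


if __name__ == "__main__":
    print(pick_class(500, 0.4, 10))     # small library, close range, frontal face
    print(pick_class(5_000, 1.2, 25))   # office floor access control
    print(pick_class(50_000, 2.5, 40))  # campus entrance
```

A requirement that exceeds every class, for example a library of one million people, returns None, which signals that none of the three access control classes applies.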
The development trend of face access control applications is a transition from strong cooperation to semi-cooperation and then to natural passage, as technology makes everyday services more and more convenient. At the same time, within the strong-cooperation class, applications are expanding from near-infrared corporate face recognition to household face door locks. Because face door locks pose a new challenge to the anti-attack capability of face recognition technology, the development of face access control still carries considerable uncertainty.
Part II product type grading
The technical grade of face access control products is divided into six grades: academic grade, entertainment grade, consumer grade, enterprise grade, industry grade, and financial grade.
Level 1: Academic grade
Academic-grade technology usually has a repeatable accuracy of 60% or less. It is used for proof-of-principle testing and theoretical innovation and does not necessarily correspond to any specific mass-production product. Some academic-grade technologies achieve high face recognition accuracy under ideal laboratory conditions but cannot reproduce those results in real environments because of factors such as lighting, lenses and computing resources. Even the theoretical results of some academic papers are difficult to reproduce consistently.
Level 2: Entertainment grade
Some specialized face recognition functions are not yet mature, such as estimating a person's age and gender from the face, or distinguishing twins or people with highly similar faces. Their recognition success rate is usually below 85%, and technologies at this level of maturity are usually used to make entertainment-type products.
Examples include face attribute analysis and entertainment or game products.
Level 3: Consumer grade
When face recognition is highly accurate for small and medium-sized face libraries (for example within a thousand people) at short to medium range (for example within half a meter), it can be used in homes, stores or small and medium-sized enterprises as a consumer-grade product. At present, the large number of face time-and-attendance machines and face access control devices based on near-infrared technology are usually consumer-grade products.
Level 4: Enterprise grade
Enterprise-grade face access control products are usually used for intelligent access in medium to large enterprises, residential communities and administrative units. The face library is usually within 10,000 people, and face comparison response time is on the order of seconds. Application scenarios include both uniform indoor lighting and the daylight or high-contrast lighting found outdoors and at building entrances and exits. Enterprise-grade face access control is currently the fastest-growing segment of the face access control market.
Level 5: Industry grade
Face comparison and public security applications in public security, public transportation, large campuses, large concerts or exhibitions often need products with industry-grade face recognition technology. The application environment is usually the multiple entrances and exits of large outdoor venues or buildings, and the equipment must be networked for distributed operation to handle intelligent passage for tens of thousands or even hundreds of thousands of people.
Level 6: Financial grade
The theoretical biometric accuracy of face recognition corresponds to an error rate of about one in ten thousand, but financial payment usually requires a system error rate of one in a million or lower. Face recognition technology that can guarantee financial security is called financial-grade technology. At present, face recognition in financial payment still faces many challenges.
To sum up:
Face recognition products fall into different levels of technical maturity, each suited to different application fields. At present, face access control is developing from consumer grade toward enterprise grade, industry grade and even financial grade. The most mature products today are enterprise-grade face access control products for face libraries of 10,000 people, and a few industry-grade products for face libraries of tens of thousands have begun to appear.
Part III main functions of the product
The main functions of face access control products are organized into several functional modules: face imaging acquisition, face image feature extraction, face library comparison and management, and alarm linkage and data analysis. Each module has different business characteristics.
Face imaging acquisition
Face images are acquired through the camera lens in different forms, such as static images, dynamic images, different positions and different expressions. When a subject is within the shooting range of the device, the device automatically searches for and captures the face image. Face imaging acquisition is affected mainly by the camera's performance indicators and by where and how the camera is installed. It typically involves the following factors.
Image size. A face image that is too small degrades recognition, and one that is too large slows it down. In practice, image size largely reflects the distance of the face from the camera.
Image resolution. The lower the image resolution, the harder recognition becomes. Image size and image resolution together directly determine the camera's recognition distance.
Lighting environment. Camera imaging needs light within a certain range; overexposure or a too-dark environment degrades face recognition. This is one of the most important factors affecting face recognition. Natural light or supplementary artificial light can usually solve the lighting problem.
Degree of blur. Movement of the face relative to the camera often produces motion blur, which degrades detection and recognition.
Degree of occlusion. Images in which the facial features are unobstructed and the face outline is clear work best. In real scenes, many faces are partly covered by hats, glasses, masks and other objects.
Acquisition angle. A frontal face is the best angle for the camera. In real scenes a frontal face is often hard to capture, so the question becomes how large an angle between the face and the camera is still suitable for acquisition.
Several parameters of the camera imaging
1. Camera imaging resolution
Cameras for portrait acquisition are usually divided into USB cameras, MIPI cameras and network cameras. USB and MIPI cameras usually transmit uncompressed video and are generally used for short-distance transmission inside a device; network cameras usually transmit compressed video and can be used for long-distance transmission between devices. The following uses a network camera as an example to describe imaging resolution.
720P refers to a resolution of 1280 × 720 pixels, that is, 1280 × 720 = 921,600 pixels. 720p or 720i is commonly treated as 1-megapixel resolution, and such a camera is usually called a 1-megapixel network camera. The default bitstream per 720P channel is 3 Mbit/s; the actual rate depends heavily on the video compression ratio.
960P refers to a resolution of 1280 × 960 pixels, that is, 1280 × 960 = 1,228,800 pixels. 960p or 960i is generally called 1.3-megapixel resolution, and such a camera is usually called a 1.3-megapixel camera. The default bitstream per 960P channel is 4 Mbit/s; the actual rate depends heavily on the video compression ratio.
1080P refers to a resolution of 1920 × 1080 pixels, that is, 1920 × 1080 = 2,073,600 pixels. 1080p or 1080i is generally called 2-megapixel resolution, and such a camera is usually called a 2-megapixel camera. The default bitstream per 1080P channel is 5 Mbit/s; the actual rate depends heavily on the video compression ratio.
4K refers to a resolution of 3840 horizontal × 2160 vertical pixels (16:9) and supports nine frame rates: 120p, 60p, 59.94p, 50p, 30p, 29.97p, 25p, 24p and 23.976p. The default bitstream per 4K channel is 8 Mbit/s or more; the actual rate depends heavily on the video compression ratio.
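The pixel counts above, and the reason network cameras must compress video, can be checked with a few lines of arithmetic. This is an illustrative sketch only: the function names are hypothetical, and the figures of 25 frames per second and 12 bits per pixel (YUV 4:2:0 sampling) are assumptions chosen for the example, not values from this paper. The "3M" default bitstream is read here as 3 Mbit/s, which is also an assumption.

```python
# Resolutions quoted in the text, as (width, height) in pixels.
RESOLUTIONS = {
    "720P": (1280, 720),
    "960P": (1280, 960),
    "1080P": (1920, 1080),
    "4K": (3840, 2160),
}


def pixel_count(width: int, height: int) -> int:
    """Total number of pixels in one frame."""
    return width * height


def raw_bitrate_mbps(width: int, height: int, fps: float, bits_per_pixel: int = 12) -> float:
    """Uncompressed bitrate in Mbit/s (assumes 12 bits/pixel, i.e. YUV 4:2:0)."""
    return width * height * bits_per_pixel * fps / 1_000_000


if __name__ == "__main__":
    for name, (w, h) in RESOLUTIONS.items():
        print(f"{name}: {pixel_count(w, h):,} pixels")

    # Assumed example: 720P at 25 fps against a 3 Mbit/s compressed stream.
    raw = raw_bitrate_mbps(1280, 720, fps=25)
    print(f"720P raw: {raw:.2f} Mbit/s, compression ratio about {raw / 3:.0f}:1")
```

Under these assumptions the uncompressed 720P stream is roughly two orders of magnitude larger than the compressed one, which is why uncompressed USB and MIPI links are kept to short distances inside a device.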
2. Light intensity
For a face to be imaged by the camera, the first condition is that it receives enough light, that is, sufficient light intensity. Light intensity is measured as illuminance, the luminous flux falling on a unit area. Its unit is the lumen per square meter (lm/m²), also called lux (lx): 1 lx = 1 lm/m². Here lm (lumen) is the unit of luminous flux, historically defined as the light radiated into a solid angle of 1 steradian by 1/60 cm² of the surface of pure platinum at its melting temperature (about 1770°C).
Face detection needs light on the face that is neither too strong nor too weak. An illuminance of 10 to 3,000 lux is usually appropriate; light that is too bright or too dark changes how the camera images the face.
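The relationship 1 lux = 1 lumen per square meter and the 10 to 3,000 lux working range can be expressed as a short check. This is a minimal sketch with hypothetical function names; it assumes the light falls evenly on the surface, which real scenes rarely satisfy.

```python
def illuminance_lux(luminous_flux_lm: float, area_m2: float) -> float:
    """Illuminance E = flux / area, assuming the flux is spread evenly over the area."""
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    return luminous_flux_lm / area_m2


def lighting_suitable(lux: float, low: float = 10.0, high: float = 3000.0) -> bool:
    """True if the illuminance lies in the 10-3000 lux range quoted in the text."""
    return low <= lux <= high


if __name__ == "__main__":
    e = illuminance_lux(500.0, 2.0)  # 500 lm spread over 2 square meters
    print(e, lighting_suitable(e))
```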
The following is a variety of environmental illumination values: (unit lux)
3. Wide dynamic range
When an image contains both high-brightness areas lit by strong light sources (daylight, lamps, reflections and so on) and relatively low-brightness areas such as shadows and backlit regions, the bright areas of the camera output turn white from overexposure and the dark areas turn black from underexposure, seriously degrading image quality. A camera's ability to render the brightest and darkest areas of the same scene is limited, and this limit is commonly referred to as "dynamic range".
Wide Dynamic Range (WDR) technology allows the camera to capture image detail under very strong contrast.
Dynamic range is the ratio of the brightest luminance signal to the darkest luminance signal that the image can distinguish. It is expressed as a multiple or in dB.
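The two ways of expressing dynamic range, as a multiple or in dB, are related by a logarithm. The sketch below assumes the 20·log10 convention commonly used for image sensor signals; the paper does not state which convention it uses, and the function names are hypothetical.

```python
import math


def ratio_to_db(ratio: float) -> float:
    """Convert a brightest-to-darkest signal ratio to dB (assumes 20*log10)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 20.0 * math.log10(ratio)


def db_to_ratio(db: float) -> float:
    """Convert a dynamic range in dB back to a signal ratio."""
    return 10.0 ** (db / 20.0)


if __name__ == "__main__":
    print(ratio_to_db(1000.0))  # a 1000:1 ratio expressed in dB
    print(db_to_ratio(120.0))   # a 120 dB range expressed as a multiple
```

Under this convention a 1000:1 ratio corresponds to 60 dB and a 120 dB range corresponds to a ratio of one million to one.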
The most common form of WDR uses multiple exposures: two frames are captured in quick succession, one with a short exposure and one with a long exposure. The short exposure captures detail in the bright areas of the scene, while the long exposure captures detail in the dark areas. The two images are then combined to show bright-area and dark-area detail at the same time. Typical wide dynamic range is 50 to 70 dB, and higher-performance wide dynamic range is 100 to 130 dB. The most advanced form (also known as third-generation wide dynamic range) is called "true WDR" and captures four frames to achieve the best results. Compared with two-frame WDR, the additional frames help resolve lighting differences between the foreground and background of the image.
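The idea of combining a short and a long exposure can be illustrated with a deliberately simplified merge. This is a sketch under stated assumptions, not how any particular camera implements WDR: pixels are plain 8-bit values in a flat list, the sensor response is assumed linear, the saturation threshold of 250 is arbitrary, and the function name is hypothetical.

```python
from typing import List, Sequence


def merge_two_exposures(short: Sequence[int], long: Sequence[int],
                        exposure_ratio: float, saturation: int = 250) -> List[float]:
    """Merge two exposures of the same scene into one linear brightness estimate.

    Where the long exposure is not saturated it is used directly, because it
    carries the cleaner dark-area detail. Where it is saturated, the short
    exposure is used instead, scaled by the exposure ratio so that both frames
    share the same brightness scale.
    """
    if len(short) != len(long):
        raise ValueError("both frames must have the same number of pixels")
    merged = []
    for s, l in zip(short, long):
        if l < saturation:
            merged.append(float(l))
        else:
            merged.append(float(s * exposure_ratio))
    return merged


if __name__ == "__main__":
    short_frame = [1, 10, 100]   # the bright pixel keeps its detail
    long_frame = [16, 160, 255]  # the bright pixel is saturated
    print(merge_two_exposures(short_frame, long_frame, exposure_ratio=16))
```

The merged values span a wider range than either 8-bit frame alone, which is the effect that multi-exposure WDR relies on.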
4. Face imaging angle
The face imaging angle is described by three angles between the face and the lens: pitch (up-down rotation), yaw (left-right rotation) and roll (in-plane rotation). Ideally, the three angles between the camera and the face are as follows.
Pitch: the camera is at the same height as the face, so the pitch angle is almost 0.
Yaw: the face is turned toward the camera, and the yaw angle is between 0 and 15 degrees.
Roll: the face is upright along the same vertical line as the camera, so there is no in-plane rotation.
In practice, the camera's installation position and the actual pose of the person often prevent imaging from approaching this ideal.
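How the installation position affects the pitch angle can be estimated with basic trigonometry. The following sketch assumes the person looks straight ahead and uses hypothetical names and example heights that do not come from this paper.

```python
import math


def pitch_angle_deg(camera_height_m: float, face_height_m: float,
                    horizontal_distance_m: float) -> float:
    """Angle between the horizontal and the line from the face to the camera.

    Positive values mean the camera looks down on the face. Assumes the
    person faces straight ahead toward the camera.
    """
    if horizontal_distance_m <= 0:
        raise ValueError("distance must be positive")
    return math.degrees(math.atan2(camera_height_m - face_height_m,
                                   horizontal_distance_m))


if __name__ == "__main__":
    # Camera mounted at face height: pitch is zero, the ideal case.
    print(pitch_angle_deg(1.7, 1.7, 1.0))
    # Camera mounted 0.5 m above the face, person 1 m away.
    print(round(pitch_angle_deg(2.2, 1.7, 1.0), 2))
```

In the second case the pitch exceeds 26 degrees, which already falls outside a 15-degree limit, so mounting height and expected standing distance need to be considered together.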
5. White balance
White balance is the process by which the camera circuitry adjusts the image so that a standard white object captured by the lens is still rendered as white under any color temperature, allowing the colors of the captured image to be reproduced accurately.
White balance applies only to color cameras. Its purpose is to make the camera image reflect the scene accurately, and it comes in two forms: manual white balance and automatic white balance.
Auto White Balance
Continuous mode: the white balance setting is adjusted continuously as the scene color temperature changes, over a range of 2800 to 6000 K. This mode is best when the scene color temperature keeps changing during shooting, giving natural color rendering, but for scenes with little or no white, continuous white balance does not produce the best color.
Button mode: first point the camera at a white target such as a white wall or white paper, then move the automatic mode switch from manual to the setting position and hold it there for a few seconds or until the image appears white. Once white balance has been executed, move the switch back to the manual position to lock the setting. The setting then stays in the camera's memory, even if the camera is powered off, until white balance is executed again. The range is 2300 to 10000 K. Setting white balance with the button is the most accurate and reliable method and suits most applications.
Manual White Balance
Turning on manual white balance turns off automatic white balance. The red or blue level of the image can then be adjusted in up to 107 steps, increasing or decreasing red or blue one step at a time. In addition, some cameras offer fixed white balance presets such as 3200 K (incandescent) and 5500 K (daylight).
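The principle behind calibrating against a white target, and behind adjusting only the red and blue levels, can be sketched as per-channel gains. This is an illustration with hypothetical function names, not a camera's actual circuit: it assumes 8-bit RGB values and keeps the green channel fixed as the reference.

```python
from typing import Tuple

RGB = Tuple[float, float, float]


def white_balance_gains(white_target_rgb: RGB) -> RGB:
    """Gains that map the measured color of a white target to neutral gray.

    Green is the reference channel; red and blue are scaled to match it.
    """
    r, g, b = white_target_rgb
    if min(r, g, b) <= 0:
        raise ValueError("white target channels must be positive")
    return (g / r, 1.0, g / b)


def apply_gains(pixel: RGB, gains: RGB) -> Tuple[int, int, int]:
    """Apply the gains to one pixel and clip to the 8-bit range."""
    r, g, b = (min(255, max(0, round(c * k))) for c, k in zip(pixel, gains))
    return (r, g, b)


if __name__ == "__main__":
    # A white wall that looks reddish under warm lighting (assumed reading).
    measured_white = (200.0, 180.0, 150.0)
    gains = white_balance_gains(measured_white)
    print(apply_gains(measured_white, gains))  # equal channels after correction
```

Locking the computed gains corresponds to storing the white balance setting, while continuous mode would recompute them as the scene changes.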
The key requirements for camera face imaging are that the face region of the video or picture has sufficient light and sufficient image size, and that the image is reproduced faithfully, without distortion and with correct white balance.
Face detection performance
The main metric for face detection in still frames and motion video is the time needed to detect faces in one image. Several metrics are typically used to measure face detection performance.
1. Input image size
Video images usually involve different video resolutions such as CIF, D1, 720P, 1080P, 2K, 4K and above.
The video resolutions usually used for face recognition are 720P and 1080P.
2. Detection speed
Detection speed is the time needed to complete face detection on an image of the specified resolution, and is expressed directly as the processing frame rate.
For 1080P video, detection usually needs to run at more than 10 frames per second, which is equivalent to completing detection on one frame every 100 ms.
3. Detection quantity
Detection quantity is the number of faces in the same video frame and depends heavily on the face capture environment. In face comparison scenarios, usually only 1 person appears in each frame. Face access control and face passage scenarios require each frame to handle 5 faces. Public transportation sites such as stations and squares usually require each frame to handle up to 30-50 faces.
4. Detection accuracy
Detection accuracy is the difference between the number of faces detected by the algorithm and the number of real faces when multiple faces appear in the frame.
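The detection speed and detection accuracy metrics above reduce to simple arithmetic. The sketch below uses hypothetical function names; the 10 frames per second default comes from the text, while the sample timings are invented for illustration.

```python
def frame_budget_ms(target_fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    if target_fps <= 0:
        raise ValueError("frame rate must be positive")
    return 1000.0 / target_fps


def meets_speed_target(per_frame_ms: float, target_fps: float = 10.0) -> bool:
    """True if one detection pass fits inside the per-frame time budget."""
    return per_frame_ms <= frame_budget_ms(target_fps)


def count_difference(detected_faces: int, real_faces: int) -> int:
    """Detected minus real face count: negative means misses, positive means false detections."""
    return detected_faces - real_faces


if __name__ == "__main__":
    print(frame_budget_ms(10))         # budget per frame at 10 fps
    print(meets_speed_target(80.0))    # an 80 ms detector keeps up
    print(meets_speed_target(120.0))   # a 120 ms detector does not
    print(count_difference(4, 5))      # one face missed in a five-face frame
```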
To sum up: face detection requirements depend largely on the device's application scenario. Embedded devices and smart hardware such as face locks or access control often need to detect only one face per frame; intelligent building access control or venue check-in often needs 3-10 faces per frame; shopping malls, stations or outdoor surveillance sometimes need 10-50 faces per frame.
Face liveness detection
1. Cooperative liveness detection
Cooperative liveness detection usually requires the subject to face the camera and perform a combination of actions such as blinking, opening the mouth, shaking the head and nodding, or to read out specified numbers or words, to ensure that a real live face is in front of the device.
2. Non-cooperative (natural) liveness detection
Non-cooperative liveness detection does not require the user to perform any action. The camera captures the face over a period of time and an algorithm determines whether it is a live person, preventing spoofing with photos, videos and other non-live media.
Liveness detection has long been a technical focus of face recognition. Both liveness detection and the spoofing techniques that attack it are developing rapidly, in a relationship like that between spear and shield.
Personnel registration management
Face access control usually has to be connected to a personnel database against which faces are compared, and the process of building it is known as personnel registration. Registration generally requires basic information such as ID information, name and the associated authorized areas and, most critically, a clear face photo that meets the quality requirements for matching. Registration can be done by users themselves or in batches by the administrator.
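A registration record of the kind described above might be modeled as follows. This is a hedged sketch: the type, field and function names are all hypothetical, and a real system would also check photo quality, which is not attempted here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class PersonRecord:
    """One registered person (all field names are illustrative)."""
    person_id: str           # ID information
    name: str
    access_areas: List[str]  # areas this person is authorized to enter
    photo_path: str          # face photo used for matching


def register_batch(records: List[PersonRecord]) -> Tuple[Dict[str, PersonRecord], List[PersonRecord]]:
    """Batch registration by an administrator.

    Returns the accepted records keyed by ID and the list of rejected ones.
    A record is rejected if it lacks an ID or a photo, or repeats an ID.
    """
    registry: Dict[str, PersonRecord] = {}
    rejected: List[PersonRecord] = []
    for record in records:
        if not record.person_id or not record.photo_path or record.person_id in registry:
            rejected.append(record)
        else:
            registry[record.person_id] = record
    return registry, rejected


if __name__ == "__main__":
    batch = [
        PersonRecord("001", "Alice", ["lobby", "floor-3"], "alice.jpg"),
        PersonRecord("002", "Bob", ["lobby"], ""),  # no photo: rejected
    ]
    accepted, rejected = register_batch(batch)
    print(sorted(accepted), len(rejected))
```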
Face feature extraction and face database management comparison
1. Face feature extraction
A key part of face recognition is to transform the face photos detected in video images into a data structure, through deep learning or other methods; this process is called face feature extraction. A face photo is typically converted into a feature vector, often 128-dimensional, and the feature vectors representing faces are then stored, managed, compared and retrieved.
Feature extraction for face recognition falls into several technical approaches.
Based on geometric features.
Uses the distances and ratios between facial landmark points as features. Recognition is fast, memory requirements are relatively small, and sensitivity to lighting is reduced.
Based on model features.
Extracts face image features according to the different probabilities of different feature states.
Based on statistical features.
Treats the face image as a random vector and uses statistical methods to distinguish face feature patterns; typical examples are eigenfaces, independent component analysis and singular value decomposition.
Based on neural network features.
Uses a large number of neural units to store and associatively recall face image features, and recognizes face images according to the probabilities of different neural unit states.
The current mainstream approach is feature extraction based on neural networks.
2. Face library comparison and management
According to the scenario, face libraries can be divided into the following capacity tiers.
Ultra-small-scale face library: within 100 people.
Mainly used in personal, family or SME settings for face door locks, face smart cabinets, SME face attendance and so on.
Small-scale face library: within 2,000 people.
Mainly used for face access control in small enterprises or community unit buildings.
Medium-scale face library: within 20,000 people.
Mainly used in medium-sized enterprises, communities or venues for face access control, face check-in and other applications.
Large-scale face library: within 50,000 people.
Used for face access control or large-scale events in large enterprises, campuses and communities.
Ultra-large-scale face library: more than 50,000 people.
Mainly used for public security control or larger-scale face matching scenarios.
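Comparing a captured face against a face library amounts to finding the stored feature vector most similar to the probe. The sketch below uses cosine similarity with a linear scan; the similarity measure, the 0.6 threshold and all names are assumptions for illustration, and 3-dimensional vectors stand in for real feature vectors to keep the example readable.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_library(probe: Sequence[float], library: Dict[str, Sequence[float]],
                   threshold: float = 0.6) -> Tuple[Optional[str], float]:
    """Return the best-matching person ID and score, or (None, score) below the threshold."""
    best_id: Optional[str] = None
    best_score = -1.0
    for person_id, features in library.items():
        score = cosine_similarity(probe, features)
        if score > best_score:
            best_id, best_score = person_id, score
    if best_score < threshold:
        return None, best_score
    return best_id, best_score


if __name__ == "__main__":
    library = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
    print(search_library([0.9, 0.1, 0.0], library))  # close to alice
    print(search_library([0.0, 0.0, 1.0], library))  # matches nobody
```

The linear scan costs time proportional to the library size, which is one reason the search time requirement becomes harder to meet as the library grows from hundreds to tens of thousands of people.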
Alarm linkage and data analysis
In face access control applications, face detection and face matching are usually followed by linkage processing, such as opening a door or raising an alarm. By response speed, linkage is generally divided into the following types.
1. Real-time linkage
When face detection, face comparison and linkage of the result can be completed at 5-10 frames per second or more, people usually do not perceive any delay in face recognition. This is called real-time linkage.
Real-time linkage usually requires the overall processing time to be completed within 0.2 seconds.
Real-time linkage is usually used in scenarios such as natural face passage or venue check-in.
2. Second-level linkage
In second-level linkage the whole process, from the person approaching the device to the completion of processing, takes about 1 second; people notice the delay but find it acceptable.
Second-level linkage is the main requirement for face access control and is used in campuses, buildings and office access control.
3. Delayed linkage
If the time from the appearance of the face to the completion of linkage exceeds 1 second, it is usually called delayed linkage. The acceptable delay depends on the application scenario. For example, member analysis statistics in retail malls or face check-in in classrooms often only need a response at the minute level.
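The three linkage types can be summarized as thresholds on the total processing time. The sketch below takes the 0.2 second and 1 second boundaries from the text; the function name is hypothetical, and treating "about 1 second" as an inclusive upper bound is a simplifying assumption.

```python
def linkage_type(total_latency_s: float) -> str:
    """Classify linkage by end-to-end time from face appearance to linkage result."""
    if total_latency_s < 0:
        raise ValueError("latency cannot be negative")
    if total_latency_s <= 0.2:
        return "real-time"
    if total_latency_s <= 1.0:
        return "second-level"
    return "delayed"


if __name__ == "__main__":
    for latency in (0.15, 0.8, 60.0):
        print(latency, linkage_type(latency))
```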
In summary: the main functions of face access control are face detection and acquisition, personnel registration, face library management, and alarm linkage and data analysis. How these functions are implemented depends on the application scenario and product form. Generally speaking, the more effective, convenient and cost-effective these functions are, the greater the user satisfaction.
Part IV face access control product form
When the complete face access control function requires more than one device working together, the result is called a combined face access control system. A typical combined system involves the following components.
Components of face access control
The camera collects video, the host computer runs the software for face recognition, face library management and registration services, and the client handles user interaction. The devices are connected to each other over a network.
A typical combined face access control system consists of surveillance cameras and host computers, and is often used to manage large face libraries. Its advantage is relatively high performance; its disadvantage is that the network deployment is more complex, which lowers system reliability or increases the maintenance workload.
Integrated face access control equipment performs face recognition and comparison within a single device, without additional equipment. It is commonly divided into integrated devices with a screen and integrated devices without a screen. At present, integrated devices are the fastest-growing type of face access control equipment.
1. Integrated face access control equipment with screen
The device directly integrates the camera, screen, computing motherboard, linkage alarm and so on, and completes face acquisition, face registration, library comparison, access control linkage and other functions in a single unit.
2. Integrated face access control equipment without screen
The device directly integrates the camera, computing motherboard, linkage alarm and so on, and completes face acquisition, face registration, library comparison, access control linkage and other functions in a single unit. It has no screen output and signals access control results through sound and light.
Networking type equipment
1. Local area networking
When multiple access control devices are managed over a local area network, with unified user registration, face library management and permission management, the arrangement is called local area networking, or enterprise intranet networking.
2. Internet cloud networking
When access control devices are managed in a distributed way over the Internet, with clients managing them remotely from a mobile phone or computer, the arrangement is called an Internet cloud-networked access control system.
In summary: face access control products come in two forms, combined products and integrated products, and integrated products are the mainstream. Integrated products usually include a screen, and camera-type face access control products also have considerable room for growth. Face access control networking is divided into enterprise LAN products and Internet-based cloud networking products. With LAN networking, the enterprise usually sets up its own management server and a unified, centralized management client to manage the devices. Internet-based cloud networking manages devices through a server on the Internet, and both traditional computers and mobile phone clients can be used for management.