Face Detection using Haar Cascades {#tutorial_js_face_detection}
==================================

Goal
----

-   learn the basics of face detection using Haar Feature-based Cascade Classifiers
-   extend the same for eye detection etc.

Basics
------

Object detection using Haar feature-based cascade classifiers is an effective method proposed by
Paul Viola and Michael Jones in their 2001 paper, "Rapid Object Detection using a
Boosted Cascade of Simple Features". It is a machine learning based approach in which a cascade
function is trained from a lot of positive and negative images. It is then used to detect objects in
other images.

Here we will work with face detection. Initially, the algorithm needs a lot of positive images
(images of faces) and negative images (images without faces) to train the classifier. Then we need
to extract features from them. For this, the Haar features shown in the image below are used. They are just
like a convolutional kernel. Each feature is a single value obtained by subtracting the sum of pixels
under the white rectangle from the sum of pixels under the black rectangle.
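
The feature value described above can be sketched in plain JavaScript. This is only an illustration, not OpenCV code: the image is assumed to be a 2D array of grayscale values, the helper names `rectSum` and `twoRectFeature` are made up for this example, and the left half of the feature is assumed to be the white rectangle.

```javascript
// Sum of pixel values in the w x h rectangle whose top-left corner is (x, y).
function rectSum(img, x, y, w, h) {
    let sum = 0;
    for (let row = y; row < y + h; row++) {
        for (let col = x; col < x + w; col++) {
            sum += img[row][col];
        }
    }
    return sum;
}

// Two-rectangle (edge) feature: left half white, right half black (assumed layout).
// w must be even so that both halves have the same width.
function twoRectFeature(img, x, y, w, h) {
    const half = w / 2;
    const white = rectSum(img, x, y, half, h);
    const black = rectSum(img, x + half, y, half, h);
    return black - white;
}

// A dark-to-bright vertical edge gives a large response.
const patch = [
    [10, 10, 200, 200],
    [10, 10, 200, 200]
];
console.log(twoRectFeature(patch, 0, 0, 4, 2)); // 760
```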

![image](images/haar_features.jpg)

Now all possible sizes and locations of each kernel are used to calculate plenty of features. For each
feature calculation, we need to find the sum of the pixels under the white and black rectangles. To speed this up,
they introduced the integral image. It reduces the calculation of the sum of the pixels in a rectangle, however large
the number of pixels, to an operation involving just four values of the integral image.
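
A minimal sketch of the integral image idea in plain JavaScript follows. The function names `integralImage` and `rectSumFromIntegral` are hypothetical, the image is assumed to be a 2D array of grayscale values, and the extra row and column of zeros is a convenience chosen for this example.

```javascript
// Build an integral image with one extra row and column of zeros, so that
// ii[y][x] holds the sum of all pixels above and to the left of (x, y).
function integralImage(img) {
    const rows = img.length;
    const cols = img[0].length;
    const ii = Array.from({ length: rows + 1 }, () => new Array(cols + 1).fill(0));
    for (let y = 0; y < rows; y++) {
        for (let x = 0; x < cols; x++) {
            ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x];
        }
    }
    return ii;
}

// Sum of the w x h rectangle with top-left corner (x, y): four lookups,
// regardless of how many pixels the rectangle covers.
function rectSumFromIntegral(ii, x, y, w, h) {
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x];
}

const image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
];
const ii = integralImage(image);
console.log(rectSumFromIntegral(ii, 1, 1, 2, 2)); // 28 (5 + 6 + 8 + 9)
console.log(rectSumFromIntegral(ii, 0, 0, 3, 3)); // 45 (whole image)
```

The integral image is built once per image; after that, every rectangle sum needed by any feature costs the same four lookups.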

But among all these features we calculated, most are irrelevant. For example, consider the
image below. The top row shows two good features. The first feature selected seems to focus on the
property that the region of the eyes is often darker than the region of the nose and cheeks. The
second feature selected relies on the property that the eyes are darker than the bridge of the nose.
But the same windows applied to the cheeks or any other place are irrelevant. So how do we select the
best features out of 160000+ features? It is achieved by **AdaBoost**.

![image](images/haar.png)
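
As a rough check on the 160000+ figure, the sketch below counts every position and size of a feature inside a detection window. It rests on assumptions that are not stated in this section: a 24x24 base window and five base shapes (two two-rectangle, two three-rectangle and one four-rectangle feature), scaled only by whole multiples of their base size. The function name `countFeatures` is made up for this example.

```javascript
// Count all placements of each base shape, at every integer scale,
// inside a square window of the given size.
function countFeatures(windowSize, baseShapes) {
    let total = 0;
    for (const [baseW, baseH] of baseShapes) {
        for (let w = baseW; w <= windowSize; w += baseW) {
            for (let h = baseH; h <= windowSize; h += baseH) {
                // Number of top-left corners where a w x h feature still fits.
                total += (windowSize - w + 1) * (windowSize - h + 1);
            }
        }
    }
    return total;
}

// Base shapes as [width, height] in cells (assumed set of five).
const shapes = [[2, 1], [1, 2], [3, 1], [1, 3], [2, 2]];
console.log(countFeatures(24, shapes)); // 162336
```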

To select them, we apply each and every feature to all the training images. For each feature, the algorithm finds the
best threshold for classifying the images as faces (positive) or non-faces (negative). Obviously, there will be
errors or misclassifications. We select the features with the minimum error rate, which means they are
the features that best classify the face and non-face images. (The process is not as simple as
this. Each image is given an equal weight in the beginning. After each classification, the weights of
misclassified images are increased. Then the same process is repeated: new error rates and new weights
are calculated. The process continues until the required accuracy or error rate is achieved, or the
required number of features is found.)
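
One round of this selection can be sketched for a single feature as follows. This uses the textbook discrete AdaBoost weight update, which may differ in detail from the rule in the paper; the names `bestStump` and `reweight` and the toy data are assumptions made for this example.

```javascript
// For ONE feature, find the threshold and polarity with the lowest weighted error.
// values[i] is the feature value of sample i; labels[i] is 1 (face) or 0 (non-face).
function bestStump(values, labels, weights) {
    let best = { threshold: 0, polarity: 1, error: Infinity };
    for (const threshold of values) {
        for (const polarity of [1, -1]) {
            let error = 0;
            for (let i = 0; i < values.length; i++) {
                const predicted = polarity * values[i] < polarity * threshold ? 1 : 0;
                if (predicted !== labels[i]) error += weights[i];
            }
            if (error < best.error) best = { threshold, polarity, error };
        }
    }
    return best;
}

// Increase the weights of misclassified samples, decrease the others, then normalize.
function reweight(values, labels, weights, stump) {
    const eps = Math.min(Math.max(stump.error, 1e-10), 1 - 1e-10);
    const alpha = 0.5 * Math.log((1 - eps) / eps); // vote strength of this weak classifier
    const updated = weights.map((w, i) => {
        const predicted = stump.polarity * values[i] < stump.polarity * stump.threshold ? 1 : 0;
        return w * Math.exp(predicted === labels[i] ? -alpha : alpha);
    });
    const total = updated.reduce((a, b) => a + b, 0);
    return { alpha, weights: updated.map((w) => w / total) };
}

// Toy data: six samples, equal initial weights.
const values = [1, 2, 3, 4, 5, 6];
const labels = [1, 1, 0, 1, 0, 0];
const initial = values.map(() => 1 / values.length);
const stump = bestStump(values, labels, initial);
const round = reweight(values, labels, initial, stump);
console.log(stump.threshold, stump.polarity); // 3 1
console.log(round.weights); // the one misclassified sample (index 3) now carries about half the weight
```

In a full training loop this search would run over every feature in each round, and the feature with the lowest weighted error would be kept.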

The final classifier is a weighted sum of these weak classifiers. Each is called weak because it alone
can't classify the image, but together with others it forms a strong classifier. The paper says even
200 features provide detection with 95% accuracy. Their final setup had around 6000 features.
(Imagine a reduction from 160000+ features to 6000 features. That is a big gain.)
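
The weighted sum can be sketched as a weighted vote. The function name `strongClassify`, the object layout of a weak classifier, and the decision rule (accept when the weighted votes reach half of the total weight) are assumptions for this example.

```javascript
// Each weak classifier looks at one feature value and votes 1 (face) or 0 (non-face).
// `alpha` is the weight of its vote.
function strongClassify(featureValues, weakClassifiers) {
    let score = 0;
    let totalAlpha = 0;
    for (const weak of weakClassifiers) {
        const value = featureValues[weak.feature];
        const vote = weak.polarity * value < weak.polarity * weak.threshold ? 1 : 0;
        score += weak.alpha * vote;
        totalAlpha += weak.alpha;
    }
    // Assumed decision rule: at least half of the total vote weight says "face".
    return score >= 0.5 * totalAlpha;
}

const weakClassifiers = [
    { feature: 0, threshold: 10, polarity: 1, alpha: 2.0 },
    { feature: 1, threshold: 5, polarity: 1, alpha: 1.0 },
    { feature: 2, threshold: 0, polarity: -1, alpha: 0.5 }
];
console.log(strongClassify([3, 9, -1], weakClassifiers));  // true  (score 2.0 of 3.5)
console.log(strongClassify([20, 1, 4], weakClassifiers));  // false (score 1.5 of 3.5)
```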

So now you take an image. Take each 24x24 window. Apply 6000 features to it. Check whether it is a face or
not. Isn't that a little inefficient and time consuming? Yes, it is. The authors have a good
solution for that.

In an image, most of the area is non-face region. So it is a better idea to have a simple
method to check whether a window is not a face region. If it is not, discard it in a single shot and don't
process it again. Instead, focus on regions where there can be a face. This way, we can spend more time
checking possible face regions.

For this they introduced the concept of a **Cascade of Classifiers**. Instead of applying all 6000
features to a window, the features are grouped into different stages of classifiers and applied one by one.
(Normally the first few stages contain very few features.) If a window fails the first
stage, discard it. We don't consider the remaining features on it. If it passes, apply the second stage
of features and continue the process. A window which passes all stages is a face region. How is
that for a plan!

The authors' detector had 6000+ features in 38 stages, with 1, 10, 25, 25 and 50 features in the first five
stages. (The two features in the above image were actually obtained as the best two features from
AdaBoost.) According to the authors, on average 10 features out of 6000+ are evaluated per
sub-window.
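
The early-rejection logic of a cascade can be sketched as follows. The function name `runCascade`, the stage layout, and the toy weak classifiers are assumptions for this example; a real cascade evaluates Haar features rather than the made-up `mean` and `contrast` properties used here.

```javascript
// Run the stages in order and stop at the first stage the window fails.
// Returns whether the window passed and how many weak classifiers were evaluated.
function runCascade(candidate, stages) {
    let evaluated = 0;
    for (const stage of stages) {
        let score = 0;
        for (const weak of stage.weakClassifiers) {
            score += weak(candidate);
            evaluated++;
        }
        if (score < stage.threshold) {
            return { isFace: false, evaluated }; // rejected early, later stages are skipped
        }
    }
    return { isFace: true, evaluated };
}

// Toy cascade: a one-feature first stage, then a two-feature second stage.
const stages = [
    { threshold: 1, weakClassifiers: [(c) => (c.mean > 50 ? 1 : 0)] },
    {
        threshold: 2,
        weakClassifiers: [(c) => (c.contrast > 10 ? 1 : 0), (c) => (c.mean < 200 ? 1 : 0)]
    }
];
console.log(runCascade({ mean: 20, contrast: 50 }, stages));  // { isFace: false, evaluated: 1 }
console.log(runCascade({ mean: 100, contrast: 50 }, stages)); // { isFace: true, evaluated: 3 }
```

Because most windows fail in the first stages, the average number of features evaluated per window stays far below the total.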

So this is a simple intuitive explanation of how Viola-Jones face detection works. Read the paper for
more details.

Haar-cascade Detection in OpenCV
--------------------------------

Here we will deal with detection. OpenCV already contains many pre-trained classifiers for face,
eyes, smile etc. Those XML files are stored in the opencv/data/haarcascades/ folder. Let's create a face
and eye detector with OpenCV.

We use the function: **detectMultiScale (image, objects, scaleFactor = 1.1, minNeighbors = 3, flags = 0, minSize = new cv.Size(0, 0), maxSize = new cv.Size(0, 0))**

@param image               matrix of the type CV_8U containing an image where objects are detected.
@param objects             vector of rectangles where each rectangle contains the detected object. The rectangles may be partially outside the original image.
@param scaleFactor         parameter specifying how much the image size is reduced at each image scale.
@param minNeighbors        parameter specifying how many neighbors each candidate rectangle should have to retain it.
@param flags               parameter with the same meaning for an old cascade as in the function cvHaarDetectObjects. It is not used for a new cascade.
@param minSize             minimum possible object size. Objects smaller than this are ignored.
@param maxSize             maximum possible object size. Objects larger than this are ignored. If maxSize == minSize model is evaluated on single scale.

@note Don't forget to delete CascadeClassifier and RectVector!
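
A minimal sketch of a helper that wraps this call and takes care of deleting the RectVector is given below. The helper name `detectFaces` is hypothetical; the OpenCV.js module is passed in as `cv` and the classifier is assumed to be an already loaded `cv.CascadeClassifier`, so the helper itself contains no page-specific setup. The cascade file name in the usage comment is an assumption about what has been made available to OpenCV.js.

```javascript
// Run detectMultiScale with the default parameter values listed above and
// return plain {x, y, width, height} objects, so that the RectVector can be
// deleted before returning.
function detectFaces(cv, gray, classifier) {
    const faces = new cv.RectVector();
    const noLimit = new cv.Size(0, 0); // used for both minSize and maxSize
    const found = [];
    try {
        classifier.detectMultiScale(gray, faces, 1.1, 3, 0, noLimit, noLimit);
        for (let i = 0; i < faces.size(); ++i) {
            const rect = faces.get(i);
            found.push({ x: rect.x, y: rect.y, width: rect.width, height: rect.height });
        }
    } finally {
        faces.delete(); // RectVector lives on the WebAssembly heap
    }
    return found;
}

// Usage in a page where OpenCV.js is loaded as `cv` (not run here):
//   const classifier = new cv.CascadeClassifier();
//   classifier.load('haarcascade_frontalface_default.xml'); // assumed file name
//   const faces = detectFaces(cv, gray, classifier);        // gray: CV_8U cv.Mat
//   classifier.delete();                                    // the caller owns the classifier
```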

Try it
------

Try the demo below. Canvas elements named haarCascadeDetectionCanvasInput and haarCascadeDetectionCanvasOutput have been prepared. Choose an image and
click `Try it` to see the result. You can change the code in the textbox to investigate more.

\htmlonly
<iframe src="../../js_face_detection.html" width="100%"
        onload="this.style.height=this.contentDocument.body.scrollHeight +'px';">
</iframe>
\endhtmlonly