Digikam/GSoC2019/AIFaceRecognition: Difference between revisions

From KDE Community Wiki

Revision as of 10:15, 10 March 2019

Introduction

Hello reader,
This article describes the current state of the face recognition algorithms of digiKam and the desired outcome of the corresponding GSoC project.
It is recommended to read Faces Management workflow improvements first, as it describes the entire face management workflow. It helps to understand the scope of these algorithms and where their structure and their interfaces with other code modules need clarification.

Currently, four different methods are implemented, each using its own algorithm, and all are more or less operational. The algorithm to use can be chosen in the Face Scan dialogue.
The goal is to automatically recognize faces in images that are not yet tagged, using face tags previously registered in the face recognition database. The algorithms are complex and are explained in more detail below.

currently implemented face recognition algorithms

  1. Deep Neural Network (DNN) DLib
    This is an experimental implementation of a neural network to perform face recognition.
    This DNN is based on the DLib code, a low-level library used by the OpenFace project. The code works, but it is slow and complex to maintain. It is a proof of concept rather than something meant for production use.
    Moreover, the source code contains no documentation. The DLib code is mostly the machine learning core of the Dlib C++ Library (http://dlib.net/ml.html), which is referenced by the projects in the Dlib users list on SourceForge (https://sourceforge.net/p/dclib/wiki/Known_users).

  2. OpenCV - Local Binary Patterns Histograms (LBPH)
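    This is the most complete implementation of a face recognition algorithm. Moreover, it is the oldest implementation of such an algorithm in digiKam. It is not perfect and requires at least six faces already tagged manually by the user to identify the same faces in non-tagged images.
    This algorithm records a histogram of the face in the database, which is used later to perform the comparisons against new/non-tagged faces. This one uses the OpenCV backend (https://docs.opencv.org/2.4/modules/contrib/doc/facerec/facerec_tutorial.html#local-binary-patterns-histograms). For background on the algorithm, see Wikipedia (https://en.wikipedia.org/wiki/Local_binary_patterns), Scholarpedia (http://www.scholarpedia.org/article/Local_Binary_Patterns), and Towards Data Science - Face Recognition: Understanding LBPH Algorithm (https://towardsdatascience.com/face-recognition-how-lbph-works-90ec258c3d6b).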

  3. OpenCV - Eigen Faces
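    An alternative algorithm that uses the OpenCV backend (https://docs.opencv.org/2.4/modules/contrib/doc/facerec/facerec_tutorial.html#eigenfaces; see also https://en.wikipedia.org/wiki/Eigenface). It was introduced to have a different source of results for face recognition, against which the DNN approaches can be checked.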

  4. OpenCV - Fisher Face
    Another algorithm that uses the OpenCV backend. It was introduced for the same purposes as Eigen Faces.
    Reportedly, this one is not finalized: not all methods are said to be implemented.
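
The histogram comparison described for LBPH can be sketched in a few lines. This is only an illustration under stated assumptions, not digiKam's or OpenCV's code: it uses one global histogram per face instead of the grid of per-cell histograms that LBPH normally concatenates, it compares histograms with a chi-square distance, and all function names (`lbp_code`, `lbp_histogram`, `chi_square`, `closest_identity`) are hypothetical. The `min_faces=6` default mirrors the six manually tagged faces mentioned above.

```python
from collections import Counter


def lbp_code(img, r, c):
    """8-bit LBP code of pixel (r, c): one bit per neighbour that is >= the centre."""
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for dr, dc in offsets:
        code = (code << 1) | (1 if img[r + dr][c + dc] >= center else 0)
    return code


def lbp_histogram(img):
    """Normalised 256-bin histogram of LBP codes over all interior pixels.

    Assumes a grayscale image of at least 3x3 pixels, given as a list of rows.
    """
    rows, cols = len(img), len(img[0])
    counts = Counter(
        lbp_code(img, r, c) for r in range(1, rows - 1) for c in range(1, cols - 1)
    )
    total = sum(counts.values())
    return [counts.get(b, 0) / total for b in range(256)]


def chi_square(h1, h2):
    """Chi-square distance between two histograms; 0.0 means identical."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def closest_identity(query_hist, stored, min_faces=6):
    """Return the name whose stored histograms are closest to the query.

    Identities with fewer than min_faces tagged samples are skipped (assumption).
    """
    best_name, best_dist = None, float("inf")
    for name, hists in stored.items():
        if len(hists) < min_faces:
            continue
        dist = min(chi_square(query_hist, h) for h in hists)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name
```

Here `stored` maps a person's name to the list of histograms recorded for that person, which is one plausible reading of "records a histogram of the face in the database".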


There is a paper explaining the difference between Fisher and Eigen Faces, see Eigenfaces and Fisherfaces - Presenter: Harry Chao - Multimedia Analysis and Indexing Course 2010 (PDF): http://disp.ee.ntu.edu.tw/~pujols/Eigenfaces%20and%20Fisherfaces.pdf

why so many different approaches?

Four different algorithms were implemented simply to make a comprehensive assessment of the currently available technologies applicable in digiKam possible, and eventually to choose the best one.
The student who worked on the DNN project a few years ago concluded that DNN was the best method to recognize faces with as low an error rate as possible. Unfortunately, the training and recognition process took too long and slowed down the application.
Despite that drawback, it is agreed that DNN is the best way to go, but the current implementation based on DLib shall not be used.

previous work

  1. DNN
All code was introduced by a student, Yingjie Liu, in a previous GSoC project.
Liu's papers:

code

All the low-level steps to train and recognize faces are done in this class. The mid-level code, multi-threaded and chained, which is started by the Face Scan dialogue, is all here: https://cgit.kde.org/digikam.git/tree/core/utilities/facemanagement?h=development/dplugins

database

What kind of information is stored in the database? This depends on the recognition algorithm used: histograms, vectors, or binary data, each needed for that algorithm's computation and, of course, none compatible with the others. Typically, when you change the recognition algorithm in the Face Scan dialogue, the database must be cleared as well. In fact, this kind of database mechanism must be dropped once the DNN algorithm is finalized, and only that algorithm retained to do the job. As said previously, four algorithms are implemented in order to choose the best one. In the end, only one must remain in the digiKam face engine, and all the code must be simplified.
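
The constraint that training data from one algorithm cannot be reused by another can be illustrated with a small in-memory stand-in. This is a sketch only: `RecognitionStore` and its methods are hypothetical names, and the real digiKam database schema is not shown on this page.

```python
class RecognitionStore:
    """Hypothetical in-memory stand-in for the face recognition database."""

    def __init__(self, algorithm):
        self.algorithm = algorithm
        # Identity name -> list of algorithm-specific training data
        # (histograms, vectors, binary blobs, ...).
        self._entries = {}

    def add(self, identity, data):
        """Record one piece of training data for an identity."""
        self._entries.setdefault(identity, []).append(data)

    def count(self, identity):
        """Number of training entries stored for an identity."""
        return len(self._entries.get(identity, []))

    def switch_algorithm(self, algorithm):
        """Change the algorithm; data written by another algorithm is discarded."""
        if algorithm != self.algorithm:
            self._entries.clear()
            self.algorithm = algorithm
```

With a single retained algorithm, `switch_algorithm` and the clearing step would disappear entirely, which is the simplification the paragraph above asks for.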

Expected results of this GSoC 2019 project

With the 3.x versions, OpenCV has introduced a DNN API. It shall be used instead of the other approaches, as is already done for face detection.
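
This page does not specify how recognition would be built on top of the OpenCV DNN API. One common arrangement, assumed here purely for illustration, is that the network maps each face to a fixed-length vector (an embedding) and that a new face is assigned to the known person with the most similar vector. The sketch below covers only that matching step in plain Python; it does not call OpenCV, and `cosine_similarity`, `recognize`, and the `threshold` value are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recognize(embedding, known, threshold=0.8):
    """Return the best-matching name, or None if no score reaches the threshold.

    known maps a person's name to the list of embeddings of their tagged faces.
    """
    best_name, best_score = None, threshold
    for name, vectors in known.items():
        for v in vectors:
            score = cosine_similarity(embedding, v)
            if score >= best_score:
                best_name, best_score = name, score
    return best_name
```

Returning `None` below the threshold leaves a face unassigned instead of forcing a wrong tag; a suitable threshold would have to be determined experimentally.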


requirements on the student(s)

Typically, the student must review all Bugzilla entries, which will be presented in a separate subsection created by the maintainers. If this page does not provide enough guidance, the student(s) must identify the top-level entries to work on, with help from the listed mentors. The student is expected to work autonomously on the technical side, so answers to challenges will not necessarily come from the maintainers. This does not mean that the maintainers cannot be reached: guidance will be given at any time, but it should be limited to occasional situations so that the maintainers can keep up with their own work.
Regardless of the above-mentioned channel of communication, the maintainers review and validate the code in the development branch before merging it into the master branch.

Besides coding, a technical proposal is required, listing:

  • the problem statement,
  • an outline of the code to be merged into the master branch,
  • the tests,
  • the overall project plan for this summer,
  • the documentation to write (mostly in code), etc.