Digikam/GSoC2019/FacesManagementWorkflowImprovements: Difference between revisions

From KDE Community Wiki
= Introduction =
Hello reader, <br>
We begin with a little story explaining how the face recognition related features of digiKam became a GSoC project. <br>
It all began in early 2018, when the thread [http://digikam.1695700.n4.nabble.com/digiKam-users-either-face-recognition-screen-is-buggy-or-I-still-don-t-understand-it-at-least-I-can-8-td4705248.html#a4705293 either face recognition screen is buggy or I still don't understand it - at least I can say that more convenient bulk change of face tags (no auto refresh/set faces via context menu) is neccessary] took off. Eventually, the discussion picked up again in early 2019, which convinced the maintainer of digiKam to refurbish these features earlier than originally planned.
The post that triggered this change was written on [http://digikam.1695700.n4.nabble.com/digiKam-users-either-face-recognition-screen-is-buggy-or-I-still-don-t-understand-it-at-least-I-can-8-tp4705248p4707745.html 01.Feb.2019] and describes quite well what has to be polished and what has to be redesigned. If you read the post, you will notice that its content goes beyond the pure face management workflow.




= The overall face detection, recognition and management workflow =


{{Construction}}
Before this article goes into detail, here is an overview of all parts involved, in the order in which they are used.
<ol>
<li> Face detection  <br>
This is a group of algorithms that analyse the content of images and identify distinctive regions such as eyes, nose, mouth, etc. Most of them are OpenCV based and work mostly fine in the background (except for some technical issues with the OpenGL card acceleration used by OpenCV, which introduces instability, but that is another challenge).
These algorithms generate a region where a face can be found, typically a rectangle. These areas are written as digiKam-internal information to digiKam's core database. This information is not yet added to the metadata of the images, as that happens during the face recognition workflow, which is explained further down.
<br> <br>
</li>
<li> Face recognition  <br>
This part introduces four different methods based on different algorithms, which are more or less functional. The goal is to automatically recognize a non-tagged face in images, using previous face tags registered in the database.
The algorithms are complex. The four methods are explained here only in brief; a more detailed description can be found in [[Digikam/GSoC2019/AIFaceRecognition]], the wiki page for the GSoC face recognition project.
<br> <br>


<ol>
<li>Deep Neural Network (DNN) [http://dlib.net/ml.html Dlib C++  Library] <br>
digiKam already has an experimental neural network implementation for face recognition, which is a proof of concept rather than a production-ready function.
This DNN is based on the Dlib implementation in the [http://blog.dlib.net/2014/02/dlib-186-released-make-your-own-object.html OpenFace project].
<br> <br>
</li>
<li> [https://docs.opencv.org/2.4/modules/contrib/doc/facerec/facerec_tutorial.html#local-binary-patterns-histograms OpenCV] -  [https://en.wikipedia.org/wiki/Local_binary_patterns Local Binary Patterns Histograms] ([http://www.scholarpedia.org/article/Local_Binary_Patterns LBPH])<br>
This is the most complete implementation of a face recognition algorithm. Moreover, it is the oldest implementation of such an algorithm in digiKam. It is not perfect and requires at least six faces already tagged manually by the user in order to identify the same faces in non-tagged images.
<br> <br>
</li> 
<li>[https://docs.opencv.org/2.4/modules/contrib/doc/facerec/facerec_tutorial.html#eigenfaces OpenCV] - [https://en.wikipedia.org/wiki/Eigenface Eigen Faces] <br>
An alternative algorithm that uses the OpenCV backend. It was introduced to provide a different source of face recognition results, against which the DNN approaches can be checked.
<br> <br>
</li>
<li> [https://docs.opencv.org/2.4/modules/contrib/doc/facerec/facerec_tutorial.html#fisherfaces OpenCV] - [http://www.scholarpedia.org/article/Fisherfaces Fisher Face] <br>
Another algorithm that uses the OpenCV backend. It was introduced for the same purposes as Eigen Faces.  <br>
According to rumours, this one is not finalized; it is said that not all methods are implemented.
<br> <br>
</li>
</ol>


<li> The faces workflow <br>
This is the actual subject of this article, for which the search for one or more students for GSoC 2019 is ongoing. No complex algorithms are involved here.  <br>
This is where we switch from the backend, the digiKam core, to the frontend, the GUI. There are countless posts about what could be improved or what is missing. The goal is to address all of them to make the entire workflow flawless, so that it is widely accepted and enjoyed by the users. This requires significant effort to assist and guide the student(s) in achieving the desired outcome. Apart from the coding itself, the maintainers would like the community to take over here, which takes workload off them and enables us users to steer the process from a user perspective. The maintainers would only ensure the quality of the code.  <br>
The overall face workflow will not change that much; the changes are mainly under the hood, as mentioned in the chapter above. The process is
<ol>
<li> Detect faces </li>
<li> Suggest faces </li>
<li> User confirms / corrects </li>
</ol>
but there are many ways to achieve this. This is where the hard work begins. The following section tries to give guidance for the entire retrofit process, aiming at collecting, outlining and streamlining all suggestions to ensure a consistent and intuitive face workflow. <br> <br>


As far as I know, the content of person-related metadata fields is not taken into account when you search or filter a collection by certain keywords. Thus, in order to make the names findable by digiKam, the name also has to be added to the keyword-related metadata fields.
</li>
</ol>
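The three-step process described above (detect, suggest, user confirms / corrects) can be pictured as a small state model. The following is only a minimal Python sketch to illustrate the idea; it is not digiKam code, and the names <code>Face</code>, <code>FaceState</code>, <code>suggest</code> and <code>confirm</code> are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FaceState(Enum):
    DETECTED = "detected"    # a region was found, no name yet ("Unknown")
    SUGGESTED = "suggested"  # recognition proposed a name
    CONFIRMED = "confirmed"  # the user accepted or corrected the name


@dataclass
class Face:
    # Rectangle of the face inside the image (hypothetical pixel layout).
    x: int
    y: int
    width: int
    height: int
    state: FaceState = FaceState.DETECTED
    name: Optional[str] = None


def suggest(face: Face, name: str) -> None:
    """Step 2: attach a recognition suggestion to a face that is not confirmed."""
    if face.state is FaceState.CONFIRMED:
        raise ValueError("a confirmed face must not be overwritten by a suggestion")
    face.name = name
    face.state = FaceState.SUGGESTED


def confirm(face: Face, corrected_name: Optional[str] = None) -> None:
    """Step 3: the user confirms the suggestion or corrects it with another name."""
    if corrected_name is not None:
        face.name = corrected_name
    if face.name is None:
        raise ValueError("cannot confirm a face without a name")
    face.state = FaceState.CONFIRMED
```

One design choice in this sketch is that a suggestion can never overwrite a confirmed face, so a later recognition run cannot undo a decision the user has already made.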


= Participation =
This is a breakdown of the description of how to [https://community.kde.org/GSoC participate in the Summer of Code program with KDE]. <br>
<ol>
<li> AS A MAINTAINER <br>
As a maintainer, you are responsible for knowing the digiKam source code and for checking the pull requests of the student(s) before they are merged into the master branch.
In addition, you are the contact person for the questions of the student(s) regarding the code, and you ensure that the documentation written by the student(s) is satisfactory.


<br> <br>
<li> A MENTORING USER <br>
If you wish to help, you are more than welcome to contact any of the current users, get an account on KDE, join the discussion here and on the mailing list, and begin contributing to this article.  <br>
Volunteered users are: <br>
[mailto:[email protected] Stefan Müller (user coordinator)]
<br> <br>
<li> AS A STUDENT  <br>
Typically, the student must review all related Bugzilla entries given in the corresponding Bugzilla section of the project. If this project or Bugzilla does not provide enough guidance, the student(s) must identify the top-level entries to work on, with help from the listed mentors.
The student is expected to work autonomously on the technical side, i.e. to find the answers to challenges without relying on the support of the maintainers. This does not mean that the maintainers cannot be reached by the student. Guidance will be given whenever the student is blocked, but it should be limited to occasional situations so that the maintainers can keep up with their own work. <br>
Regardless of the above-mentioned channel of communication, the maintainers review and validate the code in the student's development branch before merging it into the master branch.

Besides coding, the student is required to submit a technical proposal that lists:
* the problems to be solved,
* an outline of the code to be changed and merged into the master branch,
* the tests,
* the overall project plan for this summer,
* the documentation to write (mostly in code), etc.
</ol>


=Project tasks=
All relevant bug reports can be found in
<ul>
<li> [https://bugs.kde.org/buglist.cgi?bug_status=UNCONFIRMED&bug_status=CONFIRMED&bug_status=ASSIGNED&bug_status=REOPENED&component=Faces-Workflow&list_id=1595307&product=digikam  digikam Bug List - Component: Faces-Workflow Status: REPORTED, CONFIRMED, ASSIGNED, REOPENED] <br><br>
The recognition entries should also be presented to the student(s):
<li> [https://bugs.kde.org/buglist.cgi?bug_status=__open__&component=Faces-Recognition&list_id=1583144&product=digikam digikam Bug List - Component: Faces-Recognition Status: REPORTED, CONFIRMED, ASSIGNED, REOPENED]
</ul>


The following is an attempt to group them into major tasks, to give the students detailed guidance on how to close the bug reports.

{{construction}}
'''latest email conversation not reflected yet'''
<ol>
<li> SEPARATION BETWEEN TAGS AND FACES (by [mailto:[email protected] Stefan Müller]) <br>
Many players in the media business, such as Adobe, use the expression ''tag'' for anything related to metadata, while others distinguish between the different types of metadata.  <br>
All metadata records are stored in fields (see e.g. photometadata.org), which are also often called tags (of the metadata), so a tag is anything that is used in digiKam to filter or search for images, e.g. keywords, colour label, star rating etc. Thus there is too much room for interpretation, which leads to all these questions and to the confusion caused by the use of the word tag.
In order to lower the entry hurdle into the world of tagging, I would suggest being consistent with the official wording, so that new users won't be confused by it. That means that the text for the tag will be named keyword, so the source selection pane on the left will say Keywords and the filter pane on the right will say Keywords Filter. The description should rather say something close to: digiKam deals with metadata, grouped in (tags of) keywords, label, date and location.
<br><br>
<li> Ensure that all relevant metadata fields are filled (by [mailto:[email protected] Stefan Müller]) <br>
In the end, as soon as a name is confirmed, digiKam writes the data to the MP and MWG namespaces of the XMP records; it sets a name and an area.<br>
More details about those namespaces can be found here:
* [https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/Microsoft.html#MP Microsoft MP Tags (ExifTool)]
* [http://www.exiv2.org/tags-xmp-MP.html Microsoft MP Tags (Exiv2.org)]
* [https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/MWG.html#Regions MWG Regions Tags (ExifTool)]
* [http://www.exiv2.org/tags-xmp-mwg-rs.html MWG Regions Tags (Exiv2.org)]


As Apple and Adobe write their information to the MWG namespace, I would say that MWG is the leading namespace, but inconsistencies between the namespaces may lead to unexpected behaviour of the applications that read them.
In my understanding, this information should also be written to the [https://iptc.org/std/photometadata/specification/IPTC-PhotoMetadata#person-structure IPTC Person structure] as mentioned in the [https://www.iptc.org/std/photometadata/documentation/userguide/index.htm#!Documents/personsdepictedintheimage.htm IPTC Photo Metadata User Guide (Persons Depicted in the Image)], but it is not.
It needs to be clarified and documented why that does not happen, and it may need to be corrected.
Link face region with face name properly
In order to make images findable by a person's name, the name should also be written to the keywords field of multiple namespaces; the [https://www.iptc.org/std/photometadata/documentation/userguide/index.htm#!Documents/personsdepictedintheimage.htm IPTC Photo Metadata User Guide (Persons Depicted in the Image)] recommends caption and keywords. I cannot name all relevant fields/namespaces. My research tells me that they should include at least the following:
* IPTC ⇒ Keywords, see: [https://iptc.org/std/photometadata/specification/IPTC-PhotoMetadata#keywords iptc.org]
* XMP acdsee ([https://www.acdsystems.com/ ACD Systems]) Tags ⇒ categories, see: [https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/XMP.html#acdsee ExifTool] / [http://www.exiv2.org/tags-xmp-acdsee.html Exiv2.org]
* XMP digiKam Tags ⇒ Tag List, see: [https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/XMP.html#digiKam ExifTool] / [http://www.exiv2.org/tags-xmp-digiKam.html Exiv2.org]
* XMP dc ([http://dublincore.org/ Dublin Core]) Tags ⇒ Subject, see: [https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/XMP.html#dc ExifTool] / [http://www.exiv2.org/tags-xmp-dc.html Exiv2.org]
* XMP lr (Lightroom) Tags ⇒ hierarchicalSubject, see: [https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/XMP.html#Lightroom ExifTool] / [http://www.exiv2.org/tags-xmp-lr.html Exiv2.org]
* XMP [https://www.captureone.com/en/ MediaPro] Tags ⇒ Catalog Sets, see: [https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/XMP.html#MediaPro ExifTool] / [http://www.exiv2.org/tags-xmp-mediapro.html Exiv2.org]
* [https://docs.microsoft.com/en-us/windows/desktop/wic/-wic-codec-metadatahandlers Microsoft XMP] Tags ⇒ LastKeywordXMP, see: [https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/Microsoft.html#XMP ExifTool] / [http://www.exiv2.org/tags-xmp-MP.html Exiv2.org]
* [metadataworkinggroup.com MWG] Keywords Tags: [https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/MWG.html#Keywords ExifTool] / [http://www.exiv2.org/tags-xmp-mwg-kw.html Exiv2.org]


The following namespace has a keyword field, but it is ignored by digiKam. Why?
* XMP [http://www.prismstandard.org/ prism] Tags ⇒ Keyword: [https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/XMP.html#prism ExifTool]
The following, however, should be excluded:
*XMP [https://www.adobe.com/devnet/xmp.html xmp] Tags, as they are marked ''non-standard'': [https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/XMP.html#xmp ExifTool]
*XMP xmpMM Tags, as they are marked ''undocumented'': [https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/XMP.html#xmpMM ExifTool]
*XMP pdf Tags, as they are only for Adobe PDF: [https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/XMP.html#pdf ExifTool]


I reckon there isn't any leading field, as a mismatch could lead to inconsistent search results, depending heavily on the application being used.
It needs to be clarified and documented why that does not happen, and it may need to be corrected.
</ol>
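The idea of writing a confirmed person's name to every relevant keyword field, and of spotting mismatches between the fields, can be sketched as follows. This is only an illustration in Python, not digiKam code: the metadata is modelled as a plain dictionary, the function names are hypothetical, and the Exiv2-style key spellings are my assumption and should be checked against the tag tables linked above. The sketch also writes the flat name everywhere and ignores that some of these fields expect hierarchical values.

```python
from typing import Dict, List

# Exiv2-style keys for the keyword fields discussed above. The exact key
# spellings are an assumption of this sketch; verify them against the
# linked tag tables before relying on them.
KEYWORD_FIELDS: List[str] = [
    "Iptc.Application2.Keywords",
    "Xmp.acdsee.categories",
    "Xmp.digiKam.TagsList",
    "Xmp.dc.subject",
    "Xmp.lr.hierarchicalSubject",
    "Xmp.mediapro.CatalogSets",
    "Xmp.MicrosoftPhoto.LastKeywordXMP",
]


def add_person_keyword(metadata: Dict[str, List[str]], name: str) -> Dict[str, List[str]]:
    """Return a copy of `metadata` with `name` present in every keyword field."""
    updated = {key: list(values) for key, values in metadata.items()}
    for field in KEYWORD_FIELDS:
        values = updated.setdefault(field, [])
        if name not in values:
            values.append(name)
    return updated


def inconsistent_fields(metadata: Dict[str, List[str]], name: str) -> List[str]:
    """List the keyword fields in which `name` is missing."""
    return [field for field in KEYWORD_FIELDS if name not in metadata.get(field, [])]
```

A check like <code>inconsistent_fields</code> could help to document which fields are currently left empty, before deciding whether that needs to be corrected.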
 
 
What is the source of the list in the People pane on the left? In my opinion, there are three options.
1. First, these are the keywords listed below the hierarchy level persons in the keyword list. If the user selects a name, digiKam filters the images based on the keywords and shows the face area as described in the person-related metadata field.
2. Second, digiKam reads the information given in the person-related metadata fields of each image whenever it is needed. Afterwards it uses this data to populate the People pane. That would put quite a workload on the CPU and isn't very likely.
3. Third, digiKam stores the information given in the person-related metadata fields of each image in the database recognition.db. Based on the information stored there, digiKam knows which images are to be shown. In this case, are the face thumbnails stored in this database as well, or are they derived from each image based on the region information?
 
 
In addition, I would like to see some changes regarding the thumbnails of unknown faces.
Those wishes are most likely discussed in other bug reports. For convenience, I have listed those created by woenx and my own again.
As you can see, most wishes are still unresolved. Mine will mostly be duplicates of existing ones, but I list them anyway in order to highlight their necessity.
I would like to be able to
 
#    stop the auto refresh of the thumbnails to avoid confirming a wrong face accidentally. It is a pain in the arse to undo such accidents
#    sort them at least by guessed face.
#    preferably, sort in any view by any property that can be used to filter items
#    drag and drop selected faces onto a person name
#    assign a person name via the right-click menu, as is possible for tags
#    group similar faces in "Unknown" faces
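The wish to sort unconfirmed faces by guessed face could look roughly like the following. This is a minimal Python sketch under my own assumptions, not digiKam code: a face is reduced to a pair of image path and guessed name, and the function name <code>group_by_guess</code> is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

# (image path, guessed name or None when recognition has no guess)
UnconfirmedFace = Tuple[str, Optional[str]]


def group_by_guess(faces: Iterable[UnconfirmedFace]) -> Dict[str, List[str]]:
    """Group unconfirmed faces by guessed name; faces without a guess go to "Unknown"."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for image, guess in faces:
        groups[guess if guess is not None else "Unknown"].append(image)
    # Named groups come first in alphabetical order, "Unknown" comes last.
    ordered = sorted(groups, key=lambda name: (name == "Unknown", name.lower()))
    return {name: groups[name] for name in ordered}
```

Grouping by the guess means that a whole group could be confirmed or rejected in one step, instead of face by face.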
 
 
<ol>
<li> Bugs
  <ol>
  <li> [https://bugs.kde.org/show_bug.cgi?id=392013  <del>Bug 392013</del>]: Metadata explorer does not show XMP face rectangles
  <li> [https://bugs.kde.org/show_bug.cgi?id=392017  <del>Bug 392017</del>]: Merging, renaming and removing face tags
  <li> [https://bugs.kde.org/show_bug.cgi?id=392009  <del>Bug 392009</del>]: Weird automatic subtag within "Unknown people" called "da"
  <li> [https://bugs.kde.org/show_bug.cgi?id=392008 Bug 392008]: Inconsistent behaviour of "People" Tag
  </ol>
 
<li> Wishes
  <ol>
  <li> [https://bugs.kde.org/show_bug.cgi?id=275671 <del>Wish 275671</del>]: Scan single image for faces
  <li> [https://bugs.kde.org/show_bug.cgi?id=392015 Wish 392015]: Show "Unknown" faces in a more visible and preeminent place in the "People" list
  <li> [https://bugs.kde.org/show_bug.cgi?id=392007 Wish 392007]: Face tags and regular tags are mixed together and cannot be told apart
  <li> [https://bugs.kde.org/show_bug.cgi?id=392016 Wish 392016]: Confirmed and unconfirmed faces look the same in a person's face list
  <li> [https://bugs.kde.org/show_bug.cgi?id=392020 Wish 392020]: No possible way of knowing which pictures within a regular tag have been face-tagged
  <li> [https://bugs.kde.org/show_bug.cgi?id=392022 Wish 392022]: Position of a face tag appears on top or bottom of the list, instead of being sorted alphabetically
  <li> [https://bugs.kde.org/show_bug.cgi?id=392023 Wish 392023]: Feature request: add "Ignored" group of faces:
  <li> [https://bugs.kde.org/show_bug.cgi?id=392024 Wish 392024]: Feature request: group similar faces in "Unknown" faces  <br>
                      ⇒ [https://bugs.kde.org/show_bug.cgi?id=384396 Wish 384396]: Wish: display faces sorted by similarity (pre-grouped) instead of album/time/..
  <li> [https://bugs.kde.org/show_bug.cgi?id=386291 Wish 386291]: only refresh found face list/pane upon user request
  <li> [https://bugs.kde.org/show_bug.cgi?id=254099 Wish 254099]: SCAN : refresh collection with a script in commandline
    </ol>
</ol>

Latest revision as of 14:44, 10 March 2019

Introduction

Hello reader,
We begin with a little story, explaining how all the digiKam face recognition related features became a GSoC project.
All began in early 2018 as the thread either face recognition screen is buggy or I still don't understand it - at least I can say that more convenient bulk change of face tags (no auto refresh/set faces via context menu) is neccessary took off. Eventually, it found its course in early 2019 what convinced the maintainer of digiKam to refurbish these features earlier than originally considered. The post what made this change was written on the 01.Feb.2019 and describes quite well what has to be polished and redesigned, respectively. If you read the post, you will notice that it content goes beyond the pure face management workflow.


the overall face detection, recognition and management workflow

Before this article goes into the details, an overall description of all involved parts is given in corresponding order.

  1. the faces detection
    It is a group of algorithms to analyse the content of images, identify the distinctive regions such as eyes, nose, mouth, etc. Most of them are OpenCV based, and work mostly fine in the background (excepted some technical issues with OpenGL cards acceleration used by OpenCV which introduce instability, but it's another challenge). These algorithms generate region where a face can be found, typically a rectangle. These areas are written as digikam internal information in digiKams core database. That information will not be added to the metadata of the images yet as this happens during the face recognition workflow, what is explained further down.

  2. the faces detection
    This introduces the four different methods based on different algorithms, more and less functional. The goal is to be able to recognize automatically a non-tagged face from images, using previous face tags registered in the database. The algorithms are complex but explained in more details in the wiki page for the GSoC faces recognition project. The 4 different methods are explained here in brief only, a more detailed description can be found in Digikam/GSoC2019/AIFaceRecognition

    1. Deep Neural Network (DNN) Dlib C++ Library
      DigiKam has already an experimental implementation of Neural Network to perform faces recognition what is rather proof of concept than a production-ready function. This DNN is based on the Dlib implementation in OpenFace project.

    2. OpenCV - Local Binary Patterns Histograms (LBPH)
      This is the most complete implementation of a face detection algorithm. Moreover, it is the oldest implementation of such an algorithm in digiKam. It's not perfect and requires at least six faces already tagged manually by the user to identify the same faces in non-tagged images.

    3. OpenCV - Eigen Faces
      An alternative algorithm what uses the OpenCV backend. It was introduced to have a different source of results for face detection, enabling to proof the DNN approaches.

    4. OpenCV - Fisher Face
      Another algorithm what uses the OpenCV backend. It was introduced for the same purposes as Eigen Faces.
      According to rumours, this one is not finalized, it is said that not all methods are implemented.

  3. The faces workflow
    This is the actual subject of this article where the search for a student(s) for the GSoC 2019 is ongoing. There are not any complex algorithms involved here.
    That is where we switch from the backend, the digiKam core, to the frontend, the GUI. There are numberless posts what could be improved or what is missing. The goal is to answer all those to make the entire workflow flawless, allowing to be widely accepted and enjoyed by the users. This requires some significant effort to assist and guide the student(s) to achieve the desired outcome. As that is related to coding the maintainers would like to so the community to take over here, to take off workload from them and enabling us users to steer the process from a user perspective. The maintainer would only ensure the quality of the code.
    The overall face workflow will not change that much, the changes are mainly under the hood, as mentioned in the chapter above. The process is
    1. Detect
    2. suggest faces
    3. user confirms / correct

    but there are many ways to achieve this. That is the place where the hard work begins. The following section tries to give guidance to the entire retrofit process aiming at collecting, outlining and streamlining all suggestions to ensure consistency and intuitive face workflow.

    I mentioned that, since as far as I know that content of person-related metadata fields are not taken into account when you search or filter a collection by certain keywords. Thus, in order to make the names findable by digiKam, the name has to be added to the keywords related metadata fields to make the magic happen.

Participation

This is a break-down fo the description of how to participate in the Summer of Code program with KDE.

  1. AS A MAINTAINER
    As a maintainer, you are responsible to know the digiKam to source code and check pull requests of the student(s) before they are merged into the master. In addition, you are the contact person for questions of the student(s) in regard to the code and ensure that the student(s)’s documentation is satisfactory.

  2. A MENTORING USER
    If you wish you more than welcome to contact any of the current users, get you an account on kde and joining the discussion here, in the mailing list and begging contributing to this article.
    Volunteered users are:
    Stefan Müller (user coordinator)

  3. AS A STUDENT
    Typically, the student must review all related Bugzilla entries given in the corresponding Bugzilla section of the project. If this project or the Bugzilla does not provide enough guidance, the student(s) must identify the top level entries to engage but with help by the listed mentors. The student is expected to work autonomous technically-wise, so the answers to challenges will not be found independently of the support of the maintainer. This does not mean that the maintainers cannot be reached by the student. Guidance will be given at any time in any case but shall that be limited to occasional situations to allow the maintainers to follow up on their work.
    Regardless of the above-mentioned channel of communication, the maintainers review and validate the code in their development branch bevor merging it to the master branch. Besides coding, it is required to submit a technical proposal, wherein is to list :
    • the problematic,
    • the code outlining, being merged into the master branch
    • the tests
    • the overall project plan for this summer,
    • documentation to write (mostly in code), etc.

Project tasks

All relevant bug reports can be found in

In the following is it tried to group them in major tasks, to give the students detailed guidance on how to close the bug reports

 
Under Construction
This is a new page, currently under construction!

latest email converstation not refelected yet

  1. SEPARATION BETWEEN TAGS AND FACES (by Stefan Müller)
    Many players in the media business, such as Adobe, use the expression tag for anything related to metadata, while others distinguish between the different types of metadata.
    All metadata records are stored in fields (see e.g. photometadata.org), which are also often called tags (of the metadata), so a tag is anything that is used in digiKam to filter or search for images, e.g. keywords, colour label, star rating, etc. Thus there is too much room for interpretation, which leads to all these questions caused by confusion over the word tag. In order to lower the entry hurdle into the world of tagging, I would suggest being consistent with the official wording, so that new users won't be confused by it. That means that the text for the tag will be named keyword, so the source selection pane on the left will say Keywords and the filter pane on the right will say Keywords Filter. The description shall rather say something close to: digiKam deals with metadata, grouped in (tags of): keywords, label, date and location.

  2. Ensure that all relevant metadata fields are filled (by Stefan Müller)
    In the end, as soon as a name is confirmed, digiKam writes the data to the MP and MWG namespaces of the XMP records, setting a name and an area.
    More details about those namespaces can be found here:
    As Apple and Adobe write their information to the MWG namespace, I would say that MWG is the leading namespace, but inconsistencies between the namespaces may lead to unexpected behaviour of the applications that read them. In my understanding, this information should also be written to the IPTC Person structure, as mentioned in the IPTC Photo Metadata User Guide (Persons Depicted in the Image), but it is not. It needs to be clarified and documented why that does not happen, and it may need to be corrected.
    Link face region with face name properly
    In order to make images findable by a person's name, the name shall also be written to the keywords field of multiple namespaces; the IPTC Photo Metadata User Guide (Persons Depicted in the Image) recommends caption and keywords. I cannot name all relevant fields/namespaces. My research tells me that these should be at least the following:
    The following have a field that is ignored by digiKam; why?
    The following, however, shall be excluded:
    • XMP xmp Tags, as it says non-standard: ExifTool
    • XMP xmpMM Tags as it says undocumented: ExifTool
    • XMP pdf Tags: ExifTool -> only for Adobe PDF
    I reckon there isn't any leading field, as a mismatch could lead to inconsistent search results, depending heavily on the application being used. It needs to be clarified and documented why that does not happen, and it may need to be corrected.
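
To illustrate why the MP and MWG records can drift apart, here is a minimal Python sketch of how one pixel rectangle maps to both area notations. It assumes the commonly documented conventions: an MWG region area stores the centre point plus width and height, normalized to the image size, while an MP rectangle is a string "x, y, w, h" whose x and y are the upper-left corner, also normalized. The names PixelRect, to_mwg_area, to_mp_rectangle and regions_agree are hypothetical and are not taken from the digiKam source code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PixelRect:
    """Face rectangle in pixels; (left, top) is the upper-left corner."""
    left: int
    top: int
    width: int
    height: int


def to_mwg_area(rect, image_width, image_height):
    """Normalized area whose (x, y) is the centre of the rectangle (assumed MWG convention)."""
    return {
        "x": (rect.left + rect.width / 2) / image_width,
        "y": (rect.top + rect.height / 2) / image_height,
        "w": rect.width / image_width,
        "h": rect.height / image_height,
        "unit": "normalized",
    }


def to_mp_rectangle(rect, image_width, image_height):
    """String 'x, y, w, h' whose (x, y) is the upper-left corner (assumed MP convention)."""
    values = (
        rect.left / image_width,
        rect.top / image_height,
        rect.width / image_width,
        rect.height / image_height,
    )
    return ", ".join(f"{value:.6f}" for value in values)


def regions_agree(mwg_area, mp_rectangle, tolerance=1e-4):
    """Check that both notations describe the same area within a tolerance."""
    left, top, width, height = (float(part) for part in mp_rectangle.split(","))
    # Convert the corner-based MP rectangle to the centre-based MWG form.
    pairs = (
        (mwg_area["x"], left + width / 2),
        (mwg_area["y"], top + height / 2),
        (mwg_area["w"], width),
        (mwg_area["h"], height),
    )
    return all(abs(a - b) <= tolerance for a, b in pairs)
```

For a 200x100 pixel face at (100, 50) in a 1000x500 image, to_mp_rectangle returns "0.100000, 0.100000, 0.200000, 0.200000", while the MWG area has x = y = 0.2. An application that reads only one namespace, or mixes up the two conventions, would place the face differently, which is the kind of inconsistency described above.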


If I'm correct, what is the source of the list in the people pane on the left? In my opinion there are three options:

  1. These are the keywords listed below the hierarchy level persons in the keyword list. If the user selects a name, digiKam filters images based on the keywords and shows the face area as described in the person-related metadata field.
  2. digiKam reads the information given in the person-related fields of the metadata of each image and afterwards uses this data to populate the people pane. That would put quite a workload on the CPU and isn't very likely.
  3. digiKam stores the information given in the person-related fields of the metadata of each image in the database recognition.db. Based on the information stored there, digiKam knows which images are to be shown. In this case, are the face thumbnails stored in this database as well, or are they derived from each image, based on the region information?


In addition, I would like to see some changes in regard to the unknown faces thumbnails. Those wishes are most likely discussed in other bug reports. For convenience, I have listed those created by woenx and mine again. As you can see, most wishes are still unresolved, and mine will mostly be duplicates of existing ones; I'll list them anyway in order to highlight their necessity. I would like to be able to

  1. stop auto refresh of the thumbnails to avoid confirming a wrong face accidentally. It is a pain in the arse to undo such accidents
  2. sort them at least by guessed face
  3. sort in any view by any property that can be used to filter items (this would be preferred)
  4. drag and drop selected faces onto a Person Name
  5. assign a Person Name via the right-click menu, as is possible for tags
  6. group similar faces in "Unknown" faces
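
As a rough illustration of wishes 2 and 6, the following Python sketch groups face thumbnails by the name the recognition guessed for them and lists faces without any guess last. The record layout (a dict with the keys id and suggested_name) and the function name group_by_suggestion are assumptions made for this example only; they do not reflect digiKam's actual data model.

```python
from collections import defaultdict

UNKNOWN = "Unknown"


def group_by_suggestion(faces):
    """Group face records by guessed name; faces without a guess end up under 'Unknown'."""
    groups = defaultdict(list)
    for face in faces:
        # A missing or empty suggestion is treated as an unknown face.
        groups[face.get("suggested_name") or UNKNOWN].append(face)
    # Sort groups alphabetically (case-insensitive) and keep 'Unknown' at the end.
    return sorted(
        groups.items(),
        key=lambda item: (item[0] == UNKNOWN, item[0].casefold()),
    )
```

With such a grouping, the user could confirm or reject a whole group per person instead of hunting for matching thumbnails in an unsorted list.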


  1. Bugs
    1. Bug 392013: Metadata explorer does not show XMP face rectangles
    2. Bug 392017: Merging, renaming and removing face tags
    3. Bug 392009: Weird automatic subtag within "Unknown people" called "da"
    4. Bug 392008: Inconsistent behaviour of "People" Tag
  2. Wishes
    1. Wish 275671: Scan single image for faces
    2. Wish 392015: Show "Unknown" faces in a more visible and preeminent place in the "People" list
    3. Wish 392007: Face tags and regular tags are mixed together and cannot be told apart
    4. Wish 392016: Confirmed and unconfirmed faces look the same in a person's face list
    5. Wish 392020: No possible way of knowing which pictures within a regular tag have been face-tagged
    6. Wish 392022: Position of a face tag appears on top or bottom of the list, instead of being sorted alphabetically
    7. Wish 392023: Feature request: add "Ignored" group of faces:
    8. Wish 392024: Feature request: group similar faces in "Unknown" faces
      Wish 384396: Wish: display faces sorted by similarity (pre-grouped) instead of album/time/..
    9. Wish 386291: only refresh found face list/pane upon user request
    10. Wish 254099: SCAN : refresh collection with a script in commandline