Projects/Nepomuk/Akademy 2012 BOF

Nepomuk BOF

This is the summary of the BOF that took place on 4th July 2012 during Akademy 2012. The title of the BOF was "Constructive Criticism: Prioritizing Nepomuk development". Some of the notable attendees were -

Vishesh Handa (Nepomuk - host)
Martin Klapetek (Telepathy)
David Edmundson (Telepathy)
Martin Graesslin (KWin)
Alex Fiestas (Solid)
Marco Martin (Plasma)
Christian Mollekopf (PIM)

(Please add yourself to the list if I have missed your name.)

A more user-oriented summary, with pictures of what was discussed, is available here - http://vhanda.in/blog/2012/06/the-nepomuk-bof/

Topics Discussed

PIM Feeders

The feeders regularly cause the entire system to consume a massive amount of CPU time.

Suggested improvements -

Throttle the indexing code
Schedule it during periods of low CPU usage
storeResources is quite slow - optimize it

To better detect low CPU usage and decide exactly when the feeders should run, it was suggested that Solid provide some simple signals to inform clients that want to run CPU-intensive tasks. This should probably be combined with `KIdleTime`.
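The throttling suggestion can be illustrated with a minimal sketch. It is written in Python for brevity and is not the feeder code; `throttled_index`, `index_one` and `duty_cycle` are hypothetical names introduced here. The idea is that after each item the indexer rests long enough that indexing occupies only a chosen fraction of wall-clock time.

```python
import time
from typing import Callable, Iterable, List


def throttled_index(
    items: Iterable[str],
    index_one: Callable[[str], None],
    duty_cycle: float = 0.25,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[float]:
    """Index items one at a time, resting so that work uses ~duty_cycle of the time.

    All names are hypothetical; this is a sketch, not a Nepomuk API.
    Returns the pause taken after each item so the behaviour can be inspected.
    """
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in (0, 1]")
    pauses: List[float] = []
    for item in items:
        start = clock()
        index_one(item)
        busy = clock() - start
        # If the work took `busy` seconds, resting busy * (1/d - 1) seconds
        # keeps the busy fraction at roughly d.
        pause = busy * (1.0 / duty_cycle - 1.0)
        if pause > 0:
            sleep(pause)
        pauses.append(pause)
    return pauses
```

A real implementation would presumably also consult the idle and load signals discussed above before starting each batch; that part is left out because no such API was settled at the BOF.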

Another thing that was discussed was the feeling of control. The indexing is not controllable and happens in the background. There should be a way for users to pause and resume the indexing, and maybe even to ask for re-indexing.
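As a rough illustration of what such control could look like, here is a small Python sketch of an indexing queue that can be paused, resumed and asked to re-index. The class `IndexingQueue` and its methods are hypothetical and do not correspond to any existing Nepomuk interface.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List


class IndexingQueue:
    """Hypothetical user-controllable indexing queue (not a Nepomuk API)."""

    def __init__(self, index_one: Callable[[str], None]) -> None:
        self._index_one = index_one
        self._pending: Deque[str] = deque()
        self._done: List[str] = []
        self.paused = False

    def add(self, items: Iterable[str]) -> None:
        self._pending.extend(items)

    def pause(self) -> None:
        self.paused = True

    def resume(self) -> None:
        self.paused = False

    def reindex(self) -> None:
        """Queue everything that was already indexed to be processed again."""
        self._pending.extend(self._done)
        self._done.clear()

    def step(self) -> bool:
        """Index at most one pending item; return True if work was done."""
        if self.paused or not self._pending:
            return False
        item = self._pending.popleft()
        self._index_one(item)
        self._done.append(item)
        return True
```

Driving the queue one `step()` at a time means a pause request takes effect between items instead of having to interrupt one.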

Dolphin

Dolphin is arguably the most user-visible application that uses Nepomuk. It has decent Nepomuk integration, but it needs a lot of polish -

Only list files
On-demand loading - For more than 1000 results (an arbitrary threshold), do not load all the results. Load just a small subset, and load more when and if the user scrolls further. This seems like a nice idea, but it is going to be really hard to implement, mainly because of the nature of the kioslaves. Any suggestions on how to go about this are welcome.
Improve searching user interface
- The user shouldn't have to choose between filename and contents. The search should automatically prioritize the results. Alternatively, it might be nice to let the user search filename and content together. Needs more discussion.
- Provide a clear indication when it is using Nepomuk, and when it is performing a manual search.
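The on-demand loading idea can be sketched independently of Dolphin. The following Python generator is only an illustration: `lazy_results` and `fetch_page` are hypothetical names, and `fetch_page(offset, limit)` stands in for whatever query backend would return one slice of results. It does not address the kioslave limitation mentioned above.

```python
from typing import Callable, Iterator, List


def lazy_results(
    fetch_page: Callable[[int, int], List[str]],
    page_size: int = 100,
) -> Iterator[str]:
    """Yield results one by one, fetching the next page only when needed.

    `fetch_page(offset, limit)` is a hypothetical backend call that returns
    at most `limit` results starting at `offset`.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        # A short page means the backend has no more results.
        if len(page) < page_size:
            return
        offset += len(page)
```

A view would pull items from the generator only as the user scrolls, so a query with many matches costs one small page up front.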

User Visibility

Kickoff integration
QML Widget?
Crystal?

There were also long discussions on KRunner integration and how (maybe?) KRunner should be improved. However, this was not the appropriate place for that discussion, so we moved on to another topic. Maybe this should be taken up during some Plasma meeting?

vHanda: It wouldn't be more than a day's work for me to clean up Crystal and get it working efficiently. Not sure about the port to QML.

Technical Documentation

Most of the technical documentation is extremely outdated. It's no surprise that developers are scared of Nepomuk.

Additionally, not everyone understands the ontologies, and we cannot expect application developers to understand how to write their own. Example - The Solid developers would love to be able to associate Bluetooth accounts with people and the files received. The same applies to Telepathy File Transfer.

Old Data cleaner + Migrator

There is a lot of invalid and old data lying around in Nepomuk. There should be a dedicated application to remove or migrate that data. It should be a separate application because that way the user can decide when to run it, and they'll understand if it consumes a lot of system resources.

Something along the lines of - https://bugs.kde.org/show_bug.cgi?id=293323

Metadata writeback

Should be shipped. Gives the user a better sense of security. This could even be extended by using file system attributes.

There was a GSoC 2011 project called Metadata Writeback, but most of it is still lying in a branch in kde-runtime. It needs to be moved to nepomuk-core and ported to Nepomuk2.

From what I remember, it only supported ID3 tags and Akonadi writeback.

Nepomuk should be hidden

Nepomuk isn't a client-side application. Users do not need to know about its existence. We have come quite far in accomplishing this, but there are many more places where this could be improved.

Example: One prominent application is called "Nepomuk Controller".

User visible query language

The query language is for power users and is not publicly documented. Additionally, the code is fairly messy and difficult to modify. There was a GSoC idea for fixing this, but we never got decent students.

Globalized WebExtractor

Applications want the ability to fetch extra metadata about their resources. Instead of each application doing this themselves, we should provide some convenient API for applications to use. Probably a plugin based system along with it.
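To make the API question more concrete, here is one possible shape for such a plugin system, sketched in Python. Everything in it is an assumption for discussion: `MetadataFetcherRegistry`, `register` and `fetch` are hypothetical names, and the sample values in the usage note are dummy data.

```python
from typing import Callable, Dict, List

Metadata = Dict[str, str]
Fetcher = Callable[[str], Metadata]


class MetadataFetcherRegistry:
    """Hypothetical registry of metadata-fetching plugins, keyed by resource type."""

    def __init__(self) -> None:
        self._plugins: Dict[str, List[Fetcher]] = {}

    def register(self, resource_type: str, fetcher: Fetcher) -> None:
        """Add a plugin that can fetch extra metadata for one resource type."""
        self._plugins.setdefault(resource_type, []).append(fetcher)

    def fetch(self, resource_type: str, identifier: str) -> Metadata:
        """Ask every plugin for the type and merge the answers.

        Plugins registered earlier win when two of them return the same key.
        """
        merged: Metadata = {}
        for fetcher in self._plugins.get(resource_type, []):
            for key, value in fetcher(identifier).items():
                merged.setdefault(key, value)
        return merged
```

An application would register fetchers once, for example `registry.register("movie", some_fetcher)`, and then call `registry.fetch("movie", "Example")` without caring which plugin supplied which field.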

Prime use case: Plasma Media Center

There was a GSoC 2010 project, Web-Extractor, but it resulted in a huge mess of over-engineered code. It would be a lot simpler to start from scratch. The API is the only part that needs some discussion; the rest should be pretty straightforward.

There seems to be some new Python-based plugin work going on in a scratch repository - http://quickgit.kde.org/index.php?p=nepomuk-metadata-extractor.git&a=summary

Digikam Integration

Needs to be written from scratch. The current situation is that it runs a special Nepomuk Digikam service, and continuously monitors the Nepomuk repository for changes.

The main roadblock in implementing this is the absence of proper documentation outlining the exact ontology that would be used for pictures, albums and face tagging.

Final Notes

If I've missed some important detail, please add it. We can review this in about a month, and see what progress has been made.