Better File Index Scheduling - The current index scheduler operates only on the basis of battery and idle time. It could be improved by taking the number of files that need to be indexed into consideration.
More Indexers - We're lacking indexers for various formats. Some that come to mind are DjVu, CHM, comic books, and GIFs.
Unit tests for indexers - Some of our indexers just use libraries, while others, such as those for office formats, have their own parsers. It would be awesome to have some unit tests for these formats.
BalooCtl tool - A simple tool called balooctl to start/stop the baloo_file process. It could also have a 'reindex' function which would nuke the entire file index.
Milou Search Application - Milou is an awesome plasmoid to search for things, but it could really use a dedicated search application as well, especially for cases where the user cannot find the required info in the plasmoid.
Spelling Correction - The Xapian backend offers good spelling correction; we should try to expose this through the search API.
Support for XMP Metadata in KFileMetaData
Custom Xapian word tokenizer and query parser - We're currently using Xapian's internal ones. It would be nice to have our own, which we can modify to split by additional characters such as _.
KFileMetaData Dublin Core conversion - Currently there is a lot of code duplicated among different analyzers for converting Dublin Core metadata. This could be integrated into one class. We could also try to support more of the common Dublin Core specifications.
KFileMetaData - Support for writeback. We currently have plugins which extract metadata from files. It would be awesome if we could also provide plugins to do the reverse.
Removable Media Support - It would be very nice to index the data on removable media and store the indexed information on the removable medium itself.
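
As an illustration of the custom word tokenizer item above, here is a minimal sketch of the splitting rule it asks for: treating extra characters such as _ as word breaks. It is plain Python and does not use Xapian at all; the function name `tokenize` and the `EXTRA_SEPARATORS` constant are hypothetical, and the list item only names _ as an example separator.

```python
import re

# Hypothetical set of extra separator characters; the idea list only names "_".
EXTRA_SEPARATORS = "_"


def tokenize(text, extra_separators=EXTRA_SEPARATORS):
    """Split text into lowercase terms, treating extra_separators as word breaks."""
    # In Python's regex syntax "\w" also matches "_", so the extra separators
    # are turned into spaces first to stop them from gluing words together.
    for ch in extra_separators:
        text = text.replace(ch, " ")
    # Every remaining run of word characters becomes one lowercase term.
    return [term.lower() for term in re.findall(r"\w+", text)]
```

With this rule, `tokenize("my_holiday_photo.JPG")` yields `["my", "holiday", "photo", "jpg"]`, so a search for "holiday" could match the file name. Keeping the separators in one configurable value is a design choice that would make it easy to experiment with further characters later.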