Baloo: Difference between revisions

From KDE Community Wiki

Revision as of 00:14, 17 June 2022

Help Konqi find what he wants!

Baloo is the file indexing and file search framework for KDE Plasma, with a focus on providing a very small memory footprint along with extremely fast searching.

Ways to communicate

Mailing List: [email protected] (info page)
IRC Channel: #kde-devel on freenode
Phabricator project: https://phabricator.kde.org/project/view/261

Top bugs and feature requests

Bugs: https://bugs.kde.org/buglist.cgi?bug_severity=critical&bug_severity=grave&bug_severity=major&bug_severity=crash&bug_severity=normal&bug_severity=minor&bug_status=UNCONFIRMED&bug_status=CONFIRMED&bug_status=ASSIGNED&bug_status=REOPENED&list_id=1629910&priority=VHI&priority=HI&product=frameworks-baloo&query_format=advanced

Feature requests: https://bugs.kde.org/buglist.cgi?bug_severity=wishlist&bug_status=UNCONFIRMED&bug_status=CONFIRMED&bug_status=ASSIGNED&bug_status=REOPENED&list_id=1629911&priority=VHI&priority=HI&product=frameworks-baloo&query_format=advanced

Indexing limitations

Baloo uses the file metadata extractors in KFileMetadata to get information about each file it indexes. This means that for a file's content to be indexed:

  • the file must have a recognizable MIME type
  • KDE must have an extractor for that MIME type. Use the command-line utility kmimetypefinder5 to determine a file's MIME type.
    • Due to a glib bug, the MIME type of HTML files can change from text/html to application/x-extension-html. The KDE file metadata extractors don't recognize the latter. That bug has a workaround to reset the MIME types to the usual values.
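The checks above can be sketched as a small script. This is only an illustration, not part of Baloo: the helper names detect_mime and diagnose are hypothetical, and it assumes that kmimetypefinder5 is on the PATH and prints the detected MIME type on standard output when given a file path.

```python
import shutil
import subprocess
from typing import Optional

# MIME type that, per the text above, HTML files can end up with because of the glib bug.
MISDETECTED_HTML = "application/x-extension-html"


def detect_mime(path: str) -> Optional[str]:
    """Ask kmimetypefinder5 for the MIME type of `path`.

    Returns None when the tool is missing or fails. Assumption: the tool
    prints the MIME type on stdout when called with a single file path.
    """
    tool = shutil.which("kmimetypefinder5")
    if tool is None:
        return None
    result = subprocess.run([tool, path], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


def diagnose(mime: Optional[str]) -> str:
    """Classify a detected MIME type against the limitations listed above."""
    if mime is None:
        return "unknown"        # no recognizable MIME type, so no content indexing
    if mime == MISDETECTED_HTML:
        return "glib-html-bug"  # HTML file whose MIME type was changed by the glib bug
    return "needs-extractor"    # indexed only if KFileMetadata has an extractor for it


if __name__ == "__main__":
    import sys
    for name in sys.argv[1:]:
        mime = detect_mime(name)
        print(f"{name}: {mime} -> {diagnose(mime)}")
```

Running the script on a file that is missing from search results shows which of the conditions is the likely cause. Whether an extractor exists for a given MIME type still has to be checked separately.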

Other limitations:

  • Baloo doesn't index text files (those whose MIME type is detected as "text/something") over 10 MB (source).
  • The KFileMetadata extractor for text attempts to convert text to Unicode. If the file uses another encoding, such as iso-8859-1, any file contents after the first character that is invalid in Unicode will not be indexed (bug 439857). You may find the -i option to the file command-line utility useful; it tries to infer the character set of a file, e.g. file -i path/to/myfile.txt.
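To see how much of a text file is affected by these two limitations, a rough check like the following may help. It is a sketch under stated assumptions: the function names are hypothetical, the 10 MB limit is taken to mean 10 × 1024 × 1024 bytes, and Python's UTF-8 decoder stands in for the extractor's own conversion, which may behave differently in detail.

```python
from typing import Optional

# Assumption: "10 MB" is interpreted here as 10 * 1024 * 1024 bytes.
TEXT_SIZE_LIMIT = 10 * 1024 * 1024


def exceeds_text_limit(mime: str, size_bytes: int) -> bool:
    """True if a file with this MIME type and size falls under the text size limit."""
    return mime.startswith("text/") and size_bytes > TEXT_SIZE_LIMIT


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the byte offset of the first sequence that is not valid UTF-8.

    Returns None when the whole input decodes cleanly. Content from the
    returned offset onward is what would be lost according to the text above.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


if __name__ == "__main__":
    sample = "café au lait".encode("iso-8859-1")
    print(first_invalid_utf8_offset(sample))  # → 3
```

In the sample, the byte for "é" in ISO-8859-1 is not valid UTF-8, so only the first three bytes ("caf") precede the problem. Converting such a file to UTF-8 before indexing avoids the truncation.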

Other Baloo pages here

Information may be obsolete.

Using Baloo

Baloo is not an application, but a daemon to index files. Applications can use the Baloo framework to provide file search results. For example, Dolphin's Content search can use Baloo.

KDE System Settings > File Search provides an intentionally limited number of settings. You can make additional adjustments in Baloo's configuration file.

balooctl

balooctl is a CLI command to perform certain operations on Baloo. Enter balooctl --help in a terminal app such as Konsole to list its available subcommands.

See also Baloo/Debugging.