Projects/Nepomuk/FileIndexing


This page catalogues the file formats Nepomuk supports and the formats that remain to be handled.

Mime Types

MimeType             | Status                         | Plugin               | Comments
image/jpeg           | Testing                        | Exiv2Extractor       | No Comments
image/png            | Testing                        | Exiv2Extractor       | -
image/gif            | ?                              | ?                    |
image/exif           |                                |                      |
image/tiff           |                                |                      |
image/bmp            |                                |                      |
image/svg            |                                |                      |
audio/mpeg           | Requires Polish                | Taglib Extractor     |
audio/mp4            |                                |                      |
audio/wav            |                                |                      |
audio/x-aiff         |                                |                      |
application/pdf      | Implemented - Requires Testing | PopplerExtractor     | -
Other Office Formats | ?                              |                      |
Ebook Formats        | ?                              |                      |
Archives             | ?                              |                      |
video/mpeg           | Testing                        | FFmpeg               |
video/x-msvideo      | Testing                        | FFmpeg               |
Other video formats  | ?                              |                      |
text/plain           | Implemented                    | Plain Text Extractor | This should be extended to support other text files
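
The table above is essentially a lookup from mimetype to extractor plugin. A minimal sketch of that lookup follows, written in Python for brevity rather than as actual Nepomuk code. The plugin names are copied from the table; the `EXTRACTORS` registry and the `extractor_for` helper are hypothetical names introduced only for illustration.

```python
# Hypothetical registry mirroring the table; only rows that name a plugin
# are listed. This is an illustration, not Nepomuk's real plugin loader.
EXTRACTORS = {
    "image/jpeg": "Exiv2Extractor",
    "image/png": "Exiv2Extractor",
    "audio/mpeg": "Taglib Extractor",
    "application/pdf": "PopplerExtractor",
    "video/mpeg": "FFmpeg",
    "video/x-msvideo": "FFmpeg",
    "text/plain": "Plain Text Extractor",
}


def extractor_for(mimetype):
    """Return the plugin name registered for a mimetype, or None."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    base = mimetype.split(";", 1)[0].strip().lower()
    return EXTRACTORS.get(base)
```

A mimetype with no entry, such as image/gif, yields `None`, which matches the rows of the table that have no plugin yet.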

Notes

Documents

Microsoft Formats

  • DOC - OLE 2 Compound Document and Office Open XML. Strigi used a custom parser. What can we use?
  • XLS - http://qt-project.org/wiki/Handling_Microsoft_Excel_file_format
  • spreadsheet formats

Maybe we can use some LibreOffice or Calligra libraries?

Open document formats

ODF - Strigi had its own built-in parser. What are our options?

Ebook formats

  • epub - Strigi reuses its ODF parser for epub. We could use libepub.
  • mobi
  • rtf
  • lrf

Check out what Okular uses for all these formats and use that.
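
An ODF parser can be reused for epub because both are ZIP containers holding XML metadata. As a rough sketch of what an epub extractor has to do, the Python below reads a few Dublin Core fields using only the standard library. It does not use libepub or any Okular code, and the function name `read_epub_metadata` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
DC_NS = "http://purl.org/dc/elements/1.1/"


def read_epub_metadata(source):
    """Return the Dublin Core title, creator and language found in an epub.

    `source` is a file path or a file-like object. Fields that are missing
    from the package document are left out of the result.
    """
    with zipfile.ZipFile(source) as archive:
        # META-INF/container.xml names the package (OPF) document.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        opf = ET.fromstring(archive.read(rootfile.get("full-path")))

    metadata = {}
    for field in ("title", "creator", "language"):
        element = opf.find(".//{%s}%s" % (DC_NS, field))
        if element is not None and element.text:
            metadata[field] = element.text.strip()
    return metadata
```

A real extractor would also need to handle malformed archives and map the fields onto ontology properties; both are left out of this sketch.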

Other

  • lyx
  • tex
  • cbz - Comic books

Archives

We just need to add the nfo:Archive type based on the mimetype. Is there anything else that we can add?
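
Adding the type from the mimetype amounts to a set membership test. A minimal Python sketch follows, assuming a hand-picked and non-exhaustive set of archive mimetypes; the names `ARCHIVE_MIMETYPES` and `archive_types` are hypothetical, and a real indexer would more likely ask the shared mime database whether a type is an archive.

```python
# Full URI of nfo:Archive in the NEPOMUK File Ontology.
NFO_ARCHIVE = "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#Archive"

# Illustrative, not exhaustive: a few common archive mimetypes.
ARCHIVE_MIMETYPES = {
    "application/zip",
    "application/x-tar",
    "application/gzip",
    "application/x-bzip2",
    "application/x-7z-compressed",
}


def archive_types(mimetype):
    """Return the extra RDF types to attach to a file with this mimetype."""
    if mimetype.strip().lower() in ARCHIVE_MIMETYPES:
        return [NFO_ARCHIVE]
    return []
```

Returning a list keeps the door open for attaching further types later without changing the callers.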

Emails

  • mbox format - How? Perhaps something from KDE PIM?

This page was last modified on 6 November 2012, at 01:23. Content is available under Creative Commons License SA 4.0 unless otherwise noted.