Contents

1 NLP - Natural Language Programming/Processing in KDE
  1.1 Theory
  1.2 Free Linguistic software tools and framework
  1.3 Free text to speech tools
    1.3.1 Additional tools
  1.4 Semantics and Co
    1.4.1 Further stuff

NLP - Natural Language Programming/Processing in KDE

- Jovie/KTTS - KDE Text-To-Speech
- Sonnet (Spell checking, etc.)
- Simon Listens - Speech Recognition
- KDEedu (Parley, KWordQuiz, KHangman, etc.)
- KMail ("attachment" recognition)

Theory

NLP involves the following tasks:

- Look up - look a word up in a dictionary (to find an antonym, synonym, or definition)
- Machine translation - translate a text or word from one language to another
- Parsing - extract specific information from a word or text
- Part-of-speech tagging - find the corresponding part-of-speech (POS) tag for a word (there are different POS tag sets)
- Segmentation/Tokenization - split a text or sentence into words or sentences
- Spell checking - check that a word or text is written correctly
- Stemming - extract the stem of a word

Free Linguistic software tools and framework

| Tool | Supported Languages | Type | Version | Programming language | License | Notes |
|---|---|---|---|---|---|---|
| Apertium | many | machine translation platform | 3.1 | | GPL | |
| Aspell | many ;-) | spell checker | 0.61 | | LGPL | successor of Ispell |
| Enchant | many | spell checker | 1.6.0 | | | Spell checker for Abiword |
| FreeLing | Spanish, Catalan, Galician, Italian, English, Welsh, Portuguese, and Asturian | suite of language analyzers | 2.2 | | GPL | |
| BabelNet | English, Catalan, French, German, Italian and Spanish | A very large multilingual semantic network | 1.0 | Java | Creative Commons Attribution-Noncommercial-Share Alike 3.0 | |
| DGT-TM | Bulgarian, Czech, Danish, Dutch, English, Estonian, German, Greek, Finnish, French, Hungarian, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Slovak, Slovene, Spanish and Swedish | A freely available large-scale translation memory in 22 languages | | Java | EUPL | |
| frog | Dutch | tagger and parser | 0.1 | C++, Python | GPLv3 | |
| hspell | Hebrew | spell checker (and morphological analyzer) | | | GPL | |
| hunmorph | | morphological analyzer | | | | More NLP tools at this page |
| hunpos | | tagger | 1.0 | OCaml | BSD | |
| HunSpell | many | spell checker and morphological analyzer | 1.2.12 | C, C++ | LGPL & MPL | Spell checker of OOo |
| Ispell | large number of European languages | spell checker | 3.3.02 | | unknown | Probably deprecated |
| LanguageTool | many | style and grammar proofreading software | 1.8 | Java | LGPL | |
| liblingua-tagger | English | tagger | 0.16 | Perl | Unknown | Liblingua in Perl's CPAN module provides more taggers and stemmers |
| LinkGrammar | English | syntactic parser | 4.1b | | GPL | |
| Link Grammar Parser | English and more | syntactic parser | 4.7.6 | Probably C | GPL | |
| Malaga | German, Italian, Spanish, Finnish (not all free!) | grammar development environment | 7.12 | | GPL | |
| mbt | | Memory-based tagger-generator and tagger | 3.2.2 | C++ | GPLv3 | |
| Morphisto | German | morphological analyzer | | | LGPL & CC | |
| MySpell | many | spell checker | | | | Former spell checker of OOo, now deprecated |
| nltk | | Natural Language ToolKit | | Python | | |
| OpenNLP | unknown | NLP software collection at Apache | | Java | Apache License | |
| Stanford Log-linear Part-Of-Speech Tagger | | tagger | | Java | GPL | |
| TiMBL | | Tilburg Memory Based Learner | 1.0.0 | C++ | GPLv3 | |
| Snowball | many | stemmer library | | C, Java, Python | BSD | |
| SVM | English, Catalan, Spanish | An Open Source generator of sequential taggers | 1.3.2 | Perl | LGPL | |
| TreeTagger | many | POS tagger & lemmatizer | | | Tagger License | |

Stanford list of NLP tools

Free text to speech tools

| Tool | Supported Languages | Type | Version | Programming language | License | Notes |
|---|---|---|---|---|---|---|
| Festival | | | | | | |
| MBrola | | | | | | |

Additional tools

- Foma - a finite-state machine toolkit and library
- SFST - Stuttgart Finite State Transducer - a toolbox for implementing morphological analysers and other tools based on finite-state transducer technology

Semantics and Co

- LexInfo builds on the lemon model to represent lexical information attached to ontologies on the semantic web
- GOLD is an ontology for descriptive linguistics

Further stuff

- LIMA

Retrieved from "https://community.kde.org/index.php?title=User:Unormal&oldid=22716"
This page was last edited on 12 July 2012, at 21:31. Content is available under Creative Commons License SA 4.0 unless otherwise noted.
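
Two of the tasks listed under Theory, segmentation/tokenization and stemming, can be illustrated with a minimal Python sketch. It uses only the standard library and does not call any tool from the tables above. The names `split_sentences`, `tokenize`, `naive_stem` and the `SUFFIXES` list are hypothetical, chosen only for this illustration, and the suffix stripping is a deliberately crude stand-in for a real stemmer such as Snowball.

```python
import re

# Toy English suffix list; an assumption made for this illustration only.
# Real stemmers (e.g. Snowball) apply far more careful rules.
SUFFIXES = ("ing", "ed", "es", "s")


def split_sentences(text):
    """Split text into sentences after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Split a sentence into lowercase word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


if __name__ == "__main__":
    text = "KDE speaks. Sonnet checks spelling!"
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        print(tokens, [naive_stem(t) for t in tokens])
```

A heuristic like this over-strips and under-strips in many cases (for example, it reduces "parsed" to "pars"), which is why dedicated stemmer libraries and per-language rule sets exist.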