User:Unormal: Difference between revisions
Added some links
→Free Linguistic software tools and framework: Added BabelNet
Revision as of 09:25, 12 July 2012
NLP - Natural Language Programming/Processing in KDE
- Jovie/KTTS - KDE Text-To-Speech
- Sonnet (Spell checking, etc.)
- Simon Listens - Speech Recognition
- KDEedu (Parley, KWordQuiz, KHangman, etc.)
- KMail ("attachment" recognition)
Theory
NLP involves the following tasks:
- Look up - look up a word in a dictionary (to find an antonym, a synonym or a definition)
- Machine translation - translate a text or word from one language to another
- Parsing - extract specific information from a word or text
- Part-of-Speech Tagging - find the corresponding part-of-speech (POS) tag for a word (there are different POS tag sets)
- Segmentation/Tokenization - split a text into sentences or a sentence into words
- Spell checking - check that a word or text is written correctly
- Stemming - Extract the stem of a word
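A few of the tasks above (segmentation, tokenization and stemming) can be sketched in plain Python. This is only a rough illustration using the standard library: the function names `split_sentences`, `tokenize` and `naive_stem` are made up for this example, and the suffix-stripping rule is a toy assumption, not the algorithm used by Snowball or any other tool listed on this page.

```python
import re


def split_sentences(text):
    """Segmentation: split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Tokenization: return words and single punctuation marks as tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def naive_stem(word, suffixes=("ing", "ed", "s")):
    """Stemming (toy rule): strip the first matching suffix if at least
    three characters remain. Hypothetical helper, not a real stemmer."""
    lowered = word.lower()
    for suffix in suffixes:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= 3:
            return lowered[: -len(suffix)]
    return lowered


if __name__ == "__main__":
    text = "KDE checks spelling. Parsing works!"
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        print(tokens, [naive_stem(t) for t in tokens])
```

Real text needs much more care (abbreviations, clitics, irregular forms, non-English morphology), which is what the tools in the table below are for.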
Free Linguistic software tools and framework
Tool | Supported Languages | Type | Version | Programming language | License | Notes |
---|---|---|---|---|---|---|
Apertium | many | machine translation platform | 3.1 | | GPL | |
Aspell | many ;-) | spell checker | 0.61 | | LGPL | successor of Ispell |
Enchant | many | spell checker | 1.6.0 | | | Spell checker for Abiword |
FreeLing | Spanish, Catalan, Galician, Italian, English, Welsh, Portuguese, and Asturian | suite of language analyzers | 2.2 | | GPL | |
BabelNet (http://lcl.uniroma1.it/babelnet/) | English, Catalan, French, German, Italian and Spanish | very large multilingual semantic network | 1.0 | Java | | |
frog | Dutch | tagger and parser | 0.1 | C++, Python | GPLv3 | |
hspell | Hebrew | spell checker (and morphological analyzer) | | | GPL | |
hunmorph | | morphological analyzer | | | | More NLP tools at this page |
hunpos | | tagger | 1.0 | OCaml | BSD | |
HunSpell | many | spell checker and morphological analyzer | 1.2.12 | C, C++ | LGPL & MPL | Spell checker of OpenOffice.org (OOo) |
Ispell | large number of European languages | spell checker | 3.3.02 | | unknown | Probably deprecated |
liblingua-tagger | English | tagger | 0.16 | Perl | unknown | Liblingua on Perl's CPAN provides more taggers and stemmers |
LinkGrammar | English | syntactic parser | 4.1b | | GPL | |
Malaga | German, Italian, Spanish, Finnish (not all free!) | grammar development environment | 7.12 | | GPL | |
mbt | | memory-based tagger-generator and tagger | 3.2.2 | C++ | GPLv3 | |
Morphisto | German | morphological analyzer | | | LGPL & CC | |
MySpell | many | spell checker | | | | Former spell checker of OOo, now deprecated |
nltk | | Natural Language ToolKit | | Python | | |
OpenNLP | unknown | NLP software collection at Apache | | Java | Apache License | |
Stanford Log-linear Part-Of-Speech Tagger | | tagger | | Java | GPL | |
TiMBL | | Tilburg Memory Based Learner | 1.0.0 | C++ | GPLv3 | |
Snowball | many | stemmer library | | C, Java, Python | BSD | |
TreeTagger | many | POS tagger & lemmatizer | | | Tagger License | |
Free text to speech tools
Tool | Supported Languages | Type | Version | Programming language | License | Notes |
---|---|---|---|---|---|---|
Festival | | | | | | |
MBrola | | | | | | |
Additional tools
- Foma - a finite-state machine toolkit and library
- SFST - Stuttgart Finite State Transducer - a toolbox for the implementation of morphological analysers and other tools which are based on finite state transducer technology
Semantics and Co
- LexInfo builds on the lemon model to represent lexical information attached to ontologies on the semantic web
- GOLD is an ontology for descriptive linguistics
Further stuff
- LIMA