DKPro Core is a collection of software components for natural language processing (NLP) based on the Apache UIMA framework. Many NLP tools are already freely available in the research community. DKPro Core provides Apache UIMA components that wrap these tools (along with some original ones) so they can be used interchangeably in UIMA processing pipelines. It builds heavily on uimaFIT, which allows rapid and easy development of NLP pipelines, wrapping of existing tools, and creation of original UIMA components.
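
The idea of wrapping different tools behind one component interface so they can be swapped inside a pipeline can be sketched in plain Java. This is only an illustration of the design and does not use the UIMA, uimaFIT, or DKPro Core APIs. Every name in it (`PipelineSketch`, `Document`, `Component`, `WhitespaceTokenizer`, `WordTokenizer`, `Lowercaser`, `run`) is hypothetical, and the `Document` class is a simplified stand-in for the shared analysis structure that a real UIMA pipeline passes between components.

```java
import java.util.ArrayList;
import java.util.List;

public class PipelineSketch {

    // Hypothetical shared document object that every component reads and updates.
    static class Document {
        final String text;
        final List<String> tokens = new ArrayList<>();

        Document(String text) {
            this.text = text;
        }
    }

    // Hypothetical common interface implemented by every wrapped tool.
    interface Component {
        void process(Document doc);
    }

    // First tokenizer: splits the text on runs of whitespace.
    static class WhitespaceTokenizer implements Component {
        public void process(Document doc) {
            doc.tokens.clear();
            for (String t : doc.text.trim().split("\\s+")) {
                if (!t.isEmpty()) {
                    doc.tokens.add(t);
                }
            }
        }
    }

    // Second tokenizer: splits on anything that is not a letter or a digit.
    static class WordTokenizer implements Component {
        public void process(Document doc) {
            doc.tokens.clear();
            for (String t : doc.text.split("[^\\p{L}\\p{N}]+")) {
                if (!t.isEmpty()) {
                    doc.tokens.add(t);
                }
            }
        }
    }

    // Downstream step that relies only on the token list, not on which tokenizer filled it.
    static class Lowercaser implements Component {
        public void process(Document doc) {
            doc.tokens.replaceAll(String::toLowerCase);
        }
    }

    // Runs the components in order over a fresh document and returns the resulting tokens.
    static List<String> run(String text, Component... pipeline) {
        Document doc = new Document(text);
        for (Component c : pipeline) {
            c.process(doc);
        }
        return doc.tokens;
    }

    public static void main(String[] args) {
        String text = "Hello, UIMA world!";
        // Same downstream step, two interchangeable tokenizers.
        System.out.println(run(text, new WhitespaceTokenizer(), new Lowercaser()));
        System.out.println(run(text, new WordTokenizer(), new Lowercaser()));
    }
}
```

Because `Lowercaser` depends only on the shared `Document`, either tokenizer can be placed in front of it without changing any other code, which is the kind of interchangeability the description above refers to.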

Programming language: Java
License: GNU General Public License v3.0 or later
Latest version: v2.1.0

  • CoreNLP

    CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
  • Mallet

    MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.
  • CogCompNLP

    CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer, transliteration, verb-sense, and more.
  • LingPipe

    Toolkit for a variety of tasks ranging from POS tagging to sentiment analysis.
  • DKPro

    Collection of reusable NLP tools for linguistic pre-processing, machine learning, lexical resources, etc.

