Changelog History

  • v2.1.0 Changes

    December 01, 2019

    🚀 We are pleased to announce the release of

    DKPro Core 2.1.0

    a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework version 3.

    https://dkpro.github.io/dkpro-core

    🚀 This is a feature release.

    Notable changes since DKPro Core 2.0.0

    • ➕ Added option to export XMI using XML 1.1 to avoid issues with certain characters
    • ➕ Added option to CoNLL readers to trim off whitespace from field values to avoid users having issues with incidental space characters (default is on)
    • ➕ Added support for annotator notes in brat format
    • 👌 Improved speed for writing WebAnno TSV format (backported from WebAnno)
    • 🛠 Fixed a couple of issues with the CoNLL 2012 format
    • 🛠 Fixed default extension for CoNLL-U writer
    • 🛠 Fixed problem in CoNLL-U writer when text contains line breaks
    • 🛠 Fixed problem that LanguageToolChecker did not fill in suggestions
    • 🛠 Fixed setting div type on paragraphs created by CoNLL-U reader
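
    The XML 1.1 export option matters because XML 1.0 cannot represent most control characters at all, not even as character references. The sketch below is independent of DKPro Core; the class and method names are hypothetical. It checks code points against the Char productions of the two XML versions to show which text would need XML 1.1.

```java
/** Sketch: which code points may appear in XML 1.0 versus XML 1.1 documents. */
final class XmlCharCheck {
    private XmlCharCheck() {
    }

    /** True if the code point matches the Char production of XML 1.0. */
    static boolean isLegalInXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** True if the code point matches the Char production of XML 1.1. */
    static boolean isLegalInXml11(int cp) {
        return (cp >= 0x1 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** True if the text has a code point that XML 1.0 rejects but XML 1.1 accepts. */
    static boolean needsXml11(String text) {
        return text.codePoints()
                .anyMatch(cp -> !isLegalInXml10(cp) && isLegalInXml11(cp));
    }
}
```

    Note that XML 1.1 still requires most control characters to be written as character references, and U+0000 is illegal in both versions.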

    🚀 A more detailed overview of the changes in this release can be found [2].

    Thanks to all contributors!

    ⬆️ When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

    [1] https://github.com/dkpro/dkpro-core/releases/tag/rel%2Fdkpro-core-2.1.0
    [2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A2.1.0

  • v2.0.0 Changes

    September 08, 2019

    🚀 We are pleased to announce the release of

    DKPro Core 2.0.0

    a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

    https://dkpro.github.io/dkpro-core

    🚀 This is a feature release.

    ⬆️ Important upgrade notice

    This version requires UIMA v3.

    ⬆️ If you are upgrading from DKPro Core 1.10.x or earlier, please read the DKPro Core 1.11.0 upgrade notice [1].

    Notable changes since DKPro Core 1.11.1

    • Switched to UIMA v3
    • ➕ Added filling in suggestions to LanguageToolChecker
    • ➕ Added support for notes to BratReader
    • ➕ Added basic read support for Perseus XML format
    • 👌 Improved error message when StanfordNamedEntityRecognizerTrainer is called without training data
    • 🚚 Moved StopwordRemover to tokit module and removed stopwordremover module
    • 📇 Renamed lancaster module to smile
    • ✂ Removed Tag type from syntax module
    • ... and a few additional under-the-hood changes

    🚀 A more detailed overview of the changes in this release can be found [2].

    Thanks for contributions go to: @alaindesilets, @mischor

    ⬆️ When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

    [1] https://github.com/dkpro/dkpro-core/releases/tag/rel%2Fdkpro-core-2.0.0
    [2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A2.0.0

  • v1.12.0 Changes

    December 01, 2019

    🚀 We are pleased to announce the release of

    DKPro Core 1.12.0

    a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework version 2.

    https://dkpro.github.io/dkpro-core

    🚀 This is a feature release.

    ⬆️ Important upgrade notice

    ⬆️ If you are upgrading from DKPro Core 1.10.x or earlier, please read the DKPro Core 1.11.0 upgrade notice [1].

    Notable changes since DKPro Core 1.11.1

    • ➕ Added option to export XMI using XML 1.1 to avoid issues with certain characters
    • ➕ Added option to CoNLL readers to trim off whitespace from field values to avoid users having issues with incidental space characters (default is on)
    • ➕ Added support for annotator notes in brat format
    • 👌 Improved speed for writing WebAnno TSV format (backported from WebAnno)
    • 🛠 Fixed a couple of issues with the CoNLL 2012 format
    • 🛠 Fixed default extension for CoNLL-U writer
    • 🛠 Fixed problem in CoNLL-U writer when text contains line breaks
    • 🛠 Fixed problem that LanguageToolChecker did not fill in suggestions
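
    The whitespace-trimming option for CoNLL readers can be pictured with the following sketch. It is not the DKPro Core reader code: it assumes tab-separated fields, and the class, method, and `trim` flag are hypothetical names used only for illustration.

```java
/** Sketch: splitting one CoNLL-style line into fields, optionally trimming each field. */
final class ConllFieldTrim {
    private ConllFieldTrim() {
    }

    /**
     * Splits a tab-separated line into its fields. With trim enabled, incidental
     * spaces around a field value are removed; with trim disabled, they are kept.
     */
    static String[] splitFields(String line, boolean trim) {
        // The limit of -1 keeps trailing empty fields instead of dropping them.
        String[] fields = line.split("\t", -1);
        if (trim) {
            for (int i = 0; i < fields.length; i++) {
                fields[i] = fields[i].trim();
            }
        }
        return fields;
    }
}
```

    Without trimming, a value such as `"NOUN "` would not equal the tag `"NOUN"`, which is the kind of hard-to-spot problem the option is meant to prevent.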

    🚀 A more detailed overview of the changes in this release can be found [2].

    Thanks to all contributors!

    ⬆️ When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

    [1] https://github.com/dkpro/dkpro-core/releases/tag/rel%2Fdkpro-core-1.12.0
    [2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A1.12.0

  • v1.11.1 Changes

    August 17, 2019

    🚀 We are pleased to announce the release of

    DKPro Core 1.11.1

    a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

    https://dkpro.github.io/dkpro-core

    🛠 This is a bugfix release.

    ⬆️ Important upgrade notice

    ⬆️ If you are upgrading from DKPro Core 1.10.x or earlier, please read the DKPro Core 1.11.0 upgrade notice [1].

    Notable changes since DKPro Core 1.11.0

    • 🛠 Fixed trimming of whitespace at the start and end of annotations
    • 🛠 Fixed encoding of named entity categories in LIF format
    • 🛠 Fixed unescaping of URI-encoded characters when writing files
    • ➕ Added parameter to control whitespace normalization in HtmlDocumentReader
    • ➕ Added parameters to control indentation and output method in XmlDocumentWriter
    • 👌 Improved exception in Stanford CoreNLP NER trainer when no documents have been processed
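
    Trimming whitespace at the start and end of an annotation amounts to shrinking its begin and end offsets. The sketch below shows the general idea on plain character offsets. It is an illustration under that assumption, not the code that was fixed in this release, and the class and method names are hypothetical.

```java
/** Sketch: shrinking a [begin, end) span so it neither starts nor ends with whitespace. */
final class SpanTrim {
    private SpanTrim() {
    }

    /**
     * Returns the trimmed offsets as {begin, end}. A span that consists only of
     * whitespace collapses to an empty span located at its original end offset.
     */
    static int[] trim(CharSequence text, int begin, int end) {
        int b = begin;
        int e = end;
        // Move the start forward past leading whitespace.
        while (b < e && Character.isWhitespace(text.charAt(b))) {
            b++;
        }
        // Move the end backward past trailing whitespace.
        while (e > b && Character.isWhitespace(text.charAt(e - 1))) {
            e--;
        }
        return new int[] {b, e};
    }
}
```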

    🚀 A more detailed overview of the changes in this release can be found [2].

    Thanks for contributions go to: @az79nefy, @ramonziai, @manuelciosici, @Horsmann, @tilmanbeck, @alaindesilets, @jcklie

    ⬆️ When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

    [1] https://github.com/dkpro/dkpro-core/releases/tag/dkpro-core-1.11.0
    [2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A1.11.1

  • v1.11.0 Changes

    July 05, 2019

    🚀 We are pleased to announce the release of

    DKPro Core 1.11.0

    a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

    https://dkpro.github.io/dkpro-core

    🚀 This is a feature release.

    ⬆️ Important upgrade notice

    • 🔄 Changed groupIds and artifactIds. The group ID is now org.dkpro.core and the artifact IDs are dkpro-core-...-(asl/gpl)
    • 🔄 Changed package names. The packages are now all starting with org.dkpro.core... - except the packages of UIMA types which remain unchanged for data compatibility.

    Notable changes since DKPro Core 1.10.0

    • 🔄 Changed parts of the brat data conversion code such that it can be more easily used outside a UIMA component
    • 🔄 Changed type mapping such that out-of-tagset types map to the generic type (e.g. an unknown POS tag maps to POS, not to POS_X)
    • 🔄 Changed name of NYTCollectionReader to NitfReader
    • ➕ Added types to encode XML document structure in CAS
    • ➕ Added new XmlDocumentReader/Writer components using these types
    • ➕ Added basic reader for Annotated Gigaword corpus (only reads text so far) (thanks @az79nefy)
    • ➕ Added basic support for PubAnnotation JSON format
    • ➕ Added Maui component for keyword assignment
    • ➕ Added parameter to SfstAnnotator to enable lower-case lookup of first word in a sentence (thanks @rziai)
    • ➕ Added "order" feature to Token type
    • ➕ Added support for CoNLL-U document and paragraph IDs (thanks @manuelciosici)
    • ➕ Added support for CoNLL-U sentence IDs and text
    • ➕ Added standardized parameter to disable type mapping
    • ➕ Added support for TCF orthography layer using SofaChangeAnnotations
    • ➕ Added segmenter for Chinese using jieba (thanks @Horsmann)
    • ➕ Added MyStem for Russian
    • ➕ Added links to OpenMinTeD categories in type system documentation
    • ➕ Added support for reading/writing the CoreNLP CoNLL flavor
    • ➕ Added parameter to configure the Tika buffer size (useful for large documents)
    • ⚡️ Updated to OpenNLP 1.9.1
    • ⚡️ Updated to CoreNLP 3.9.2
    • ⚡️ Updated to ICU4J 64.2
    • ⚡️ Updated to Tika 1.19.1
    • ⚡️ Updated to LanguageTool 4.3
    • ⚡️ Updated to PDFBox 2.0.12
    • ⚡️ Updated IllinoisNLP components
    • ⚡️ Updated TreeTagger models/binaries in build.xml script (thanks @tilmanbeck)
    • ⚡️ Updated LIF dependencies
    • ⚡️ Updated dataset descriptions
    • ⚡️ Updated various general dependencies (e.g. Apache Commons etc.)
    • 👌 Improved robustness of checksum verification for text files used in datasets (e.g. license files)
    • 👌 Improved error messages in WebAnno TSV3 module
    • 🛠 Fixed crash in WebannoTsv3XWriter when annotations do not start/end at token boundaries
    • 🛠 Fixed bug in WebAnno TSV3 support causing span annotations with slot features to disappear
    • 🛠 Fixed trimming of whitespace in TeiReader
    • 🛠 Fixed bug in NifWriter causing named entity identifier not to be written
    • 🛠 Fixed crash in BratReader with reading discontinuous segments
    • 🛠 Fixed problem in BratWriter when dealing with slot features
    • 🛠 Fixed metadata of CoNLL2012Writer
    • 🛠 Fixed potential problem of datasets being written outside their target directory
    • ⬇️ Dropped the GrAF I/O module since the upstream libraries are outdated and no longer maintained
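
    The type mapping change listed above (out-of-tagset tags now map to the generic type) can be sketched as a lookup with a default. This is not DKPro Core's mapping implementation: the class, the method, and the sample table are hypothetical, and only the type names POS and POS_X are taken from the release note.

```java
import java.util.Map;

/** Sketch: mapping a tagger-specific POS tag to a type name with a generic fallback. */
final class TagMapping {
    /** Generic type used for tags that the mapping table does not know. */
    static final String GENERIC_TYPE = "POS";

    private TagMapping() {
    }

    /**
     * Looks up the tag in the given table. An unknown tag maps to the generic
     * type rather than to a catch-all type such as POS_X.
     */
    static String mapPosTag(Map<String, String> mapping, String tag) {
        return mapping.getOrDefault(tag, GENERIC_TYPE);
    }
}
```

    For example, with a table that maps `NN` to an illustrative `POS_NOUN` type, `mapPosTag(table, "NN")` yields `POS_NOUN`, while a tag missing from the table yields `POS`.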

    🚀 A more detailed overview of the changes in this release can be found here.

    Thanks for contributions go to: @az79nefy, @ramonziai, @manuelciosici, @Horsmann, @tilmanbeck

    ⬆️ When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

  • v1.10.0 Changes

    September 10, 2018

    🚀 We are pleased to announce the release of

    DKPro Core 1.10.0

    a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

    https://dkpro.github.io/dkpro-core

    🚀 This is a feature release.

    Notable changes since DKPro Core 1.9.3

    • ➕ Added support for Arabic to CoreNlpSegmenter (thanks @Jibun)
    • ➕ Added support for Token "form" to CoNLL writers (thanks @Jibun)
    • ➕ Added ability to provide extra non-standard parameters to CoreNlpSegmenter (thanks @Jibun)
    • ➕ Added ArkTweet POS tagger trainer (thanks @schrieveslaach)
    • ➕ Added WebAnno TSV3 reader/writer
    • ➕ Added reader for Leipzig Corpora Collection
    • ⬆️ Upgraded to CoreNLP 3.9.1 (stanfordnlp and corenlp modules)
    • ⬆️ Upgraded to OpenNLP 1.9.0
    • ⬆️ Upgraded to PDFBox 2.0.9 (io-pdf module)
    • ⬆️ Upgraded to LanguageTool 4.2
    • ⬆️ Upgraded to CogComp 4.0.7 (lbj module)
    • ⬆️ Upgraded to Tika 1.18 (io-tika module)
    • 👌 Improved handling of multi-line annotations in brat module (thanks @parisni)
    • 🛠 Fixed discontinuous annotations crashing the brat reader by reading only the first fragment
    • ➕ Added dataset description for GUM 4.1.0 dataset
    • Removed PARAM_INTERN_TAGS
    • 👌 Improved component metadata

    🚀 A more detailed overview of the changes in this release can be found here.

    Thanks for contributions go to: @Jibun, @parisni, @schrieveslaach, @jgrivolla

    ⬆️ When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

  • v1.9.3 Changes

    July 28, 2018

    🚀 We are pleased to announce the release of

    DKPro Core 1.9.3

    a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

    🚀 This is a bug-fix and minor feature release.

    Notable changes since DKPro Core 1.9.2

    • ➕ Added ability to restore Backmapper alignment data after a CAS restore
    • ➕ Added ability to specify a cluster resource name for the ArkTweet POS-tagger trainer
    • Added PARAM_MODEL_ENCODING to TreeTaggerChunker
    • 🛠 Fixed issue that DictionaryAnnotator did not match at the sentence end
    • Ensured that all parameters have a description

    🚀 A more detailed overview of the changes in this release can be found here.

    Thanks for contributions go to: @nilsreiter, @mjunsilo, @schrieveslaach, @jkirsch

    ⬆️ When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

  • v1.9.2 Changes

    July 28, 2018

    🚀 We are pleased to announce the release of

    DKPro Core 1.9.2

    a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

    🚀 This is a bug-fix and minor feature release.

    Notable changes since DKPro Core 1.9.1

    • 👍 Allow explicitly specifying a model artifact when running a model-based component
    • 🛠 Fixed auto-loading of models in CoreNLP module
    • 🛠 Fixed issue causing PdfReader to create annotations with leading/trailing whitespace
    • ➕ Added more OMTD-SHARE metadata and UIMA capabilities
    • Avoid failing when encountering a discontinuous segment in brat files

    🚀 A more detailed overview of the changes in this release can be found here.

    Thanks for contributions go to: @nilsreiter, @mjunsilo

    ⬆️ When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

  • v1.9.1 Changes

    April 05, 2018

    🚀 We are pleased to announce the release of

    DKPro Core 1.9.1

    a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

    🚀 This is a bug-fix and minor feature release.

    Notable changes since DKPro Core 1.9.0

    • 📇 Included OMTD-SHARE metadata
    • 👌 Improved mapping capabilities and robustness of the BratReader
    • ➕ Added option to mark split tokens in CamelCaseTokenSegmenter
    • 🛠 Fixed hash for CC-BY 4.0 license in dataset API
    • 🛠 Fixed NPE in CoNLL 2012 reader
    • ⬆️ Upgraded to LanguageTool 4.1
    • ⬆️ Upgraded to ICU4J 61.1
    • ⬆️ Upgraded to JTok 2.1.18
    • ⬆️ Upgraded to OpenNLP 1.8.4

    🚀 A more detailed overview of the changes in this release can be found here.

    Thanks for contributions go to: @nilsreiter, @mjunsilo

    ⬆️ When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.