Changelog History

  • v2.1.0 Changes

    December 01, 2019

    We are pleased to announce the release of DKPro Core 2.1.0, a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework version 3.

    https://dkpro.github.io/dkpro-core

    This is a feature release.

    Notable changes since DKPro Core 2.0.0

    • Added option to export XMI using XML 1.1 to avoid issues with certain characters
    • Added option to CoNLL readers to trim off whitespace from field values to avoid users having issues with incidental space characters (default is on)
    • Added support for annotator notes in brat format
    • Improved speed for writing WebAnno TSV format (backported from WebAnno)
    • Fixed a couple of issues with the CoNLL 2012 format
    • Fixed default extension for CoNLL-U writer
    • Fixed problem in CoNLL-U writer when text contains line breaks
    • Fixed problem that LanguageToolChecker did not fill in suggestions
    • Fixed setting div type on paragraphs created by CoNLL-U reader
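
The XML 1.1 export option is relevant because XML 1.0 cannot represent most control characters, not even as character references, while XML 1.1 admits them in escaped form. The following is a minimal sketch of that difference based only on the character ranges of the two XML specifications. The class and method names are illustrative and not part of DKPro Core, and which characters actually motivated the option is an assumption.

```java
/** Illustrative helper (not part of DKPro Core): which code points can XML carry? */
final class XmlCharCheck {

    /** Char production of XML 1.0: tab, LF, CR and U+0020 upwards, minus surrogates, U+FFFE and U+FFFF. */
    static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Char production of XML 1.1: additionally admits U+0001 to U+001F (most of them only as character references). */
    static boolean isValidXml11(int cp) {
        return (cp >= 0x1 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the index of the first code point that XML 1.0 cannot represent, or -1 if there is none. */
    static int firstXml10Problem(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isValidXml10(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args) {
        String text = "form\ffeed"; // a form feed, as it may appear in extracted text
        System.out.println("First XML 1.0 problem at index: " + firstXml10Problem(text));
        System.out.println("Form feed allowed in XML 1.1: " + isValidXml11(0x0C));
    }
}
```

A check like this can be run over document text before export to decide whether XML 1.0 output is safe. The NUL character remains invalid in both versions.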

    A more detailed overview of the changes in this release can be found [2].

    Thanks to all contributors!

    When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

    [1] https://github.com/dkpro/dkpro-core/releases/tag/rel%2Fdkpro-core-2.1.0
    [2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A2.1.0

  • v2.0.0 Changes

    September 08, 2019

    We are pleased to announce the release of DKPro Core 2.0.0, a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

    https://dkpro.github.io/dkpro-core

    This is a feature release.

    Important upgrade notice

    This version requires UIMA v3.

    If you are upgrading from DKPro Core 1.10.x or earlier, please read the DKPro Core 1.11.0 upgrade notice [1].

    Notable changes since DKPro Core 1.11.1

    • Switched to UIMA v3
    • Added filling in suggestions to LanguageToolChecker
    • Added support for notes to BratReader
    • Added basic read support for Perseus XML format
    • Improved error message when StanfordNamedEntityRecognizerTrainer is called without training data
    • Moved StopwordRemover to tokit module and removed stopwordremover module
    • Renamed lancaster module to smile
    • Removed Tag type from syntax module
    • ... and a few additional under-the-hood changes

    A more detailed overview of the changes in this release can be found [2].

    Thanks for contributions go to: @alaindesilets, @mischor

    When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

    [1] https://github.com/dkpro/dkpro-core/releases/tag/rel%2Fdkpro-core-2.0.0
    [2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A2.0.0

  • v1.12.0 Changes

    December 01, 2019

    We are pleased to announce the release of DKPro Core 1.12.0, a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework version 2.

    https://dkpro.github.io/dkpro-core

    This is a feature release.

    Important upgrade notice

    If you are upgrading from DKPro Core 1.10.x or earlier, please read the DKPro Core 1.11.0 upgrade notice [1].

    Notable changes since DKPro Core 1.11.1

    • Added option to export XMI using XML 1.1 to avoid issues with certain characters
    • Added option to CoNLL readers to trim off whitespace from field values to avoid users having issues with incidental space characters (default is on)
    • Added support for annotator notes in brat format
    • Improved speed for writing WebAnno TSV format (backported from WebAnno)
    • Fixed a couple of issues with the CoNLL 2012 format
    • Fixed default extension for CoNLL-U writer
    • Fixed problem in CoNLL-U writer when text contains line breaks
    • Fixed problem that LanguageToolChecker did not fill in suggestions
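
The whitespace trimming option for CoNLL readers addresses field values that carry incidental spaces. A minimal sketch of the effect on one tab-separated line follows. It is not the DKPro Core reader code, and the class and method names are hypothetical.

```java
import java.util.Arrays;

/** Illustrative only: shows the effect of trimming CoNLL field values; not DKPro Core code. */
final class ConllFieldTrim {

    /** Splits one tab-separated CoNLL line into fields, optionally trimming each value. */
    static String[] splitFields(String line, boolean trim) {
        String[] fields = line.split("\t", -1); // limit -1 keeps trailing empty fields
        if (trim) {
            for (int i = 0; i < fields.length; i++) {
                fields[i] = fields[i].trim();
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "1\tHouse \tNN\t_"; // note the stray space after "House"
        System.out.println(Arrays.toString(splitFields(line, false)));
        System.out.println(Arrays.toString(splitFields(line, true)));
    }
}
```

Without trimming, the form `"House "` and the form `"House"` would count as different values in any later lookup or comparison.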

    A more detailed overview of the changes in this release can be found [2].

    Thanks to all contributors!

    When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

    [1] https://github.com/dkpro/dkpro-core/releases/tag/rel%2Fdkpro-core-1.12.0
    [2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A1.12.0

  • v1.11.1 Changes

    August 17, 2019

    We are pleased to announce the release of DKPro Core 1.11.1, a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

    https://dkpro.github.io/dkpro-core

    This is a bugfix release.

    Important upgrade notice

    If you are upgrading from DKPro Core 1.10.x or earlier, please read the DKPro Core 1.11.0 upgrade notice [1].

    Notable changes since DKPro Core 1.11.0

    • Fixed trimming of whitespace at the start and end of annotations
    • Fixed encoding of named entity categories in LIF format
    • Fixed unescaping of URI-encoded characters when writing files
    • Added parameter to control whitespace normalization in HtmlDocumentReader
    • Added parameters to control indentation and output method in XmlDocumentWriter
    • Improved exception in Stanford CoreNLP NER trainer when no documents have been processed
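
Trimming whitespace at the start and end of an annotation amounts to moving its begin and end offsets inwards. The sketch below illustrates that idea on plain character offsets. It is an assumption about the general technique, not the actual DKPro Core fix, and the names are hypothetical.

```java
/** Illustrative only: trims a span so that it neither starts nor ends with whitespace. */
final class SpanTrim {

    /** Returns {begin, end} moved inwards past whitespace; an all-whitespace span becomes empty. */
    static int[] trim(CharSequence text, int begin, int end) {
        int b = begin;
        int e = end;
        // Move the start forward while it points at whitespace.
        while (b < e && Character.isWhitespace(text.charAt(b))) {
            b++;
        }
        // Move the end backward while the last covered character is whitespace.
        while (e > b && Character.isWhitespace(text.charAt(e - 1))) {
            e--;
        }
        return new int[] { b, e };
    }

    public static void main(String[] args) {
        String text = "  Hello world \n";
        int[] span = trim(text, 0, text.length());
        System.out.println("[" + text.substring(span[0], span[1]) + "]");
    }
}
```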

    A more detailed overview of the changes in this release can be found [2].

    Thanks for contributions go to: @az79nefy, @ramonziai, @manuelciosici, @Horsmann, @tilmanbeck, @alaindesilets, @jcklie

    When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

    [1] https://github.com/dkpro/dkpro-core/releases/tag/dkpro-core-1.11.0
    [2] https://github.com/dkpro/dkpro-core/issues?q=milestone%3A1.11.1

  • v1.11.0 Changes

    July 05, 2019

    We are pleased to announce the release of DKPro Core 1.11.0, a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

    https://dkpro.github.io/dkpro-core

    This is a feature release.

    Important upgrade notice

    • Changed groupIds and artifactIds. The group ID is now org.dkpro.core and the artifact IDs are dkpro-core-...-(asl/gpl)
    • Changed package names. The packages are now all starting with org.dkpro.core... - except the packages of UIMA types which remain unchanged for data compatibility.
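
The renaming rules above can be sketched as a small helper for migrating identifiers in existing projects. The old prefix `de.tudarmstadt.ukp.dkpro.core`, the old artifact ID pattern and the `.type.` heuristic for detecting UIMA type packages are assumptions that are not stated in this notice; check them against your own dependencies before relying on them. The class itself is hypothetical.

```java
/** Illustrative migration helper; the old naming scheme used here is an assumption. */
final class CoordinateMigration {

    // Assumption: prefix used by groupIds, artifactIds and packages before 1.11.0.
    static final String OLD_PREFIX = "de.tudarmstadt.ukp.dkpro.core";
    static final String NEW_PREFIX = "org.dkpro.core";

    /** Rewrites a component class name; names in a ".type." package are left unchanged. */
    static String migrateClassName(String name) {
        // Heuristic assumption: UIMA types live in packages containing ".type.".
        if (!name.startsWith(OLD_PREFIX + ".") || name.contains(".type.")) {
            return name;
        }
        return NEW_PREFIX + name.substring(OLD_PREFIX.length());
    }

    /** Rewrites an artifact ID into the dkpro-core-...-(asl/gpl) form. */
    static String migrateArtifactId(String oldArtifactId) {
        if (!oldArtifactId.startsWith(OLD_PREFIX + ".")) {
            return oldArtifactId;
        }
        String module = oldArtifactId.substring(OLD_PREFIX.length() + 1); // e.g. "io.text-asl"
        return "dkpro-core-" + module.replace('.', '-');
    }

    public static void main(String[] args) {
        System.out.println(migrateClassName("de.tudarmstadt.ukp.dkpro.core.opennlp.OpenNlpSegmenter"));
        System.out.println(migrateClassName("de.tudarmstadt.ukp.dkpro.core.api.segmentation.type.Token"));
        System.out.println(migrateArtifactId("de.tudarmstadt.ukp.dkpro.core.io.text-asl"));
    }
}
```

Leaving type names untouched mirrors the stated goal of data compatibility: previously serialized data refers to types by their fully qualified names.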

    Notable changes since DKPro Core 1.10.0

    • Changed parts of the brat data conversion code such that it can be more easily used outside a UIMA component
    • Changed type mapping such that out-of-tagset types map to the generic type (e.g. an unknown POS tag maps to POS, not to POS_X)
    • Changed name of NYTCollectionReader to NitfReader
    • Added types to encode XML document structure in CAS
    • Added new XmlDocumentReader/Writer components using these types
    • Added basic reader for Annotated Gigaword corpus (only reads text so far) (thanks @az79nefy)
    • Added basic support for PubAnnotation JSON format
    • Added Maui component for keyword assignment
    • Added parameter to SfstAnnotator to enable lower-case lookup of first word in a sentence (thanks @rziai)
    • Added "order" feature to Token type
    • Added support for CoNLL-U document and paragraph IDs (thanks @manuelciosici)
    • Added support for CoNLL-U sentence IDs and text
    • Added standardized parameter to disable type mapping
    • Added support for TCF orthography layer using SofaChangeAnnotations
    • Added segmenter for Chinese using jieba (thanks @Horsmann)
    • Added MyStem for Russian
    • Added links to OpenMinTeD categories in type system documentation
    • Added support for reading/writing the CoreNLP CoNLL flavor
    • Added parameter to configure the Tika buffer size (useful for large documents)
    • Updated to OpenNLP 1.9.1
    • Updated to CoreNLP 3.9.2
    • Updated to ICU4J 64.2
    • Updated to Tika 1.19.1
    • Updated to LanguageTool 4.3
    • Updated to PDFBox 2.0.12
    • Updated IllinoisNLP components
    • Updated TreeTagger models/binaries in build.xml script (thanks @tilmanbeck)
    • Updated LIF dependencies
    • Updated dataset descriptions
    • Updated various general dependencies (e.g. Apache Commons etc.)
    • Improved robustness of checksum verification for text files used in datasets (e.g. license files)
    • Improved error messages in WebAnno TSV3 module
    • Fixed crash in WebannoTsv3XWriter when annotations do not start/end at token boundaries
    • Fixed bug in WebAnno TSV3 support causing span annotations with slot features to disappear
    • Fixed trimming of whitespace in TeiReader
    • Fixed bug in NifWriter causing named entity identifier not to be written
    • Fixed crash in BratReader with reading discontinuous segments
    • Fixed problem in BratWriter when dealing with slot features
    • Fixed metadata of CoNLL2012Writer
    • Fixed potential problem of datasets being written outside their target directory
    • Dropped the GrAF I/O module since the upstream libraries are outdated and no longer maintained
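
The changed type mapping behaviour (an unknown POS tag maps to POS, not to POS_X) can be pictured as a lookup with a generic fallback. The sketch below uses a plain map of strings for illustration; the real mapping mechanism in DKPro Core is not shown here, and the class, method and tag entries are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: tag lookup that falls back to a generic type for out-of-tagset tags. */
final class TagMappingSketch {

    /** Returns the mapped type name, or the generic type if the tag is not part of the tagset. */
    static String mapTag(Map<String, String> mapping, String tag, String genericType) {
        return mapping.getOrDefault(tag, genericType);
    }

    public static void main(String[] args) {
        // Hypothetical excerpt of a tagset mapping.
        Map<String, String> mapping = new HashMap<>();
        mapping.put("NN", "POS_NOUN");
        mapping.put("VB", "POS_VERB");

        System.out.println(mapTag(mapping, "NN", "POS"));   // known tag
        System.out.println(mapTag(mapping, "???", "POS"));  // out-of-tagset tag falls back to the generic type
    }
}
```

One reading of this design choice is that the generic type makes no claim about the category, whereas POS_X would assert that the token belongs to the "other" category.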

    A more detailed overview of the changes in this release can be found here.

    Thanks for contributions go to: @az79nefy, @ramonziai, @manuelciosici, @Horsmann, @tilmanbeck

    When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

  • v1.10.0 Changes

    September 10, 2018

    We are pleased to announce the release of DKPro Core 1.10.0, a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

    https://dkpro.github.io/dkpro-core

    This is a feature release.

    Notable changes since DKPro Core 1.9.3

    • Added support for Arabic to CoreNlpSegmenter (thanks @Jibun)
    • Added support for Token "form" to CoNLL writers (thanks @Jibun)
    • Added ability to provide extra non-standard parameters to CoreNlpSegmenter (thanks @Jibun)
    • Added ArkTweet POS tagger trainer (thanks @schrieveslaach)
    • Added WebAnno TSV3 reader/writer
    • Added reader for Leipzig Corpora Collection
    • Upgraded to CoreNLP 3.9.1 (stanfordnlp and corenlp modules)
    • Upgraded to OpenNLP 1.9.0
    • Upgraded to PDFBox 2.0.9 (io-pdf module)
    • Upgraded to LanguageTool 4.2
    • Upgraded to CogComp 4.0.7 (lbj module)
    • Upgraded to Tika 1.18 (io-tika module)
    • Improved handling of multi-line annotations in brat module (thanks @parisni)
    • Fixed crash in the brat reader on discontinuous annotations by reading only the first fragment
    • Added dataset description for GUM 4.1.0 dataset
    • Removed PARAM_INTERN_TAGS
    • Improved component metadata
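
In brat standoff files, a discontinuous text-bound annotation lists several offset pairs separated by semicolons, for example `T1	Location 0 5;16 23	North America`. Reading only the first fragment can be sketched as follows. This illustrates the idea only and is not the BratReader implementation; the class and method names are hypothetical.

```java
/** Illustrative only: extracts the first fragment of a brat text-bound annotation line. */
final class BratOffsets {

    /** Returns {begin, end} of the first fragment; later fragments are ignored. */
    static int[] firstFragment(String line) {
        String[] columns = line.split("\t");            // ID, "TYPE offsets", covered text
        String typeAndOffsets = columns[1];
        // Everything after the type name is the offset specification.
        String offsets = typeAndOffsets.substring(typeAndOffsets.indexOf(' ') + 1);
        String first = offsets.split(";")[0].trim();    // fragments are separated by ';'
        String[] pair = first.split(" ");
        return new int[] { Integer.parseInt(pair[0]), Integer.parseInt(pair[1]) };
    }

    public static void main(String[] args) {
        int[] span = firstFragment("T1\tLocation 0 5;16 23\tNorth America");
        System.out.println(span[0] + "-" + span[1]);
    }
}
```

The trade-off is that the second fragment is dropped, so the resulting annotation covers only part of the original mention.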

    A more detailed overview of the changes in this release can be found here.

    Thanks for contributions go to: @Jibun, @parisni, @schrieveslaach, @jgrivolla

    When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

  • v1.9.3 Changes

    July 28, 2018

    We are pleased to announce the release of DKPro Core 1.9.3, a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

    This is a bug-fix and minor feature release.

    Notable changes since DKPro Core 1.9.2

    • Added ability to restore Backmapper alignment data after a CAS restore
    • Added ability to specify a cluster resource name for the ArkTweet POS-tagger trainer
    • Added PARAM_MODEL_ENCODING to TreeTaggerChunker
    • Fixed issue that DictionaryAnnotator did not match at the sentence end
    • Ensured that all parameters have a description

    A more detailed overview of the changes in this release can be found here.

    Thanks for contributions go to: @nilsreiter, @mjunsilo, @schrieveslaach, @jkirsch

    When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

  • v1.9.2 Changes

    July 28, 2018

    We are pleased to announce the release of DKPro Core 1.9.2, a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

    This is a bug-fix and minor feature release.

    Notable changes since DKPro Core 1.9.1

    • Allowed explicitly specifying a model artifact when running a model-based component
    • Fixed auto-loading of models in CoreNLP module
    • Fixed issue causing PdfReader to create annotations with leading/trailing whitespace
    • Added more OMTD-SHARE metadata and UIMA capabilities
    • Avoided failing when encountering a discontinuous segment in brat files

    A more detailed overview of the changes in this release can be found here.

    Thanks for contributions go to: @nilsreiter, @mjunsilo

    When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.

  • v1.9.1 Changes

    April 05, 2018

    We are pleased to announce the release of DKPro Core 1.9.1, a collection of interoperable software components for natural language processing (NLP) based on the Apache UIMA framework.

    This is a bug-fix and minor feature release.

    Notable changes since DKPro Core 1.9.0

    • Included OMTD-SHARE metadata
    • Improved mapping capabilities and robustness of the BratReader
    • Added option to mark split tokens in CamelCaseTokenSegmenter
    • Fixed hash for CC-BY 4.0 license in dataset API
    • Fixed NPE in CoNLL 2012 reader
    • Upgraded to LanguageTool 4.1
    • Upgraded to ICU4J 61.1
    • Upgraded to JTok 2.1.18
    • Upgraded to OpenNLP 1.8.4

    A more detailed overview of the changes in this release can be found here.

    Thanks for contributions go to: @nilsreiter, @mjunsilo

    When upgrading, please mind that you should not mix different versions of DKPro Core components in your projects - they may not be compatible with each other.