Popularity: 9.2 (Stable)

Activity: 9.1 (Growing)

Stars: 9,460

Watchers: 490

Forks: 2,694

Last Commit: 8 days ago

Description

Stanford CoreNLP provides a set of natural language analysis tools written in Java. It can take raw human language text input and give the base forms of words, their parts of speech, whether they are names of companies, people, etc., normalize and interpret dates, times, and numeric quantities, mark up the structure of sentences in terms of phrases or word dependencies, and indicate which noun phrases refer to the same entities. It was originally developed for English, but now also provides varying levels of support for (Modern Standard) Arabic, (mainland) Chinese, French, German, and Spanish. Stanford CoreNLP is an integrated framework, which makes it very easy to apply a bunch of language analysis tools to a piece of text. Starting from plain text, you can run all the tools with just two lines of code. Its analyses provide the foundational building blocks for higher-level and domain-specific text understanding applications. Stanford CoreNLP is a set of stable and well-tested natural language processing tools, widely used by various groups in academia, industry, and government. The tools variously use rule-based, probabilistic machine learning, and deep learning components.

The Stanford CoreNLP code is written in Java and licensed under the GNU General Public License (v3 or later). Note that this is the full GPL, which allows many free uses, but not its use in proprietary software that you distribute to others.

Code Quality Rank: L1

Programming language: Java

License: GNU General Public License v3.0 only

Tags: Natural Language Processing

Latest version: v4.2.0

CoreNLP alternatives and similar libraries

Based on the "Natural Language Processing" category.

Apache OpenNLP

6.8 8.3 L1 CoreNLP VS Apache OpenNLP

Mallet

6.3 3.7 L2 CoreNLP VS Mallet

MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.

CogCompNLP

5.1 0.0 CoreNLP VS CogCompNLP

CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer, transliteration, verb-sense, and more.
DKPro Core

3.7 6.0 CoreNLP VS DKPro Core

Collection of software components for natural language processing (NLP) based on the Apache UIMA framework.
LingPipe

- CoreNLP VS LingPipe

Toolkit for a variety of tasks ranging from POS tagging to sentiment analysis.
DKPro

- CoreNLP VS DKPro

Collection of reusable NLP tools for linguistic pre-processing, machine learning, lexical resources, etc.

* Code Quality Rankings and insights are calculated and provided by Lumnify.
They vary from L1 to L5 with "L5" being the highest.

README

Stanford CoreNLP

Stanford CoreNLP provides a set of natural language analysis tools written in Java. It can take raw human language text input and give the base forms of words, their parts of speech, whether they are names of companies, people, etc., normalize and interpret dates, times, and numeric quantities, mark up the structure of sentences in terms of phrases or word dependencies, and indicate which noun phrases refer to the same entities. It was originally developed for English, but now also provides varying levels of support for (Modern Standard) Arabic, (mainland) Chinese, French, German, Hungarian, Italian, and Spanish. Stanford CoreNLP is an integrated framework, which makes it very easy to apply a bunch of language analysis tools to a piece of text. Starting from plain text, you can run all the tools with just two lines of code. Its analyses provide the foundational building blocks for higher-level and domain-specific text understanding applications. Stanford CoreNLP is a set of stable and well-tested natural language processing tools, widely used by various groups in academia, industry, and government. The tools variously use rule-based, probabilistic machine learning, and deep learning components.

The Stanford CoreNLP code is written in Java and licensed under the GNU General Public License (v2 or later). Note that this is the full GPL, which allows many free uses, but not its use in proprietary software that you distribute to others.

Build Instructions

Several times a year we distribute a new version of the software, which corresponds to a stable commit.

Between releases, you can always use the latest, under-development version of our code.

The instructions below explain how to use the latest code:

Provided build

Sometimes we will provide updated jars here which have the latest version of the code.

At present, the most recent jar we provide is the current released version of the code, though you can always build the very latest from GitHub HEAD yourself.

Build with Ant

1. Make sure you have Ant installed; details are at http://ant.apache.org/
2. Compile the code with this command: cd CoreNLP ; ant
3. Then run this command to build a jar with the latest version of the code: cd CoreNLP/classes ; jar -cf ../stanford-corenlp.jar edu
4. This will create a new jar called stanford-corenlp.jar in the CoreNLP folder, which contains the latest code.
5. The dependencies that work with the latest code are in CoreNLP/lib and CoreNLP/liblocal, so make sure to include those in your CLASSPATH.
6. When using the latest version of the code, make sure to download the latest versions of the corenlp-models, english-models, and english-models-kbp and include them in your CLASSPATH. If you are processing languages other than English, make sure to download the latest version of the models jar for the language you are interested in.
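
The CLASSPATH requirement in the Ant steps can be sketched in Java. The following is only an illustration, not part of CoreNLP: the class name ClasspathSketch and its methods are hypothetical, and it assumes the directory layout described above (the built stanford-corenlp.jar in the CoreNLP folder, dependencies in CoreNLP/lib and CoreNLP/liblocal). It collects the jars and joins them with the platform's path separator.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of CoreNLP.
final class ClasspathSketch {

    /** Returns the *.jar files directly inside each directory, sorted per directory; missing directories are skipped. */
    static List<Path> jarsIn(Path... dirs) {
        List<Path> jars = new ArrayList<>();
        for (Path dir : dirs) {
            if (!Files.isDirectory(dir)) {
                continue;
            }
            try (Stream<Path> entries = Files.list(dir)) {
                entries.filter(p -> p.getFileName().toString().endsWith(".jar"))
                       .sorted()
                       .forEach(jars::add);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return jars;
    }

    /** Joins the paths with the platform path separator (":" on Unix, ";" on Windows). */
    static String toClasspath(List<Path> jars) {
        return jars.stream().map(Path::toString).collect(Collectors.joining(File.pathSeparator));
    }

    public static void main(String[] args) {
        // Assumed checkout location; pass a different root as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : "CoreNLP");
        List<Path> jars = new ArrayList<>();
        jars.add(root.resolve("stanford-corenlp.jar"));
        jars.addAll(jarsIn(root.resolve("lib"), root.resolve("liblocal")));
        System.out.println(toClasspath(jars));
    }
}
```

The printed string can be used as the value of CLASSPATH or passed to java -cp. The downloaded models jars are not covered by this sketch and would need to be appended as well.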

Build with Maven

1. Make sure you have Maven installed; details are at https://maven.apache.org/
2. Running this command in the CoreNLP directory: mvn package , should run the tests and build this jar file: CoreNLP/target/stanford-corenlp-4.5.1.jar
3. When using the latest version of the code, make sure to download the latest versions of the corenlp-models, english-extra-models, and english-kbp-models and include them in your CLASSPATH. If you are processing languages other than English, make sure to download the latest version of the models jar for the language you are interested in.
4. If you want to use Stanford CoreNLP as part of a Maven project, you need to install the models jars into your Maven repository. Below is a sample command for installing the Spanish models jar. For other languages, just change the language name in the command. To install stanford-corenlp-models-current.jar you will need to set -Dclassifier=models. Here is the sample command for Spanish: mvn install:install-file -Dfile=/location/of/stanford-spanish-corenlp-models-current.jar -DgroupId=edu.stanford.nlp -DartifactId=stanford-corenlp -Dversion=4.5.1 -Dclassifier=models-spanish -Dpackaging=jar
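
The rule for adapting the Maven install command to other languages can be sketched as a small Java helper. This is an illustration under stated assumptions, not an official tool: the class ModelInstallCommand and its method are hypothetical, and the jar-name pattern stanford-LANGUAGE-corenlp-models-current.jar is extrapolated from the Spanish example, so check it against the actual file name you downloaded. An empty language selects the default models jar with -Dclassifier=models, as described above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of CoreNLP or Maven.
final class ModelInstallCommand {

    /**
     * Builds the argument list for "mvn install:install-file".
     * An empty or null language selects the default models jar.
     */
    static List<String> forLanguage(String jarDir, String language, String version) {
        boolean isDefault = language == null || language.isEmpty();
        // Assumed naming pattern, taken from the Spanish sample command.
        String jarName = isDefault
                ? "stanford-corenlp-models-current.jar"
                : "stanford-" + language + "-corenlp-models-current.jar";
        String classifier = isDefault ? "models" : "models-" + language;
        return Arrays.asList(
                "mvn", "install:install-file",
                "-Dfile=" + jarDir + "/" + jarName,
                "-DgroupId=edu.stanford.nlp",
                "-DartifactId=stanford-corenlp",
                "-Dversion=" + version,
                "-Dclassifier=" + classifier,
                "-Dpackaging=jar");
    }

    public static void main(String[] args) {
        // Reproduces the Spanish sample command from the instructions.
        System.out.println(String.join(" ", forLanguage("/location/of", "spanish", "4.5.1")));
    }
}
```

Returning the arguments as a list rather than one string means the result can be handed directly to java.lang.ProcessBuilder if you want to run the command from Java instead of printing it.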

Models

The models jars that correspond to the latest code can be found in the table below.

Some of the larger (English) models -- like the shift-reduce parser and WikiDict -- are not distributed with our default models jar. These require downloading the English (extra) and English (kbp) jars. Resources for other languages require usage of the corresponding models jar.

The best way to get the models is to use git-lfs and clone them from Hugging Face Hub.

For instance, to get the French models, run the following commands:

# Make sure you have git-lfs installed
# (https://git-lfs.github.com/)
git lfs install

git clone https://huggingface.co/stanfordnlp/corenlp-french

The jars can be directly downloaded from the links below or the Hugging Face Hub page as well.

Language	Model Jar	Last Updated
Arabic	download (HF Hub)	4.5.0
Chinese	download (HF Hub)	4.5.0
English (extra)	download (HF Hub)	4.5.0
English (KBP)	download (HF Hub)	4.5.0
French	download (HF Hub)	4.5.0
German	download (HF Hub)	4.5.0
Hungarian	download (HF Hub)	4.5.0
Italian	download (HF Hub)	4.5.0
Spanish	download (HF Hub)	4.5.0

Thank you to Hugging Face for helping with our hosting!

Useful resources

You can find releases of Stanford CoreNLP on Maven Central.

You can find more explanation and documentation on the Stanford CoreNLP homepage.

For information about making contributions to Stanford CoreNLP, see the file CONTRIBUTING.md.

Questions about CoreNLP can either be posted on StackOverflow with the tag stanford-nlp, or on the mailing lists.

*Note that all licence references and agreements mentioned in the CoreNLP README section above are relevant to that project's source code only.

CoreNLP

CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.