5 Web Crawling libraries and projects

  • jsoup

    9.2 9.1 L2 Java
    jsoup: the Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety.
  • Crawler4j

    8.7 0.0 L2 Java
    Open Source Web Crawler for Java
  • Apache Nutch

    8.1 8.0 L2 Java
    Apache Nutch is an extensible and scalable web crawler
  • storm-crawler

    5.9 8.8 HTML
    A scalable, mature and versatile web crawler based on Apache Storm
  • Sparkler

    4.8 3.0 Java
    Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
