5 Web Crawling libraries and projects

  • jsoup

    9.2 9.5 L2 Java
    jsoup: the Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety.
  • Crawler4j

    8.7 0.0 L2 Java
    Open Source Web Crawler for Java
  • Apache Nutch

    8.2 7.8 L2 Java
    Apache Nutch is an extensible and scalable web crawler
  • storm-crawler

    5.9 9.4 Java
    A scalable, mature and versatile web crawler based on Apache Storm
  • Sparkler

    4.7 3.0 Java
    Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
