A web crawler is a bot that fetches resources from the web to build applications such as search engines and knowledge bases. Sparkler (a contraction of Spark-Crawler) is a new web crawler that draws on recent advances in distributed computing and information retrieval by combining several Apache projects, such as Spark, Kafka, Lucene/Solr, Tika, and Felix. It is an extensible, highly scalable, high-performance crawler that evolved from Apache Nutch and runs on an Apache Spark cluster.