Popularity: 4.4 (Stable)
Activity: 7.8 (Declining)
Stars: 426
Watchers: 30
Forks: 90
Last commit: 10 days ago

Description

Java utility methods for geohashing.

Code Quality Rank: L5

Programming language: Java

License: Apache License 2.0

Tags: Geospatial

Latest version: v0.7.7

Geo alternatives and similar libraries

Based on the "Geospatial" category.

GraphHopper

8.8 9.0 L2 Geo VS GraphHopper

Open source routing engine for OpenStreetMap. Use it as Java library or standalone web server.
GeoTools

7.8 9.4 L4 Geo VS GeoTools

Official GeoTools repository

JTS Topology Suite

7.2 9.0 Geo VS JTS Topology Suite

The JTS Topology Suite is a Java library for creating and manipulating vector geometry.
Mapsforge

6.6 8.9 L3 Geo VS Mapsforge

Vector map library and writer - running on Android and Desktop.
Spatial4j

5.7 3.1 L4 Geo VS Spatial4j

LocationTech Spatial4j: A Geospatial Library for Java
H2GIS

3.5 9.5 L2 Geo VS H2GIS

A spatial extension of the H2 database.
Geo Assist

3.1 3.4 Geo VS Geo Assist

Geo Assist is a spatial library to manage spatial data in-memory.
Apache SIS

3.1 9.5 L2 Geo VS Apache SIS

Java language library for developing geospatial applications following OGC/ISO standards.
Geotoolkit.org

2.8 9.4 L4 Geo VS Geotoolkit.org

Geotoolkit.org (abridged Geotk) is a free software, Java language library for developing geospatial applications. The library can be used for desktop or server applications. Geotk is built on top of Apache SIS and is used as a laboratory for the latter.
Jgeohash

2.6 0.0 L2 Geo VS Jgeohash

An easy-to-implement library for the GeoHash algorithm

* Code Quality Rankings and insights are calculated and provided by Lumnify.
They vary from L1 to L5 with "L5" being the highest.

README

geo

Java utility methods for geohashing.

Features

simple api
encodes geohashes from latitude, longitude to arbitrary length (GeoHash.encodeHash)
decodes latitude, longitude from geohashes (GeoHash.decodeHash)
finds adjacent hash in any direction (GeoHash.adjacentHash), works on borders including the poles too
finds all 8 adjacent hashes to a hash (GeoHash.neighbours)
calculates hash length to enclose a bounding box (GeoHash.hashLengthToCoverBoundingBox)
calculates geohashes of given length to cover a bounding box. Returns coverage ratio as well (GeoHash.coverBoundingBox)
calculates height and width of geohashes in degrees (GeoHash.heightDegrees and GeoHash.widthDegrees)
encodes and decodes long values from geohashes (Base32.encodeBase32 and Base32.decodeBase32)
good performance (~3 million GeoHash.encodeHash calls per second on an I7, single thread)
no mutable types exposed by api
threadsafe
100% unit test coverage (for what that's worth of course!)
Apache 2.0 licence
Published to Maven Central

Status: production, available on Maven Central

Maven site reports are here including javadoc.

Getting started

Add this to your pom:

<dependency>
    <groupId>com.github.davidmoten</groupId>
    <artifactId>geo</artifactId>
    <version>VERSION_HERE</version>
</dependency>

Bounding box searches using geohashing

What is the problem?

Databases of events at specific times occurring at specific places on the earth's surface are likely to be queried in terms of ranges of time and position. One such query is a bounding box query involving a time range and position constraint defined by a bounding lat-long box.

The challenge is to make your database run these queries quickly.

Some databases either do not support inequality conditions on more than one variable, or suffer significant performance degradation when large datasets are queried that way.

For example, a search for all ship reports within a time range and within a bounding box could be achieved with a range condition on time combined with a range condition on latitude combined with a range condition on longitude (where "combined with" means logical AND). This type of query can perform badly on many database types, both SQL and NoSQL. On Google App Engine Datastore, for instance, only one variable with inequality conditions is allowed per query. This is a sensible step to take to meet scalability guarantees.

What is a solution?

The bounding box query with a time range can be rewritten using geohashes so that only one variable is subject to a range condition: time. The method is:

store geohashes of all lengths (depending on the indexing strategies available, a single full-length hash may be enough) in indexed fields against each lat-long position in the database. Note that storing hashes as a single long integer value may be advantageous (see Base32.decodeBase32 to convert a hash to a long).
calculate a set of geohashes that wholly covers the bounding box
perform the query using the time range and equality against the geohashes. For example:

(startTime <= t < finishTime) and (hash3='drt' or hash3='dr2')

filter the results of the query to include only those results within the bounding box

The last step is necessary because the set of geohashes contains the bounding box but may be larger than it.
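The steps above can be sketched in memory as follows. This is a minimal illustration, not the library's code: the Report class, the search method and the encode helper are hypothetical names, encode is a simplified stand-in for GeoHash.encodeHash using the standard geohash bit interleaving, and the set of covering hashes is supplied by hand where a real application would obtain it from GeoHash.coverBoundingBox.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class GeohashQuerySketch {

    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    // Simplified stand-in for GeoHash.encodeHash (assumption: standard geohash encoding).
    static String encode(double lat, double lon, int length) {
        double minLat = -90, maxLat = 90, minLon = -180, maxLon = 180;
        StringBuilder hash = new StringBuilder();
        boolean lonBit = true; // bits alternate, starting with longitude
        int bits = 0;
        int value = 0;
        while (hash.length() < length) {
            if (lonBit) {
                double mid = (minLon + maxLon) / 2;
                if (lon >= mid) { value = value * 2 + 1; minLon = mid; }
                else { value = value * 2; maxLon = mid; }
            } else {
                double mid = (minLat + maxLat) / 2;
                if (lat >= mid) { value = value * 2 + 1; minLat = mid; }
                else { value = value * 2; maxLat = mid; }
            }
            lonBit = !lonBit;
            if (++bits == 5) { // five bits make one base32 character
                hash.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    // Hypothetical record type. Step 1: a hash is stored alongside each position.
    static final class Report {
        final long time;
        final double lat;
        final double lon;
        final String hash3;

        Report(long time, double lat, double lon) {
            this.time = time;
            this.lat = lat;
            this.lon = lon;
            this.hash3 = encode(lat, lon, 3);
        }
    }

    static List<Report> search(List<Report> table, long startTime, long finishTime,
            Set<String> coveringHashes,
            double topLat, double leftLon, double bottomLat, double rightLon) {
        List<Report> results = new ArrayList<>();
        for (Report r : table) {
            // Step 3: one range condition (time) plus equality against the hashes.
            boolean candidate = r.time >= startTime && r.time < finishTime
                    && coveringHashes.contains(r.hash3);
            // Step 4: exact check, because the hashes may extend past the box.
            boolean inBox = r.lat <= topLat && r.lat >= bottomLat
                    && r.lon >= leftLon && r.lon <= rightLon;
            if (candidate && inBox) {
                results.add(r);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<Report> table = new ArrayList<>();
        table.add(new Report(100, 42.6, -5.6)); // inside the box
        table.add(new Report(100, 43.0, -5.0)); // same hash cell, outside the box
        // Step 2: covering hashes, written out by hand for this small box.
        Set<String> hashes = new HashSet<>(Arrays.asList("eze", "ezs"));
        System.out.println(search(table, 0, 500, hashes, 42.7, -5.7, 42.5, -5.5).size());
    }
}
```

In a real database the loop in search would be the indexed query, and only the final bounding box check would run in application code.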

What hash length to use?

So how long should the hashes be that we try to cover the bounding box with? This will depend on your aims, which might be to minimize one or more of: CPU, URL fetch time, financial cost, total data transferred from the datastore, database load, 2nd tier load, or a heap of other possible metrics.

Calling GeoHash.coverBoundingBox with just the bounding points and no additional parameters will return hashes of a length such that the number of hashes is as many as possible but less than or equal to GeoHash.DEFAULT_MAX_HASHES (12).

You can explicitly control maxHashes by calling GeoHash.coverBoundingBoxMaxHashes.
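To illustrate the idea of choosing the longest hash length whose cover stays within a maximum number of hashes, here is a rough sketch. It is not the library's implementation: HashLengthChooser, countHashes and chooseLength are hypothetical names, the cell sizes come from the hash height and width formulas given later in this README, the cap of 12 on hash length is an assumption, and the box is assumed not to cross the antimeridian.

```java
final class HashLengthChooser {

    static final int MAX_HASH_LENGTH = 12; // assumption: longest length considered

    static int parity(int n) {
        return n % 2 == 0 ? 0 : 1;
    }

    // Cell width in degrees for hash length n.
    static double widthDegrees(int n) {
        return 180.0 / Math.pow(2, (5 * n + parity(n) - 2) / 2.0);
    }

    // Cell height in degrees for hash length n.
    static double heightDegrees(int n) {
        return 180.0 / Math.pow(2, (5 * n - parity(n)) / 2.0);
    }

    // Number of grid cells of the given hash length that overlap the box.
    static long countHashes(double topLat, double leftLon,
            double bottomLat, double rightLon, int length) {
        double w = widthDegrees(length);
        double h = heightDegrees(length);
        long cols = (long) Math.floor((rightLon + 180) / w)
                - (long) Math.floor((leftLon + 180) / w) + 1;
        long rows = (long) Math.floor((topLat + 90) / h)
                - (long) Math.floor((bottomLat + 90) / h) + 1;
        return cols * rows;
    }

    // Longest hash length whose cover needs at most maxHashes cells.
    // Returns 0 if even length 1 needs more than maxHashes.
    static int chooseLength(double topLat, double leftLon,
            double bottomLat, double rightLon, int maxHashes) {
        int best = 0;
        for (int n = 1; n <= MAX_HASH_LENGTH; n++) {
            if (countHashes(topLat, leftLon, bottomLat, rightLon, n) <= maxHashes) {
                best = n;
            } else {
                break; // the count never decreases as hashes get longer
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Box with top-left (9, 1) and bottom-right (1, 9), at most 12 hashes.
        System.out.println(chooseLength(9, 1, 1, 9, 12));
    }
}
```

Counting cells this way only estimates the size of the cover; the hashes themselves would still come from GeoHash.coverBoundingBox or GeoHash.coverBoundingBoxMaxHashes.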

As a quick example, consider a bounding box proportioned more or less like a screen, with Schenectady NY and Hartford CT in the USA at the corners.

Here are the hash counts for different hash lengths:

m is the size in square degrees of the total hashed area and a is the area of the bounding box.

length  numHashes  m/a
1       1          1694
2       1          53
3       4          6.6
4       30         1.6
5       667        1.08
6       20227      1.02

Only testing against your own database, preferably with real-life data, will determine what the optimal maxHashes value is. In the benchmarks section below, a test with the H2 database found that query time was optimal when maxHashes is about 700. I doubt that this would be the case for many other databases.

A rigorous exploration of this topic would be fun to do or see. Let me know if you've done it or have a link and I'll update this page!

Hash height and width formulas

This is the relationship between a hash of length n and its height and width in degrees:

First define this function:

parity(n) = 0 if n is even otherwise 1

Then

$$\text{width} = \frac{180}{2^{(5n + \operatorname{parity}(n) - 2)/2}} \text{ degrees}$$

$$\text{height} = \frac{180}{2^{(5n - \operatorname{parity}(n))/2}} \text{ degrees}$$
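As a quick check of these formulas, here is a small sketch that evaluates them directly. The class and method names are hypothetical and this is not the source of GeoHash.widthDegrees or GeoHash.heightDegrees; the exponent in each formula is 2 raised to the bracketed expression divided by 2.

```java
final class HashDimensions {

    // parity(n) = 0 if n is even, otherwise 1
    static int parity(int n) {
        return n % 2 == 0 ? 0 : 1;
    }

    // width = 180 / 2^((5n + parity(n) - 2) / 2) degrees
    static double widthDegrees(int n) {
        return 180.0 / Math.pow(2, (5 * n + parity(n) - 2) / 2.0);
    }

    // height = 180 / 2^((5n - parity(n)) / 2) degrees
    static double heightDegrees(int n) {
        return 180.0 / Math.pow(2, (5 * n - parity(n)) / 2.0);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 6; n++) {
            System.out.printf("n=%d width=%.6f height=%.6f%n",
                    n, widthDegrees(n), heightDegrees(n));
        }
    }
}
```

For n = 1 both values are 45 degrees, which matches a single-character geohash dividing the globe into 8 columns and 4 rows.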

The height and width in kilometres will be dependent on what part of the earth the hash is on and can be calculated using Position.getDistanceToKm. For example at (lat,lon):

double distancePerDegreeWidth =
     new Position(lat,lon).getDistanceToKm(new Position(lat, lon+1));
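
For intuition about why the width in kilometres depends on latitude, here is a rough sketch that does not use the library at all. It assumes a spherical earth with a mean radius of 6371 km, which may differ from the model behind Position.getDistanceToKm, and all names in it are hypothetical.

```java
final class DegreesToKm {

    // Assumption: spherical earth with mean radius 6371 km.
    static final double EARTH_RADIUS_KM = 6371.0;

    // Length of one degree of longitude at the given latitude.
    // It shrinks with the cosine of the latitude, reaching zero at the poles.
    static double kmPerDegreeLongitude(double latDegrees) {
        double kmPerDegreeAtEquator = 2 * Math.PI * EARTH_RADIUS_KM / 360.0;
        return kmPerDegreeAtEquator * Math.cos(Math.toRadians(latDegrees));
    }

    // Approximate width in km of a hash that is widthDegrees wide at the given latitude.
    static double widthKm(double widthDegrees, double latDegrees) {
        return widthDegrees * kmPerDegreeLongitude(latDegrees);
    }

    public static void main(String[] args) {
        System.out.println(kmPerDegreeLongitude(0));
        System.out.println(kmPerDegreeLongitude(60));
    }
}
```

Under this approximation a hash at 60 degrees latitude is about half as wide in kilometres as a hash of the same length at the equator.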

Benchmarks

Inserted 10,000,000 records into an embedded H2 filesystem database, which uses B-tree indexes. The records were randomly distributed geographically across a region, then a bounding box of 1/50th the area of the region was chosen. The query performed as follows (time is the time to run the query and iterate the results):

hashLength  numHashes  found  from  time(s)
2           2          200k   10M   56.0
3           6          200k   1.2M  10.5
4           49         200k   303k  4.5
5           1128       200k   217k  3.6
none        none       200k   200k  31.1 (multiple range query)

I was pleasantly surprised that H2 allowed me to put over 1000 conditions in the where clause. I tried with the next higher hash length as well, with over 22,000 hashes, but H2 understandably threw a StackOverflowError.
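
A where clause with that many conditions is normally generated rather than typed. The sketch below shows one way to do that, following the shape of the query example given earlier in this README. It is not the benchmark's actual code: the class name is hypothetical, the time column name t is taken from that example, and the use of ? placeholders for a prepared statement is an assumption.

```java
final class HashWhereClause {

    // Builds a condition with one time range and numHashes equality tests,
    // using ? placeholders to be bound later.
    static String build(String hashColumn, int numHashes) {
        if (numHashes < 1) {
            throw new IllegalArgumentException("numHashes must be at least 1");
        }
        StringBuilder sql = new StringBuilder("(t >= ? and t < ?) and (");
        for (int i = 0; i < numHashes; i++) {
            if (i > 0) {
                sql.append(" or ");
            }
            sql.append(hashColumn).append(" = ?");
        }
        return sql.append(")").toString();
    }

    public static void main(String[] args) {
        System.out.println(build("hash3", 2));
    }
}
```

The first two placeholders would be bound to the start and finish times and the remaining ones to the covering hashes.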

To run the benchmark:

mvn clean test -Dn=10000000

Running with n=1,000,000 is much quicker and yields the same primary result:

multiple range query is 10X slower than geohash lookup if the hash length is chosen judiciously
