
Given a latitude and longitude, I need to pull the nearest neighbors out of the database and return them in proximity order. To get this to work I use geohashes, Amazon SimpleDB, and boto.
To do the query, we need to compute a bounding box:
swhash = Geohash.encode(latitude-delta, longitude-delta)
nehash = Geohash.encode(latitude+delta, longitude+delta)
This produces two hash codes, one to the southwest of the origin and one to the northeast. A delta of 0.5 is about 138 km, in North America at least. Geohash codes are great because they can be sorted in lexical order, so my boto query now looks like this:
import boto

sdb = boto.connect_sdb(AWS_ACCESS_KEY, AWS_SECRET_KEY)
domain = sdb.get_domain(PLACES_DOMAIN)
# Adjacent string literals are joined, so the query can span several lines.
query = ("select * from `Places` where "
         "`geohash` >= '%s' and `geohash` <= '%s' LIMIT 10"
         % (swhash, nehash))
places = domain.select(query, max_items=10)
The places object is actually an iterator (many thanks to Chris Moyer for documenting this), but we want a list. Note that the query is performed lazily: it only runs when you pull the first item from the iterator.
results = [place for place in places]
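To make the lazy behavior concrete, here is a small sketch using a plain Python generator. It is only a stand-in: fake_select and query_log are hypothetical names, and this is not how boto is implemented, just the same general pattern of deferring work until iteration starts.

```python
query_log = []  # records when the pretend query actually runs

def fake_select(rows):
    """Hypothetical stand-in for a lazy result set (not boto's API)."""
    query_log.append("query sent")  # runs only once iteration begins
    for row in rows:
        yield row

places = fake_select([{"name": "cafe"}, {"name": "park"}])
ran_before = len(query_log)   # creating the iterator did no work yet
results = list(places)        # consuming it triggers the "query"
ran_after = len(query_log)

print(ran_before, ran_after, len(results))  # prints: 0 1 2
```

The practical upshot is that any error from the query shows up where you consume the iterator, not where you call select.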
The distance function is very simple:
import math

def distance(a, b):
    # a and b are (latitude, longitude) pairs in degrees
    R = 6371  # radius of Earth in km
    lat1 = math.radians(a[0])
    lon1 = math.radians(a[1])
    lat2 = math.radians(b[0])
    lon2 = math.radians(b[1])
    d = math.acos(math.sin(lat1) * math.sin(lat2) +
                  math.cos(lat1) * math.cos(lat2) * math.cos(lon2 - lon1)) * R
    return d
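One thing to watch with the acos form: for two points that are identical or very close together, floating-point rounding can push the argument slightly above 1, and math.acos then raises a ValueError. A commonly used alternative is the haversine formula, sketched below. The name haversine_km is my own; it takes the same (latitude, longitude) pairs in degrees and could be swapped in as the sort key.

```python
import math

def haversine_km(a, b, radius_km=6371.0):
    """Haversine great-circle distance between (lat, lon) pairs in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2 +
         math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # min() guards against rounding that nudges h just past 1.
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, h)))

# A quarter of the way around the equator should be R * pi / 2.
quarter = haversine_km((0.0, 0.0), (0.0, 90.0))
print(abs(quarter - 6371.0 * math.pi / 2) < 1e-6)  # prints: True
```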
Finally, Python sorts the results:
results = sorted(results, key=lambda place: distance(orig,
    (float(place['latitude']), float(place['longitude']))))
This actually worked on the second try.
Notes: there are better distance functions, but this one works well and is quite fast. I have not put in the error handling, or the back-off code for when you get too many or too few results.
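For what it's worth, the back-off could look something like the sketch below: double the delta when too few results come back, halve it when there are too many, and give up after a fixed number of attempts. Everything here is an assumption for illustration. search_with_backoff, run_query, and the thresholds are hypothetical, and run_query stands in for whatever function builds the bounding box and runs the select for a given delta.

```python
def search_with_backoff(run_query, delta=0.5, min_results=3,
                        max_results=50, max_attempts=6):
    """Adjust delta until the result count falls in an acceptable range.

    run_query is a hypothetical callable: it takes a delta and returns a list.
    Returns the last result list together with the delta that produced it.
    """
    results = []
    for _ in range(max_attempts):
        results = run_query(delta)
        if len(results) < min_results:
            delta *= 2   # too few: widen the box and retry
        elif len(results) > max_results:
            delta /= 2   # too many: narrow the box and retry
        else:
            break        # acceptable count: stop here
    return results, delta

# Pretend query: the number of hits grows with the size of the box.
def fake_query(delta):
    return list(range(int(delta * 4)))

found, used_delta = search_with_backoff(fake_query)
print(len(found), used_delta)  # prints: 4 1.0
```

The attempt cap matters because the count can bounce between too few and too many, and without it the loop might never settle.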