Sunday, August 7, 2011

How to query by proximity in Android

The problem

So you need to query objects near some location?
You say ‘hey, this is Android, it should be easy, everybody makes a “search-near-me app”’, right?


Well, the truth is that is not that easy. Making a query for items near a geographic point is not as easy as one might expect. One of the reasons is that SQLite (SQL engine that comes with Android) doesn’t support trigonometric operations, or custom functions, or spatial extensions.







The short answer

In a nutshell, the solution is to use an alternative distance function to order objects based on this distance to the reference point and filter a few to show them on a map. Specifically, use the Manhattan distance that is based on simple operations easily written in Sqlite. You will get a small ordering error but will be good enough for many cases.

Where

            d1 is the distance function

 are vectors and

pi and qi are scalar values.

 The long answer

Let’s see step by step what should you do.

Limit the result set

The first thing we need to accept and to do is to reduce the number of elements we are handling. There are two good reasons for this. The first one is usability; users don’t feel comfortable if you throw a hundreds of items over the map. The second is performance, there’s no point on processing elements that won’t be visible.


So the first thing you should do is to decide the number of elements you will be showing the user. That might depend on the type of application, but you should be careful not to show less or more than expected. Make some tests on real phones to discover hardware limitations to this number.


Let’s say 40 elements is ok for us. We know that we don’t care about all the results; we just want the first forty elements near to the reference location (i.e.: the phone current location). Because we are using an approximate distance for our calculation, we will make the limit a little higher. So let’s make it 65 elements. I will explain why later.


At the end of the query you should have a limit clause, something like ‘LIMIT 65’ as we want the first 65 elements.

Order results by proximity using Manhattan distance

This is a proximity based result display, so we want to show the user the first nearest elements. For that purpose we need to order the results, otherwise they will be in any random order and you will get any 65 elements.


Due to our simple SQL support for mathematical operations we will use a simple form of distance called “Manhattan distance” or taxicab distance (http://en.wikipedia.org/wiki/Taxicab_geometry).


This kind of distance is very easy to calculate for Sqlite and has a small error compared to using a real geometric distance for ordering.


Add an order by clause: ‘ORDER BY abs(latitude - (?)) + abs( longitude - (?))
Here I’m assuming the query arguments are the user current coordinates (latitude and longitude) and ‘latitude’ and ‘longitude’ are the columns with the values for each element.


The query should look something like this:
"SELECT _id, latitude, longitude FROM EXAMPLE_DOTS ORDER BY abs(latitude - (?)) + abs( longitude - (?)) LIMIT 65"
And we will add as arguments the latitude and longitude of the reference point.

Error in Manhattan distance

Ordered in this way, our query will be enough to make the trick. If you only show the elements in the map, and they are scattered enough, nobody will notice the difference.
But, if you must show a list of elements, or they are not so sparsed you will still have to make a few corrections.


The Manhattan distance has a little problem (compared to real distance); it favors elements close to the axes. That means that elements near the vertical or horizontal axis will get lower values for their distance. That means better positions on the order, and will likely to be first that other elements. In other words elements near 45º, 135º, 225º and 315º will receive a worst order than they should.


If elements are regularly scattered you will get a
diamond-like shape with manhattan distance.


When using this alternative distance to order elements and limiting the results you could include an element farther that is close to an axis and exclude an object that is nearer because it’s not close to the axis.


This is why we add additional elements on the limit clause; we don’t want axis elements to be favored and diagonal elements to be discarded erroneously. To avoid this situation we will use a real geo-distance (at least a far better approximation to real distance).

Use Android Location-distance calculator to reorder and filter

The Android platform comes with an excellent implementation of an approximate geo distance calculation formula based on WGS84 ellipsoid (http://en.wikipedia.org/wiki/World_Geodetic_System#Longitudes_on_WGS_84).
This formula cannot be used inside the query, but as we limited the number of results (only 65) we can use it to re-order them in memory very fast so we have a real distance ordering.


You should call Location.distanceBetween(startlatitude, startLongitude, endLatitude, endLongitude, results) to calculate a better distance and re-order the elements obtained from the database. You can then, discard the exceeding elements.


I do this by using a few helper classes that I attached to this article as an Eclipse project :


/**
        * Order the results based on real distance instead of manhattan. This method changes the order
        * using memory
        *
        * @param referencePoint
        *
        * @param nearDots
        *            Dots to order
        * @param resultLimit
        *            Amount of dots to be returned
        * @return The list of re-ordered and limited elements
        */
       private List<ExampleDot> orderAndFilterUsingRealDistanceTo(GeoPoint referencePoint, List<ExampleDot> nearDots,
                    int resultLimit) {

             final double referenceLatitude = referencePoint.getLatitudeE6() / 1e6;
             final double referenceLongitude = referencePoint.getLongitudeE6() / 1e6;

             DistanceCalculator<ExampleDot> distanceCalculator = new DistanceCalculator<ExampleDot>() {
                    public Double calculateDistanceTo(ExampleDot point) {

                           double pointLatitude = point.latitude / 1e6;
                           double pointLongitude = point.longitude / 1e6;

                           float[] results = new float[1];
                           Location.distanceBetween(referenceLatitude, referenceLongitude, pointLatitude, pointLongitude, results);
                           return (double) results[0];

                    }
             };

             Comparator<? super ExampleDot> distanceComparator = DistanceComparator.create(distanceCalculator, true);

             List<ExampleDot> filtered = ResultsHelper.filterFirstMin(resultLimit, nearDots, distanceComparator);

             return filtered;
       }

As you can see, the above method is used to re-order the elements using a geo-distance and filtering out the exceeding elements. So, now we have the forty nearest elements we wanted.


With real geo-distance and regularly scattered elements
you should get a circle-like shape (or approximate).

Conclusion

The method I showed above is not for every case but can be used for most of the applications that need to show results near some point.
In fact, just by using Manhattan distance on the query can be sufficient and you don’t need to make extra corrections.


You should be sensible as when to use it due to the error in the order that Manhattan distance makes. I recommend you to test with real data points so you can judge if the results are good enough for you.


Links and downloads: Full eclipse example project
More articles from 10pines.com: http://www.10pines.com/articles

No comments:

Post a Comment