Playing with maps


First and foremost, in case you are not sure what all this is about: spatial data is geographical and regional information, including points of interest (POI), areas on a map (lines, polygons, etc.), and information about a municipality, state or country.

I should warn you that spatial data is a complex subject, and a software discipline in its own right. Proxia® provides spatial support that is limited to a set of features, including:

  • Retrieving the points or areas to render, based on different filters.
  • Grouping them if necessary (to avoid filling the map with pins).
  • Supporting queries based on geographical and regional information, for example POIS within a given distance, or POIS in a certain municipality.

Approach

When I decided to tackle this problem, I had to answer several questions. Perhaps the main one was: where should the grouping logic be done? Although there are many JavaScript libraries (Google, Leaflet, etc.) that address this problem, I didn’t think they were the right approach, since:

  • They could introduce a dependency on the mapping software vendor.
  • We aim to support different devices, where the development language might not be JavaScript (think of Kotlin, Swift or Java). If we used a different approach on each platform, the user experience (how grouping is done) would change depending on the user’s device.
  • To be honest, if you can return 5 groups, why would you want to respond with more than 100K POIS in a single HTTP request?

This decision is a tricky one: unless we want to penalize the user experience, we must optimize the processing time. Providing a good solution involves knowing how mapping technologies like Google Maps, Leaflet or Bing Maps work.

POIS retrieval

As I said, to retrieve the POIS (or areas) we must know how mapping technology works. Basically, we should consider the following issues:

  • Although the Earth is roughly a sphere (in fact, a geoid), we are used to working with flat maps. This transformation is achieved through a projection, which turns geographic coordinates into Cartesian ones. Since Internet map providers use the Mercator projection, we have to work with this kind of coordinates in our application instead of geographical ones.
  • As you know, the Earth is big. When we use a paper map we work with scales, and using a different scale means using a different map. Map providers use the same approach, but instead of scales they talk about zoom levels. Each zoom level is, in fact, a different map.
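To make the projection step concrete, here is a minimal sketch of the conversion from latitude and longitude to planar coordinates. It assumes the spherical "Web Mercator" variant commonly used by online map providers; the function name `to_web_mercator` and the constants are illustrative, not part of Proxia®.

```python
import math

EARTH_RADIUS_M = 6378137.0   # sphere radius assumed by Web Mercator (metres)
MAX_LATITUDE = 85.05112878   # latitude where the projected map becomes a square


def to_web_mercator(lat_deg, lon_deg):
    """Project geographic coordinates (degrees) to planar x, y (metres)."""
    # Mercator stretches to infinity at the poles, so latitude is clamped.
    lat = max(-MAX_LATITUDE, min(MAX_LATITUDE, lat_deg))
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y
```

With these assumptions, longitude maps linearly to x, while y grows faster and faster as you move away from the equator, which is why high latitudes look stretched on these maps.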

Obviously there isn’t a single map image of the whole Earth for each zoom level. This wouldn’t be practical: its size would be humongous at high zoom levels and, in practice, we only ever look at a small area of the Earth. Thus, maps are divided into a grid of tiles. The more you zoom in, the greater the number of tiles.

Therefore, POIS retrieval involves knowing which tiles are being requested. You can think of it as a function of the zoom level and the visible map boundaries. For each tile we retrieve the visible POIS inside it. For areas, we just need to know whether one of the area’s points is contained within the tile. Building all of this is not really complex, but you have to be careful and give the system different key-value layers, since you want fast data retrieval.
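As an illustration of "a function of zoom level and visible map boundaries", the sketch below computes which tiles cover a viewport. It assumes the XYZ tile numbering used by OpenStreetMap-style tile servers (2^zoom tiles per side, origin at the top-left corner) and a viewport that does not cross the antimeridian; `tile_for` and `tiles_in_view` are hypothetical names.

```python
import math

MAX_LATITUDE = 85.05112878  # Web Mercator latitude limit


def tile_for(lat_deg, lon_deg, zoom):
    """Return the (x, y) tile containing a point at the given zoom level."""
    n = 2 ** zoom  # number of tiles per side at this zoom level
    lat = math.radians(max(-MAX_LATITUDE, min(MAX_LATITUDE, lat_deg)))
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    # Keep points lying exactly on the map edge inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_in_view(south, west, north, east, zoom):
    """List every tile touched by a viewport (assumes west <= east)."""
    x_min, y_min = tile_for(north, west, zoom)  # top-left corner
    x_max, y_max = tile_for(south, east, zoom)  # bottom-right corner
    return [(x, y)
            for x in range(x_min, x_max + 1)
            for y in range(y_min, y_max + 1)]
```

The resulting (zoom, x, y) triples are natural keys for a key-value layer: one lookup per visible tile.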

[Figure: mapas00]

As you can see, tiles in low-zoom layers contain a lot of POIS; as the zoom increases, the number of POIS in each tile decreases.

Obviously, it makes no sense to store all the POI information in each layer, since we’d be wasting a lot of memory and storage, so we have to implement another layer to store the data shared by the POIS. This additional layer can also hold other filters such as POI type or postal address. In this way, seeking POIS in a geographic area is a matter of locating the proper tiles, retrieving their POIS and, finally, filtering those that match the user’s selection.
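One possible shape for these layers is sketched below: tile layers hold only POI identifiers, and a separate shared layer holds the attributes used for filtering. This is an in-memory illustration under my own assumptions; `PoiIndex`, its methods and the attribute names are hypothetical and do not describe Proxia®'s internal structures.

```python
from collections import defaultdict


class PoiIndex:
    """Tile layers store ids only; attributes live once in a shared layer."""

    def __init__(self):
        self.shared = {}                # poi_id -> attributes (type, address...)
        self.tiles = defaultdict(set)   # (zoom, x, y) -> ids of POIS in that tile

    def add(self, poi_id, attributes, tile_keys):
        """Register a POI once, and reference it from each tile it falls in."""
        self.shared[poi_id] = attributes
        for key in tile_keys:
            self.tiles[key].add(poi_id)

    def query(self, tile_keys, **filters):
        """Collect ids from the requested tiles, then filter on shared data."""
        ids = set()
        for key in tile_keys:
            ids.update(self.tiles.get(key, ()))
        return sorted(
            poi_id for poi_id in ids
            if all(self.shared[poi_id].get(k) == v for k, v in filters.items())
        )
```

A call such as `index.query([(10, 5, 3)], type="hospital")` follows the three steps described above: locate the tiles, retrieve their POIS, then filter.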

You might wonder: what about storage? Is all this information in a database? Actually, no. In the database we only store the basic information, enough to build this whole layer structure when the system starts up.

This process introduces a new problem: how can the system deal with the creation, update or removal of POIS? We use a broadcasting solution, which I’ll tell you about in a future post.

POIS grouping

Certainly, a map filled with pins is not a good solution for end users. POIS grouping lets us provide a solution focused on user experience.

[Figure: mapas01]

From a technical point of view, you might think that grouping is not a very complex problem: all you have to do is find neighbouring POIS and mark them as a single cluster. In fact, it is a bit trickier. You should:

  • Select an empirical neighbouring distance, large enough to avoid cluster collisions.
  • Create an initial fake cluster, and check whether POIS are near it. If that isn’t the case, create a new cluster point.
  • Find the best cluster position (latitude and longitude), taking into account all the POIS involved.
  • Refine the process iteratively, trying to minimise the number of returned clusters.
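The steps above can be approximated with a simple greedy scheme: assign each POI to the nearest cluster within the chosen distance, open a new cluster when none is close enough, move each cluster to the centroid of its members, and repeat. The sketch below is one such approximation, not Proxia®'s actual algorithm; it works on already projected (x, y) points, and `cluster_pois`, `radius` and `max_rounds` are hypothetical names.

```python
import math


def _assign(points, centers, radius):
    """Put each point in the nearest cluster within radius, or open a new one."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for point in points:
        best, best_distance = None, radius
        for i, center in enumerate(centers):
            distance = math.dist(point, center)
            if distance <= best_distance:
                best, best_distance = i, distance
        if best is None:
            centers.append(point)      # no cluster nearby: this point seeds one
            groups.append([point])
        else:
            groups[best].append(point)
    return [group for group in groups if group]  # drop clusters left empty


def _centroid(group):
    n = len(group)
    return (sum(x for x, _ in group) / n, sum(y for _, y in group) / n)


def cluster_pois(points, radius, max_rounds=10):
    """Return a list of (x, y, count) clusters for projected points."""
    centers, groups = [], []
    for _ in range(max_rounds):
        groups = _assign(points, centers, radius)
        new_centers = [_centroid(group) for group in groups]
        if new_centers == centers:     # positions are stable: stop refining
            break
        centers = new_centers
    return [(cx, cy, len(group)) for (cx, cy), group in zip(centers, groups)]
```

The neighbouring distance would normally depend on the zoom level, since the same distance in metres covers very different numbers of pixels at different zooms.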

On the other hand, grouping just for the sake of it doesn’t make sense. We have to include other approaches to improve the user experience, for example group qualifiers. A group qualifier allows us to group similar POIS. Just imagine a system made up of medical centers and hosting facilities: proximity cannot be the only criterion for grouping a series of POIS, at least at medium zoom levels.

[Figure: mapas02]

Proxia® takes all these issues into account, allowing the system administrator to create different grouping qualifiers, structured in hierarchies. This lets us provide our clients with a grouping solution that is independent of the user’s device, is based on the zoom level, and considers only the POIS recovered in the retrieval step.

Information retrieval based on geographical and regional criteria

In this case we are not talking about representing POIS on a map, but about retrieving information based on criteria such as “in the surroundings”, “in this municipality”, “25 kilometers from here”, etc. You could think of it as the typical recommendation service: users who bought this product also bought these others.

Retrieving information based on regions is, in fact, quite easy: you only need to store an encoded postal address (country, region, state, municipality, town) with the information and look for matches. For this coding, we have used the Spanish INE database and the Portuguese Freguesías database.

The problem arises when you try to apply this idea to criteria such as “in the surroundings” or “25 kilometers from here”. The reason is that calculating the distance between two points is not a simple task. Computing the actual road-based distance is extremely complex, and even computing the straight-line distance needed for a radius search isn’t simple: since the Earth is not flat, you cannot use the Pythagorean theorem and need something like the Haversine formula instead.
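For reference, here is a minimal sketch of the Haversine formula, which gives the great-circle distance between two points on a sphere. It assumes a spherical Earth with the mean radius shown in the constant, so results are approximations; `haversine_km` is an illustrative name.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # min() guards against rounding pushing sqrt(a) slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))
```

A criterion like “25 kilometers from here” then becomes a comparison of `haversine_km(...)` against 25 for each candidate.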

Our approach is a batch process that stores these precomputed relationships. This gives us enough power and flexibility to return related information in an extremely short time.
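A batch job of this kind could look like the sketch below, which precomputes, for every POI, the other POIS within a given radius. It is only an illustration under my own assumptions: it compares every pair with the Haversine formula (quadratic cost, acceptable offline for moderate volumes), and `precompute_neighbours` and the dictionary layout are hypothetical.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def _haversine_km(a, b):
    """Great-circle distance between two (lat, lon) pairs in degrees."""
    phi1, phi2 = math.radians(a[0]), math.radians(b[0])
    d_phi = phi2 - phi1
    d_lambda = math.radians(b[1] - a[1])
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def precompute_neighbours(pois, radius_km):
    """pois: {id: (lat, lon)} -> {id: [(other_id, km), ...]} nearest first."""
    related = {poi_id: [] for poi_id in pois}
    for (id_a, pos_a), (id_b, pos_b) in combinations(pois.items(), 2):
        distance = _haversine_km(pos_a, pos_b)
        if distance <= radius_km:
            related[id_a].append((id_b, distance))
            related[id_b].append((id_a, distance))
    for neighbours in related.values():
        neighbours.sort(key=lambda pair: pair[1])
    return related
```

At query time, answering “what is within 25 kilometers of this POI” is then a single dictionary lookup instead of a distance computation.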

Credits

Map images provided by OpenStreetMap. Other icons used: