Suppose you have a travel planning tool, like Wanderlog, and you need to display the details for all the places in a plan. Fetching the place details for each of those items, multiplied by the number of users, multiplied by the number of times an average user views or edits their plan, means that even at a fairly modest scale, the startup will go bankrupt.
Now, suppose you have the money to pay: since Google's Place Details API does not support bulk fetching, you need to fetch the details for each place in your plan separately. The only performance-viable approach is to spawn multiple async tasks and fetch those details in parallel. However, you'd hit the rate limit very quickly.
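The fan-out pattern I mean looks roughly like this: parallel async tasks with a semaphore capping concurrency so you don't blow through the quota instantly. This is a minimal sketch; the fetch body is a placeholder, not the real Place Details call, and `MAX_CONCURRENT` is a made-up tuning knob.

```python
import asyncio

MAX_CONCURRENT = 10  # hypothetical cap; tune to your actual QPS quota

async def fetch_place_details(place_id: str, sem: asyncio.Semaphore) -> dict:
    """Fetch details for one place, bounded by the semaphore.

    The body is a stand-in: in a real app you'd make the HTTP call
    to the Place Details endpoint here.
    """
    async with sem:
        await asyncio.sleep(0)  # placeholder for the HTTP round-trip
        return {"place_id": place_id, "status": "OK"}

async def fetch_all(place_ids: list[str]) -> list[dict]:
    # One task per place; the semaphore keeps at most
    # MAX_CONCURRENT requests in flight at any time.
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    tasks = [fetch_place_details(pid, sem) for pid in place_ids]
    return await asyncio.gather(*tasks)  # preserves input order

results = asyncio.run(fetch_all([f"place_{i}" for i in range(25)]))
```

Even with this in place, the cap only smooths the burst; the aggregate request volume (and the bill) is unchanged, which is the real problem.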
What about caching? The current caching terms of use, unlike some years ago, only allow you to cache the latitude/longitude and the place ID (not even the name!).
So either companies like Wanderlog/TripIt/etc. simply cache all the details (which violates the terms), or there is another solution I'm missing here.
Would love to hear your experiences or simply opinions on the matter.
Google tends to have a greater number of commercial POIs (since it caters to potential advertisers), but OSM is catching up in this area as well.
Full disclosure: GeoDesk develops a spatial database engine for OpenStreetMap data, so I'm obviously biased on this issue.
Are there any plug-and-play options where I can just download a docker image and get a simple backend that can be used for frontend maps and an address auto-complete? Alternatively a separate docker image for the tile server (if that is the right terminology). Preferably with an option to only include data for certain geographical areas to save space.
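There are reasonably plug-and-play options along these lines. A sketch of one possible setup (image names and flags are from memory and may have changed, so verify against each project's docs before relying on them): `overv/openstreetmap-tile-server` for raster tiles and `mediagis/nominatim` for geocoding/address search, both fed a regional Geofabrik extract to keep disk usage down.

```shell
# Download a regional extract (Geofabrik offers per-country/per-region files),
# which addresses the "only certain geographical areas" requirement.
wget https://download.geofabrik.de/europe/monaco-latest.osm.pbf

# Tile server: one-time import into PostGIS, then serve tiles on :8080.
docker run -v "$(pwd)/monaco-latest.osm.pbf:/data/region.osm.pbf" \
    -v osm-data:/data/database overv/openstreetmap-tile-server import
docker run -p 8080:80 -v osm-data:/data/database \
    -d overv/openstreetmap-tile-server run

# Geocoder for address search / autocomplete-style lookups.
docker run -e PBF_URL=https://download.geofabrik.de/europe/monaco-latest.osm.pbf \
    -p 8081:8080 -d mediagis/nominatim
```

For a nicer-as-you-type autocomplete experience, Photon (built on OSM data) is also worth a look, and vector-tile stacks exist as alternatives to the raster tile server above.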
Theoretically, we'd like to merge OSM data with some licensed databases. And then perhaps it could "compete" with Google's offering.
However, we're not sure which sources are good for the licensed DBs, or how much these licenses cost. Is this feasible?
Combining datasets: You'll have to check the terms of the license agreements. In general, for ODbL-licensed data (like OSM), if you generate a "derivative database" (e.g. by merging multiple datasets), you will have to release this derivative database under ODbL terms as well. This would essentially rule out commercial data.
One way companies avoid this is by segregating the datasets, e.g. use a commercial dataset for hotel data, and use OSM data for streets and non-hotel POIs, without ever combining them into a single database. (But you'll have to check with an IP lawyer to be certain about the correct way to implement this).
This sort of thing is problematic for travel startups that help users plan or share venues/places, for example. We've already heard some potential users of our product say "we only use and trust Google for maps/places", and realized the bar is quite high in terms of what we have to offer in that respect.
Regarding the ODbL-licensed OSM data, good that you mentioned that! Do you happen to know the likely magnitude of costs for licensed databases that would cover the majority of businesses/venues/places in the world (excluding non-touristic countries, e.g. North Korea, Afghanistan, etc.)?
I'm trying to understand whether that's really an option, or whether you need Google's resources to set up this sort of thing. (We do have the tech and engineering talent on our team; we're just somewhat short on financial resources.)
Thank you!
Licensed databases are going to be 'call for price', but you might see if you can get free/low cost feeds from companies that want to sell something: food ordering apps might let you have a list of restaurants if you link the restaurant to their app; you might be able to get a hotel feed from an OTA like Expedia, etc. Maybe Yelp has a feed??
Otherwise, you've got to find data sources and figure out how much they want, or talk to your legal team and build crawlers. I used to work at Yahoo! Travel, and we had a lot of data sources. I can't remember all of them, and I was never involved in pricing discussions, but I think we tended to include attribution, so you could dig through the Internet Archive. The base hotel feed came from Sabre; restaurants came from multiple sources (including Y! Local, which had its own sources); points of interest and city information came from guidebook providers like Fodor's and Lonely Planet, and I think a couple of others. Ski conditions came from OnTheSnow (which was a pleasure to work with).
They could of course charge 1M USD per request or whatever, but then nobody would use their APIs. And if that's what they want, it's easier to just stop providing third-party access, no?
I mean, obviously (?) the goal is to make money. But if you price it such that only very big players can afford it... then... I feel like I'm missing a basic economics explanation here.
Sure, greed and all that, but... I am still baffled so it's clear that I don't understand.
Hopefully someone can provide some light on this.
Data: it seems it's mostly OSM, which leads us to think we're better off building an OSM database ourselves and putting that content into a maps service that handles tile serving. But I'm not sure how cost-effective that would be.
If you just need map tiles, there are specialized providers (e.g. thunderforest.com) that have far more competitive pricing.