A recent trend that has popped up since o3 and o4-mini has released has been to submit an image of a location and tell the LLM to guess where it is.
This is actually a pretty close concept to my previous post where I attempted to geocode coordinates based off the names of their locations.
I wanted to see how good the latest OpenAI model (o4-mini-high) was at this, so I decided to create a small 23 location dataset and provide my report below.
First, to create my dataset, I just jumped around on Google Maps like last time, but with the twist of being on street-view for this post. I also had to record the coordinates of the point in street-view so I could get the distance-error later on.
Next, I simply had to create a script to continously prompt the OpenAI API. Here is the prompt I used:
Look at this image and determine its geographic location.
Return ONLY a JSON object with the following fields:
- id: the image ID (provided)
- file: the image file path (provided)
- lat: latitude as a float
- lon: longitude as a float
- country: country name
- city: just the name of the city or the closest city/town
- notes: where you think the location is in English
Example format:
{"id":"000016", "file":"images/000016.png", "lat":35.64758313972121, "lon":139.62882171061867, "country":"Japan", "city":"Tokyo", "notes":""}
Do not include any explanations, just the JSON object.
Finally, I just had to plot the ground truth and the LLM responses on a map.
Considering that all o4-mini had to go on was a picture, it did really well! I think that it would have done better than 99% of people on the planet at least (perhaps there's some CIA analyst who's really good at GeoGuessr).
Next, I wanted to see how it did quantitatively.
Distance Error | Country (Ground Truth) | Country (Prediction) | Country Correct | City (Ground Truth) | City (Prediction) | City Correct |
---|---|---|---|---|---|---|
4.13 km | United States | United States | True | Denver | Denver | True |
0.46 km | United States | United States | True | Monterey Park | Monterey Park | True |
31.71 km | United States | United States | True | St. Petersburg | Palmetto | False |
174.36 km | United States | United States | True | Hoisington | Norton | False |
0.16 km | United States | United States | True | New York | New York | True |
0.78 km | France | France | True | Paris | Paris | True |
200.46 km | United Kingdom | United Kingdom | True | Oxford | Norwich | False |
0.75 km | Germany | Germany | True | Cologne | Cologne | True |
0.55 km | Kenya | Kenya | True | Nairobi | Nairobi | True |
2988.64 km | Democratic Republic of the Congo | South Africa | False | Yangambi | Mbombela | False |
0.43 km | Greece | Greece | True | Athens | Athens | True |
292.48 km | Taiwan | Taiwan | True | Taipei | Kaohsiung | False |
0.56 km | China | Hong Kong | False | Hong Kong | Hong Kong | True |
9044.68 km | China | Canada | False | Shanghai | Vancouver | False |
5.31 km | Australia | Australia | True | Auburn | Sydney | False |
0.55 km | Japan | Japan | True | Tokyo | Tokyo | True |
1.42 km | Brazil | Brazil | True | Rio de Janeiro | Rio de Janeiro | True |
0.86 km | Canada | Canada | True | Surrey | Surrey | True |
1.07 km | Canada | Canada | True | Montreal | Montreal | True |
0.49 km | Norway | Norway | True | Oslo | Oslo | True |
1531.21 km | Spain | Morocco | False | Barcelona | Essaouira | False |
1.92 km | Norway | Norway | True | Longyearbyen | Longyearbyen | True |
1.23 km | United States | United States | True | Salt Lake City | Salt Lake City | True |
Some things I found interesting:
Like the LLM geocoding test I did previously, o4-mini seems to do best if the location is well known enough to be placed in its training data.
The conclusion that I can take away from this is that if you post an image of yourself with enough background showing, it is likely that you can be geo-located to the city (with a fairly substantial margin of error to be fair). This does pose some concerns on how well you can conceal your location if you simply plan on posting your activities on social media. However, if you don't care that your smartphone probably exposes your location through EXIF metadata anyways[2], post away!
[1] If you count the Hong Kong city/country thing as true, the percentage of countries it got correct is 87%.
[2] Yes, I am aware that most social media websites strip EXIF data for you beforehand!
[3] Total cost of this project was $1.23 to OpenAI for their API.
Dataset Download: dataset.7z