LLM Geocoding

I was playing around with Google's Geocoding API when, for fun and curiosity's sake, I thought to plug my address into o1. Imagine my surprise when it got the coordinates right down to the first 3 decimal places (a precision of within hundreds of meters)! Considering that it landed in the near vicinity without any internet access at all, that's pretty impressive. Naturally, the next step was to host a battle royale with a bunch of other widely used LLMs to see which is best at this "offline" geocoding (yes, I'm including closed-weights LLMs; just imagine you're in the book Project Hail Mary and you have access to every computer program ever).

I first created a dataset of 121 locations with the coordinates from Google Maps as the source of truth.

Next, I chose 7 LLMs to compare: o1, claude-3.7-sonnet, DeepSeek R1, DeepSeek v3, gemini-2.5-pro-preview, llama_3_1_405B_instruct, and llama_4_maverick_17B_128E_instruct.

Then, I gave the dataset with all of the coordinates removed to each LLM with this prompt:

Given following address: {address}, give the coordinates for each of them. I do not want any code to find these out or any services. Just give it your best shot to get the coordinates to 12 decimal places in Latitude and Longitude columns. Return ONLY a valid JSON object with exactly the following keys: {"Latitude": , "Longitude": } Do not include any additional commentary, explanations, or formatting.
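Even with a prompt that demands "ONLY a valid JSON object," models don't always comply, so some validation is needed before plotting anything. Here's a minimal sketch of that step; the `parse_coords` helper is my own illustration, not the actual code used for this experiment:

```python
import json

def parse_coords(raw: str):
    """Parse a model response expected to be a JSON object with
    Latitude/Longitude keys; return (lat, lon) or None if unusable."""
    # Models sometimes wrap the JSON in a markdown fence; strip it first.
    cleaned = raw.strip().removeprefix("```json").removesuffix("```").strip()
    try:
        obj = json.loads(cleaned)
        lat, lon = float(obj["Latitude"]), float(obj["Longitude"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return None
    # Reject out-of-range values rather than plotting garbage.
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None
    return lat, lon
```

Responses that fail to parse are simply dropped, which is one way the "Addresses Attempted" percentages in the table below can end up under 100%.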

Lastly, I plotted all of the given coordinates on a map.
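Scoring each prediction comes down to the great-circle distance between a model's coordinates and the Google Maps ground truth. A standard haversine implementation (my own sketch, not necessarily the exact code used here) looks like:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    R = 6371.0  # mean Earth radius, km
    p1, p2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlam / 2) ** 2
    return 2 * R * asin(sqrt(a))
```

As a sanity check, one degree of longitude at the equator is about 111 km, which lines up with why 3 decimal places of precision corresponds to roughly a hundred meters.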

As you can see, results vary widely depending on the model used and on the part of the world the point is located in. Here are some things I found interesting:

| Model | Mean Distance (km) | Median Distance (km) | Max Distance (km) | Min Distance (km) | Standard Deviation (km) | Addresses Attempted (%) | Cost |
|---|---|---|---|---|---|---|---|
| gemini_2_5_pro_preview | 1.04 | 0.63 | 7.68 | 0.001 | 1.25 | 94.21 | [1] |
| chatgpt_o1 | 1.34 | 0.84 | 11.27 | 0.004 | 1.51 | 100.00 | $11.14 |
| claude_3_7_sonnet | 1.67 | 1.19 | 12.16 | 0.001 | 1.88 | 100.00 | $0.32 |
| deepseek_r1 | 1.91 | 1.22 | 12.60 | 0.004 | 2.09 | 97.52 | [2] |
| llama_3_1_405B_instruct | 6.46 | 1.46 | 471.24 | 0.003 | 44.58 | 90.91 | [2] |
| llama_4_maverick_17B_128E_instruct | 16.07 | 1.57 | 1290.66 | 0.037 | 117.88 | 100.00 | [2] |
| deepseek_v3 | 36.78 | 1.35 | 4141.50 | 0.003 | 374.74 | 100.00 | [2] |
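For reference, each row of statistics above can be reproduced from a model's list of per-address error distances with a few lines of stdlib Python; this `summarize` helper is my own sketch:

```python
import statistics

def summarize(distances_km):
    """Summary stats (km) over the per-address error distances for one model."""
    return {
        "mean": statistics.mean(distances_km),
        "median": statistics.median(distances_km),
        "max": max(distances_km),
        "min": min(distances_km),
        "stdev": statistics.stdev(distances_km),  # sample standard deviation
    }
```

Note how a handful of catastrophic misses drag the mean far above the median for the llama and deepseek_v3 rows, while the standard deviation balloons accordingly.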

Conclusion

Can you use LLMs to geocode your addresses? Yes. Will it be accurate? Unless it's a really famous place, no.

It would be really interesting to see whether model creators decide to start explicitly folding geospatial data into their training. Google would probably have the best dataset of them all thanks to its Maps data, and its model already has the most accurate geocoding of those I tested. Perhaps Apple or Meta would excel with vision-based models, given the amount of EXIF-based geotagging metadata they collect from photos.

Appendix

[1] "Free" at the time of writing.

[2] I used Fireworks.AI for all of the open-weight models. Total cost was $0.64 combined.

[3] If you truly needed coordinates for your addresses you would use something like Google Maps Geocoding or Smarty :P

[4] I found it really difficult to get addresses for pretty much any country not in the G7. As a result, I had to pick some points of interest that were more likely to have an address, like a business, which also makes them more likely to appear in a model's training data than a random residential address would be.

Dataset Download: address_source_of_truth.csv