GEO-MLLMs : MLLMs Assisted Image Geolocation

This is not published work. It's only an idea with some code behind it. This page will be updated with publication updates and other updates with time.

Demo

Description

Image geolocation refers to the process of automatically estimating, or finding the location of an image based on just the pixels in the image and no other metadata. Image geolocation is often done by using subtle clues and signs in an image (e.g., French on a signpost suggests the image originates from a french speaking country). Image geolocation is a pertinent topic, especially considering the attribution and verification of media originating from warzones or conflicts. This project proposes a multipart system (GEO-MLLMs) for estimating the location of an image using a MMLLM (Multi-modal Large Language Model), and a pre-existing geolocation method. We predict that our MLLM-based approach will outperform existing approaches to image geolocation on benchmark datasets (yfcc4k, im2gp3k).

How It Works:

Previously, methods attempting image geolocation have used specialized models which are meant for the specific task of image geolocation. GEO-MLLMs changes this by applying a MMLLM, which is meant for general usage and shows common-sense reasoning on some tasks, on the task of image geolocation. GEO-MLLMs first processes the image through an existing AI image geolocation tool, called GEOClip, and then produces 5 sets of predictions from GEOClip for what the image's location might be. These 5 predictions are then fed into the GPT 4o Mini MLLM along with a prompt to guess the image's true location.

I used the GPT 4o Mini MLLM over other MLLMs because of it's superior performance than most other available MLLMs.

Results:

GPT 4o Mini batch calls JSONL files:

(https://www.kaggle.com/datasets/sahalmulki/gpt-4o-json-files)

YFCC and IM2GPS subsets

https://www.kaggle.com/datasets/sahalmulki/llms-for-image-geolocation-benchmarks

Name		Name	Last commit message	Last commit date
Latest commit History 36 Commits
22image_720.png		22image_720.png
2image_720.png		2image_720.png
GEO_MLLMs_Demo.ipynb		GEO_MLLMs_Demo.ipynb
README.md		README.md
geoclip_predictions_im2gps3k.csv		geoclip_predictions_im2gps3k.csv
geoclip_predictions_yfcc4k.csv		geoclip_predictions_yfcc4k.csv
geoclip_preds_gen_im2gps3k.ipynb		geoclip_preds_gen_im2gps3k.ipynb
geoclip_preds_gen_yfcc4k.ipynb		geoclip_preds_gen_yfcc4k.ipynb
gpt-batch-utils.ipynb		gpt-batch-utils.ipynb
im2gps.pdf		im2gps.pdf
im2gps_page-02001.jpg		im2gps_page-02001.jpg
image.png		image.png
image_720.png		image_720.png
image_7220.png		image_7220.png
median_page-0002221.jpg		median_page-0002221.jpg
notebook4be782c702.ipynb		notebook4be782c702.ipynb
ollama-testing.ipynb		ollama-testing.ipynb
parsing-results.ipynb		parsing-results.ipynb
yfcc.pdf		yfcc.pdf
yfcc_page-00201.jpg		yfcc_page-00201.jpg

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

GEO-MLLMs : MLLMs Assisted Image Geolocation

Demo

Description

How It Works:

Results:

GPT 4o Mini batch calls JSONL files:

YFCC and IM2GPS subsets

About

Releases

Packages

Languages

sahal-mulki/geo-llm

Folders and files

Latest commit

History

Repository files navigation

GEO-MLLMs : MLLMs Assisted Image Geolocation

Demo

Description

How It Works:

Results:

GPT 4o Mini batch calls JSONL files:

YFCC and IM2GPS subsets

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages