Having downloaded hundreds of GB of photos, I wanted to write my own photo search function. The approach I used is:

Use an LLM to generate a description of every photo.
Use reverse geocoding to map each photo to a location via its EXIF data.
Scan through the LLM output to find matches for the search term.
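
The scanning step can be sketched roughly as below. This is an illustration rather than my exact code: it assumes one UTF-8 text file per photo, named after the photo, and a plain case-insensitive substring match. The function name search_photos and the directory layout are made up for the example.

```python
from pathlib import Path


def search_photos(index_dir, term):
    """Return the names of description files whose text contains the term."""
    needle = term.casefold()
    matches = []
    # Assumed layout: one .txt file per photo, e.g. IMG_0001.txt (hypothetical).
    for path in sorted(Path(index_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        # Plain substring match, no ranking: results come back in file-name order.
        if needle in text.casefold():
            matches.append(path.stem)
    return matches
```

Calling search_photos("descriptions", "beach") would return the names of every photo whose description mentions a beach, in file-name order.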

I didn’t really have any good ideas for ranking matches, and so far the lack of ranking doesn’t seem to hurt the results.

LLM-wise, I started out with Google Gemini. Based on some testing, I reckoned describing every photo would cost 50-100 quid, so I moved on to using local AI. This worked OK, and after a week or two it had worked through the whole collection. Some photos were too large for local AI, so I fell back to Gemini for those.

For photo locations I used the Google Maps API. It turned out to be really easy to use, and getting an address out of it was simple. I reverse geocoded every photo and saved the results as text files on disk.
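
A minimal sketch of the pieces involved, assuming the Geocoding API's JSON endpoint with a latlng parameter and a response whose first result carries a formatted_address field. The helper names are hypothetical, the HTTP request itself is left out, and the EXIF coordinates are assumed to have already been read as degrees, minutes and seconds plus an N/S/E/W reference.

```python
import json
from urllib.parse import urlencode

GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"


def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert EXIF-style degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative in decimal notation.
    return -value if ref in ("S", "W") else value


def reverse_geocode_url(lat, lng, api_key):
    """Build the request URL for one photo (api_key is your own key)."""
    query = urlencode({"latlng": f"{lat:.6f},{lng:.6f}", "key": api_key})
    return f"{GEOCODE_ENDPOINT}?{query}"


def first_address(response_text):
    """Pull the first formatted address out of a saved response, or None."""
    data = json.loads(response_text)
    if data.get("status") != "OK" or not data.get("results"):
        return None
    return data["results"][0].get("formatted_address")
```

The response text can be written to disk as-is, which keeps the full result around in case more than the address is wanted later.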

All of this is actually only ~200 MB of data, so reading through it all on each request was pretty fast. Parsing the location JSON for each photo felt slow, though, so again I precomputed just the addresses, which made processing requests faster. There are further speedups to be had from parallelisation etc., but I haven’t done those yet.
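
The precompute step could look something like this. It is a sketch under assumptions: one saved geocoding response per photo as a .json file, the address taken from the first result's formatted_address field, and a single tab-separated output file. The file names and function names are invented for the example.

```python
import json
from pathlib import Path


def build_address_index(json_dir, out_file):
    """Write one 'photo<TAB>address' line per usable response; return the count."""
    count = 0
    with open(out_file, "w", encoding="utf-8") as out:
        for path in sorted(Path(json_dir).glob("*.json")):
            try:
                data = json.loads(path.read_text(encoding="utf-8"))
            except json.JSONDecodeError:
                continue  # skip unreadable responses rather than abort
            if not isinstance(data, dict):
                continue
            results = data.get("results") or []
            if not results:
                continue  # photo had no usable location
            address = results[0].get("formatted_address", "")
            # Collapse whitespace so every address stays on a single line.
            address = " ".join(address.split())
            out.write(f"{path.stem}\t{address}\n")
            count += 1
    return count


def load_address_index(index_file):
    """Read the flat index back into a {photo name: address} dict."""
    index = {}
    with open(index_file, encoding="utf-8") as f:
        for line in f:
            name, _, address = line.rstrip("\n").partition("\t")
            index[name] = address
    return index
```

The JSON parsing then happens once, up front, and each search only has to read one small text file.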

I used Claude to help with writing code, which was great.