Geospatial annotation with LabelMe and Segment Anything
I joined Robin Cole, who writes the satellite-image-deep-learning newsletter, for a podcast about annotating satellite imagery with LabelMe and Segment Anything. We also recorded a hands-on demo on real NAIP tiles.
- Substack post
- Podcast on YouTube (12 min)
- Demo on YouTube (16 min)
From robotics to satellites
I built LabelMe to label data for a robotics challenge. Today it's used for medical scans, factory defects, drone footage, and increasingly satellite imagery. The satellite case is interesting because three things landed in the last two years: SAM and SAM2 made click-prompted segmentation reliable, SAM3 added box-prompted segmentation that returns multiple shapes, and LabelMe v6 started opening multispectral and float32 GeoTIFFs that v5 refused to load.
A single aerial tile can hold hundreds of buildings, vehicles, or fields. Drawing polygons around each one by hand does not scale. SAM does.
Prompt and verify
Robin's framing for where annotation is heading: the model proposes, the human verifies. You stop drawing and start prompting. Click a building, get a mask. The rest is review: delete what's wrong, nudge what's close, leave the rest alone.
On clean tiles with well-defined objects, review takes 5–15 seconds per image. Drawing the same polygon by hand takes 30–60 seconds.
Why boxes beat text for satellite
One thing we got into on the podcast: text-prompted segmentation works well on consumer photos but tends to fall over on overhead imagery. The visual priors of "car" or "building" learned from web images don't match what those things look like from above, where every car is a top-down rectangle and every building is a roof.
Box prompting sidesteps that. You give the model a region and ask "what is in here?" No language grounding required. SAM2 and SAM3 are strong at this. The new SAM3 AI-Box mode in v6.1 takes a single box and returns multiple shapes, which is what you want on a tile full of cars or storage tanks.
End-to-end demo
The demo walks through one full loop. Open a NAIP aerial tile in LabelMe, annotate with AI-assisted polygons, export the result, and load the GeoJSON in QGIS. Same coordinate system, no manual reprojection.
The tile we used is 26,456 by 22,892 pixels, roughly 606 megapixels. LabelMe v6 opens it; v5 wouldn't.
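To get a feel for why a tile this size is demanding, here is a back-of-the-envelope estimate of its uncompressed size in memory. The pixel dimensions come from the demo tile; the band count (4) and the sample types (uint8 and float32) are assumptions for illustration, not values read from the file.

```python
# Tile dimensions quoted above.
WIDTH_PX = 26_456
HEIGHT_PX = 22_892

# Assumptions for illustration only: 4 bands, stored either as
# uint8 (1 byte per sample) or float32 (4 bytes per sample).
BANDS = 4
BYTES_UINT8 = 1
BYTES_FLOAT32 = 4


def raw_size_bytes(width, height, bands, bytes_per_sample):
    """Uncompressed size of a raster in bytes, ignoring any overhead."""
    return width * height * bands * bytes_per_sample


pixels = WIDTH_PX * HEIGHT_PX
uint8_bytes = raw_size_bytes(WIDTH_PX, HEIGHT_PX, BANDS, BYTES_UINT8)
float32_bytes = raw_size_bytes(WIDTH_PX, HEIGHT_PX, BANDS, BYTES_FLOAT32)

print(f"{pixels:,} pixels")
print(f"uint8:   {uint8_bytes / 2**30:.2f} GiB")
print(f"float32: {float32_bytes / 2**30:.2f} GiB")
```

Under these assumptions the raw array runs to gigabytes before any annotation is drawn, which is why a tool has to be deliberate about how it reads and displays a tile like this.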

The tile, annotations, and full pipeline are on GitHub.
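For readers curious about what the export step has to do conceptually, here is a minimal sketch. It is not LabelMe's export code. It assumes a LabelMe-style JSON document with polygon `shapes` in pixel coordinates and a GDAL-style six-element geotransform for the tile; the function names and the geotransform values are hypothetical. The output coordinates stay in the raster's own CRS, so the sketch does no reprojection.

```python
import json


def pixel_to_geo(col, row, gt):
    """Map a pixel (col, row) to map coordinates using a GDAL-style
    geotransform: (x_origin, pixel_width, row_rotation,
                   y_origin, col_rotation, pixel_height)."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


def shapes_to_geojson(labelme_doc, gt):
    """Convert polygon shapes to a GeoJSON FeatureCollection.
    Coordinates remain in the raster's CRS (no reprojection)."""
    features = []
    for shape in labelme_doc.get("shapes", []):
        if shape.get("shape_type") != "polygon":
            continue  # this sketch handles polygons only
        ring = [list(pixel_to_geo(x, y, gt)) for x, y in shape["points"]]
        if len(ring) < 3:
            continue  # not a valid polygon
        if ring[0] != ring[-1]:
            ring.append(list(ring[0]))  # GeoJSON rings must be closed
        features.append({
            "type": "Feature",
            "properties": {"label": shape.get("label")},
            "geometry": {"type": "Polygon", "coordinates": [ring]},
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical example: 1 m pixels, north-up, origin at (500000, 4100000).
example_gt = (500000.0, 1.0, 0.0, 4100000.0, 0.0, -1.0)
example_doc = {
    "shapes": [{
        "label": "building",
        "shape_type": "polygon",
        "points": [[0, 0], [10, 0], [10, 10], [0, 10]],
    }]
}
collection = shapes_to_geojson(example_doc, example_gt)
print(json.dumps(collection, indent=2))
```

The negative pixel height reflects that image rows increase downward while northing increases upward. In a real pipeline the geotransform and CRS would be read from the GeoTIFF itself rather than typed in by hand.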
Offline by default
Geospatial datasets are often sensitive: defense, agriculture under NDA, infrastructure imagery. Everything in this workflow runs on your machine. SAM2 and SAM3 weights download once and run locally; tiles, prompts, and masks stay on disk. More on this in "Why offline-first annotation matters".
If you work on Earth observation, satellite-image-deep-learning is the newsletter to read. Recent episodes covered Tessera, InstaGeo, AutoML for spaceborne AI, and methane plume detection.
LabelMe Pro includes the SAM2/SAM3 annotation suite and the export toolkit. $79, one-time.
LabelMe is an offline-first annotation tool with built-in AI.