Google's AlphaEarth Foundations embeddings turned land use classification into a simple clustering problem. A non-expert produced granular land use maps in 2 weeks, including building all the necessary tools from scratch. Traditional approaches typically require months of expert analysis, manual labeling, and ground truth validation.

[Interactive map: 2024 land use classification for the Auroville bio-region]

AlphaEarth Eliminates Remote Sensing Complexity

Traditional satellite analysis requires spectral interpretation, atmospheric correction, and temporal compositing. AlphaEarth Foundations removes this complexity by providing pre-computed 64-dimensional embeddings, one per pixel, that encode satellite observations with spatial, temporal, and multi-spectral context for a specified time period.

The study area covers 49 square kilometers of central Auroville and the surrounding Tamil villages. At 10-meter resolution, the 7 km × 7 km area is a 700 × 700 grid, which gives us 490,000 pixels.
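The pixel count follows from simple arithmetic. The short check below reproduces it and also estimates the in-memory size of the embedding matrix, assuming each value is stored as a 4-byte float32 (an assumption for illustration, not something stated above).

```python
# Study-area arithmetic: a 7 km x 7 km square sampled at 10 m per pixel.
side_m = 7_000          # side length of the study area in meters
pixel_m = 10            # ground resolution of one embedding pixel
embedding_dims = 64     # dimensions per AlphaEarth embedding
bytes_per_value = 4     # ASSUMPTION: values held in memory as float32

pixels_per_side = side_m // pixel_m      # 700
n_pixels = pixels_per_side ** 2          # 490000
area_km2 = (side_m / 1000) ** 2          # 49.0

# Size of the full (n_pixels, 64) matrix under the float32 assumption.
embedding_mb = n_pixels * embedding_dims * bytes_per_value / 1e6

print(f"{pixels_per_side} x {pixels_per_side} = {n_pixels:,} pixels")
print(f"area: {area_km2:.0f} km^2, matrix size: {embedding_mb:.2f} MB")
```

At roughly 125 MB under that assumption, the whole matrix fits comfortably in laptop memory.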

Development Velocity: 2 Weeks from Zero to Maps

We built alpha-bhu for ML processing of AlphaEarth Foundations embeddings and geo-darshan for interactive web visualization.

One human non-expert (@restlessronin) working with (omniscient ;-) AI collaborators built the tooling in 2 weeks. Development included multiple failed clustering approaches, tool architecture changes, and UX iterations.

Three-Step Workflow

1. Download AlphaEarth embeddings from Google Earth Engine to Google Drive (the export finishes in minutes).
2. Run FAISS K-means on the 490,000 pixels across 64 dimensions, with no preprocessing and no hyperparameter tuning beyond choosing k.
3. Evaluate the clusters visually and assign labels by hand.
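The first two steps can be sketched roughly as follows. This is not the alpha-bhu code: the function names, Drive folder, and task description are hypothetical, and the dataset ID is the one we believe the Earth Engine catalog uses for the annual embeddings, so check it before relying on it. The sketch also assumes the exported GeoTIFF has already been read into a NumPy array of shape (64, rows, cols), for example with a raster library.

```python
import numpy as np


def export_embeddings_to_drive(bounds, year=2024, project=None):
    """Step 1: queue an Earth Engine export of one year of embeddings.

    `bounds` is [west, south, east, north] in degrees. The folder and
    description below are placeholders, not names from the article.
    """
    import ee  # imported lazily so the clustering step works without it

    ee.Initialize(project=project)  # assumes credentials are already set up
    region = ee.Geometry.Rectangle(bounds)
    image = (
        # ASSUMPTION: catalog ID of the annual satellite embedding dataset.
        ee.ImageCollection("GOOGLE/SATELLITE_EMBEDDING/V1/ANNUAL")
        .filterDate(f"{year}-01-01", f"{year + 1}-01-01")
        .filterBounds(region)
        .mosaic()
    )
    task = ee.batch.Export.image.toDrive(
        image=image,
        description=f"embeddings_{year}",
        folder="alphaearth_exports",
        region=region,
        scale=10,  # meters per pixel
        maxPixels=int(1e9),
    )
    task.start()
    return task


def flatten_embeddings(cube):
    """Turn a (bands, rows, cols) raster into (rows * cols, bands) rows."""
    bands = cube.shape[0]
    pixels = cube.reshape(bands, -1).T
    # FAISS expects a C-contiguous float32 matrix.
    return np.ascontiguousarray(pixels, dtype=np.float32)


def cluster_pixels(pixels, k, seed=0):
    """Step 2: FAISS K-means; returns one cluster id per pixel."""
    import faiss

    kmeans = faiss.Kmeans(pixels.shape[1], k, niter=20, seed=seed)
    kmeans.train(pixels)
    # Assign each pixel to its nearest centroid.
    _, ids = kmeans.index.search(pixels, 1)
    return ids.ravel()
```

A call such as `cluster_pixels(flatten_embeddings(cube), k=22).reshape(cube.shape[1:])` would then give a 2-D cluster map ready for visual inspection in step 3.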

Mathematical cluster metrics proved less effective than human interpretation.

Multi-Scale Clustering Strategy

Failed attempts at k=10 taught us to over-cluster intentionally, then merge. Low k values mixed different land uses within a single cluster. At k=22, land use types separate cleanly into individual clusters, which we then merge into ~10 final categories. At k=44 and k=88, finer detail emerges within each land use type: crop varieties, building density gradients, and roads.
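The merge step amounts to a lookup from fine cluster ids to a smaller set of categories. Below is a minimal sketch of that idea, using a toy six-cluster map; the function name, category names, and assignments are made up for illustration and are not the ones used for the Auroville maps.

```python
import numpy as np


def merge_clusters(cluster_map, cluster_to_category, category_names):
    """Relabel a 2-D map of cluster ids with indices into category_names."""
    size = max(int(cluster_map.max()), max(cluster_to_category)) + 1
    # -1 marks clusters that nobody has assigned to a category yet.
    lookup = np.full(size, -1, dtype=np.int16)
    for cluster_id, name in cluster_to_category.items():
        lookup[cluster_id] = category_names.index(name)
    merged = lookup[cluster_map]
    if (merged < 0).any():
        raise ValueError("some clusters have no category assigned")
    return merged


# HYPOTHETICAL example: six fine clusters merged into four categories.
category_names = ["water", "built", "orchard", "fallow"]
cluster_to_category = {
    0: "water",
    1: "built",    # dense built-up
    2: "built",    # sparse built-up
    3: "orchard",
    4: "orchard",
    5: "fallow",
}
cluster_map = np.array([[0, 1, 2],
                        [3, 4, 5]])

merged = merge_clusters(cluster_map, cluster_to_category, category_names)
print(merged)
```

Keeping the assignment in a plain dictionary makes the manual labeling step easy to revise: changing one entry regroups every pixel in that cluster.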

Practical Results

The clustering identified fallow fields, established orchards (cashew, casuarina, coconut), dense to sparse built environments, planted restoration forests, barren areas, roads, and water bodies. Different orchard types emerged clearly, revealing agricultural diversity that would require extensive field surveys using traditional methods.

Why This Approach Works

AlphaEarth's pre-computed embeddings shift the bottleneck from complex image processing to simple downloading. Clustering half a million pixels takes seconds on a MacBook Air. No human expertise in remote sensing or ML is required; basic Python plus AI-assisted development is enough for rapid prototyping with immediate visual feedback.

Current Limitations

Evaluation is visual only, with no quantitative validation. The analysis covers a single time period. Cluster interpretation is manual.

Credits

Most code by @claude-4-sonnet (alpha-bhu ML processing, geo-darshan web mapping interface, and website integration), with @grok-4 (when rate limits hit, and for hard-to-debug problems) and @gemini-2.5-pro (when rate limits hit). Article copy by @claude-4-sonnet.

Auroville GIS expertise provided by Pitchandikulam Forest, especially Azhagappan Mani.

Showrunner: @restlessronin.