EXPERT OPINION / PERSPECTIVE: We are currently witnessing a mobilization of technical ambition reminiscent of the Manhattan Project, a realization that data and compute are the new defining elements of national power. I’m deeply energized by recent bold moves in Washington, specifically the White House’s launch of the “Genesis Mission” this past November (an initiative designed to federate vast federal scientific datasets for integrated AI training) alongside the real-world deployment of GenAI.mil.
Yet when I look at the velocity of the commercial sector, from OpenAI launching its dedicated Science division and NVIDIA attempting to simulate the planet with Earth-2, to Google DeepMind aggressively carrying its AI breakthroughs into the geospatial domain, it becomes clear that we’re still aiming too low. These initiatives are not just modeling data; they’re attempting to model reality itself. American technical leadership is paramount, but that leadership is meaningless if it’s not ruthlessly and immediately applied to our national security framework. We must take these massive, reality-simulating concepts and focus them specifically on the GEOINT mission.
A perfect example: earlier this year, in July 2025, the geospatial world shifted. Google DeepMind released the AlphaEarth Foundations (AEF) model, and through the hard work of the Taylor Geospatial Engine (TGE) and the open-source community, these vector embeddings are now publicly available on Source Cooperative.
The excitement is justified. AlphaEarth is a leap forward because it offers pixel-level embeddings rather than the standard patch-level approach. It doesn’t just tell you “this 256×256 square contains a city”; it tells you “this specific pixel is part of a building, and it knows its neighbors.”
But as I look at this achievement from the perspective of national security, I see something else. I see a proof of concept for a capability that the United States is uniquely positioned to build, and must build, to maintain decision advantage.
Google has the internet’s data. But the intelligence community holds the most diverse, multi-physics, and temporally deep repository of the Earth in human history.
It’s time for the United States to propose and execute a National Geospatial-Intelligence Embedding Model (NGEM).
The Proposal: Beyond RGB
The AlphaEarth model is impressive, but it is limited by its training data, which is primarily commercial optical imagery. In the national security domain, an optical image is just the tip of the spear. We don't just see with light; we see with physics.
I’m proposing that we train a massive, pixel-level foundation model that ingests all of NGA’s holdings. We are not talking about just throwing more Sentinel-2 data at a GPU. We’re talking about a model that generates embeddings from a unified ingest of:
- Multi-INT Imagery: Electro-optical (EO), Synthetic Aperture Radar (SAR), Infrared/Thermal, Multispectral, and Hyperspectral.
- Vector Data: The massive stores of Foundation GEOINT (FG), such as roads, borders, and elevation meshes.
- The Critical Missing Modality: Text. We must embed the millions of intelligence reports, analyst notes, and finished intelligence products ever written.
The Approach: “The Unified Latent Space”
The approach would mirror the AlphaEarth architecture, producing 64-dimensional (or higher) vectors for every coordinate on Earth, but with a massive increase in complexity and utility.
In AlphaEarth, a pixel’s embedding vector encodes “visual similarity.” In an NGA NGEM, the embedding would encode phenomenological and semantic truth.
We would train the model to map different modalities into the same “latent space.”
- If a SAR image shows a T-72 tank (through radar returns), and an EO image shows a T-72 tank (through visual pixels), and a text report describes a “T-72 tank,” they should all map to nearly the same mathematical vector.
- The model becomes the universal translator. It doesn't matter if the input is a paragraph of text or a thermal signature; the output is a standardized mathematical representation of the object.
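To make the shared latent space idea concrete, here is a minimal sketch of what "nearly the same mathematical vector" could mean in practice: embeddings of the same object from different modalities score close to 1.0 under cosine similarity, while an unrelated object scores much lower. The four-dimensional vectors and their values are invented for illustration (a real model would emit 64 or more dimensions), and cosine similarity is an assumed choice of metric, not something the AlphaEarth release prescribes here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings; every value below is made up for illustration.
sar_tank = [0.90, 0.10, 0.05, 0.40]   # derived from radar returns
eo_tank = [0.85, 0.15, 0.00, 0.45]    # derived from optical pixels
text_tank = [0.88, 0.12, 0.03, 0.42]  # derived from a report mentioning the tank
eo_forest = [0.05, 0.90, 0.40, 0.10]  # an unrelated object, for contrast

# Same object seen through three modalities: similarities should be near 1.0.
same_object = [
    cosine_similarity(sar_tank, eo_tank),
    cosine_similarity(sar_tank, text_tank),
    cosine_similarity(eo_tank, text_tank),
]
# Different objects: similarity should be much lower.
different_object = cosine_similarity(sar_tank, eo_forest)

print("same object, different modalities:", [round(s, 3) for s in same_object])
print("different objects:", round(different_object, 3))
```

A training objective that pulls matching cross-modal pairs together and pushes mismatched pairs apart is one common way to obtain this behavior, though the article does not commit to a specific objective.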
The Results: What Does This Give Us?
If we achieve this, we move beyond “computer vision” into “machine understanding.”
1. The “SAM Site” Dimension
In the AlphaEarth analysis, researchers found a “dimension 27” that accidentally specialized in detecting airports. It was a serendipitous discovery of the model’s internal logic. If we train NGEM on NGA’s holdings, we won’t just find an airport dimension. We will likely find dimensions that correspond to specific national security targets.
- Dimension 14 might light up only for Surface-to-Air Missile (SAM) sites, regardless of whether they are camouflaged in optical imagery, because the thermal and SAR layers give them away.
- Dimension 42 might track “maritime logistics activity,” integrating port vectors with ship signatures.
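One simple way to look for such a specialized dimension, sketched here under stated assumptions, is to compare each dimension's average activation on pixels labeled as the target against its average on everything else. The function name, the three-dimensional toy embeddings, and the labels are all hypothetical; this is not a description of how the AlphaEarth researchers found their airport dimension.

```python
def most_selective_dimension(embeddings, labels):
    """Return (index, gap) for the dimension whose mean activation on
    positive pixels exceeds its mean on negative pixels by the widest margin."""
    positives = [e for e, y in zip(embeddings, labels) if y]
    negatives = [e for e, y in zip(embeddings, labels) if not y]
    best_index, best_gap = -1, float("-inf")
    for d in range(len(embeddings[0])):
        mean_pos = sum(e[d] for e in positives) / len(positives)
        mean_neg = sum(e[d] for e in negatives) / len(negatives)
        gap = mean_pos - mean_neg
        if gap > best_gap:
            best_index, best_gap = d, gap
    return best_index, best_gap

# Toy pixel embeddings with invented values; dimension 1 is built to
# respond to the labeled target class.
pixels = [
    [0.1, 0.9, 0.2],  # labeled target
    [0.2, 0.8, 0.1],  # labeled target
    [0.1, 0.1, 0.3],  # background
    [0.3, 0.0, 0.2],  # background
]
is_target = [True, True, False, False]

index, gap = most_selective_dimension(pixels, is_target)
print("most selective dimension:", index, "gap:", round(gap, 3))
```

In practice a target may be encoded across several dimensions at once, so a single-dimension scan like this is only a first diagnostic.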
2. Cross-Modal Search (Text-to-Pixel)
Currently, if an analyst wants to find “all airfields with extended runways in the Pacific,” they have to rely on tagged metadata or run a specific computer vision classifier. With a multi-modal embedding model, the analyst could simply type a query from a report: “Suspected construction of hardened aircraft shelters near distinct ridge line.” Because we embedded the text of millions of past reports alongside the imagery, the model understands the semantic vector of that phrase. It can then scan the entire globe’s pixel embeddings to find the mathematical match, instantly highlighting the location even if no human has ever tagged it.
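The search step described above reduces to nearest-neighbor ranking once text and pixels share one embedding space. The sketch below assumes a text encoder has already turned the analyst's query into a vector; the tile identifiers, the three-dimensional vectors, and the `search` helper are hypothetical, and a brute-force scan stands in for whatever index a production system would use.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vector, pixel_index, top_k=2):
    """Rank stored pixel embeddings by similarity to the query vector
    and return the top_k (score, location) pairs, best first."""
    scored = [
        (cosine_similarity(query_vector, vector), location)
        for location, vector in pixel_index.items()
    ]
    scored.sort(reverse=True)
    return scored[:top_k]

# Hypothetical index: location id -> embedding (values invented).
pixel_index = {
    "tile_a": [0.80, 0.10, 0.60],
    "tile_b": [0.75, 0.20, 0.62],
    "tile_c": [0.10, 0.95, 0.05],
}

# Assumed output of a text encoder for the analyst's query.
query_vector = [0.78, 0.15, 0.61]

results = search(query_vector, pixel_index, top_k=2)
for score, location in results:
    print(location, round(score, 4))
```

At global scale a linear scan over every pixel would be impractical, so an approximate nearest-neighbor index would likely replace the loop; the ranking logic stays the same.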
3. Vector-Based Change Detection
AlphaEarth showed us that subtracting vectors from 2018 and 2024 reveals construction. For the intelligence community, this becomes Automated Indications & Warning (I&W). Because the embeddings are spatially aware and pixel-dense, we can detect subtle shifts in the function of a facility, not just its footprint. A factory that suddenly starts emitting heat (thermal layer) or showing new material stockpiles (hyperspectral layer) will produce a massive shift in its vector embedding, triggering an alert long before a human analyst notices the visual change.
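The vector subtraction described above can be sketched as follows: take each location's embedding from two epochs, measure how far it moved, and raise an alert when the shift exceeds a threshold. The location names, embedding values, the Euclidean distance metric, and the 0.25 threshold are all assumptions chosen for illustration.

```python
import math

def embedding_shift(before, after):
    """Euclidean length of the difference between two embeddings."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(after, before)))

def flag_changes(epoch_a, epoch_b, threshold):
    """Return {location: shift} for locations whose embedding moved
    farther than the threshold between the two epochs."""
    alerts = {}
    for location, before in epoch_a.items():
        shift = embedding_shift(before, epoch_b[location])
        if shift > threshold:
            alerts[location] = shift
    return alerts

# Invented embeddings for two locations at two points in time.
embeddings_2018 = {"factory": [0.2, 0.1, 0.7], "field": [0.6, 0.30, 0.1]}
embeddings_2024 = {"factory": [0.2, 0.8, 0.7], "field": [0.6, 0.32, 0.1]}

alerts = flag_changes(embeddings_2018, embeddings_2024, threshold=0.25)
print(alerts)
```

Only the factory is flagged, because one of its dimensions (standing in for a thermal channel) jumped while the field barely moved. Choosing the threshold is the hard part operationally, since seasonal and sensor effects also move embeddings.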
The Intelligence Use Cases
- Automated Order of Battle: Instantly generating dynamic maps of military equipment by querying the embedding space for specific signatures (e.g., “Show me all vectors matching a mobile radar unit”).
- Underground Facility Detection: By combining vector terrain data, gravity/magnetic anomaly data, and hyperspectral surface disturbances into a single embedding, the model could “see” what is hidden.
- Pattern of Life Analysis: Because the model is spatiotemporal (like AlphaEarth), it learns the “heartbeat” of a location. Deviations, such as a port going silent or a sudden surge in RF activity, become mathematical anomalies that scream for attention.
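The pattern-of-life idea in the last bullet can be illustrated with a simple statistical sketch: treat a location's past embeddings as its baseline, then score a new observation by how unusual its distance from that baseline is. The two-dimensional vectors, the `anomaly_score` helper, and the z-score formulation are assumptions for illustration; an operational model would need a richer notion of normal, including periodic behavior.

```python
import math
import statistics

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def anomaly_score(history, latest):
    """Z-score of the latest embedding's distance from the historical
    centroid, relative to how far past observations sat from it."""
    dims = len(history[0])
    centroid = [statistics.fmean([h[d] for h in history]) for d in range(dims)]
    past = [distance(h, centroid) for h in history]
    return (distance(latest, centroid) - statistics.fmean(past)) / statistics.stdev(past)

# Invented history of one port's embedding over five observations.
port_history = [
    [0.50, 0.50],
    [0.52, 0.48],
    [0.49, 0.51],
    [0.51, 0.50],
    [0.48, 0.51],
]

normal_day = anomaly_score(port_history, [0.50, 0.49])
silent_port = anomaly_score(port_history, [0.10, 0.50])

print("normal day z-score:", round(normal_day, 2))
print("silent port z-score:", round(silent_port, 2))
```

A small absolute score means the observation fits the location's usual rhythm; a large one is the kind of deviation that would be routed to an analyst.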
Conclusion
Google and the open-source community have given us the blueprint with AlphaEarth. They proved that pixel-level, spatiotemporal embeddings are the superior way to model our changing planet.
But the mission requires more than commercial data. It requires the fusion of every sensor and every secret. By building this multi-modal embedding model, with fusion at the pixel level, we can stop looking for needles in haystacks and start using a magnet.
This is the future of GEOINT. We have the data. We have the mission. It’s time to build the model.
Follow Mark Munsell on LinkedIn.
The Cipher Brief is committed to publishing a range of perspectives on national security issues submitted by deeply experienced national security professionals.
Opinions expressed are those of the author and do not represent the views or opinions of The Cipher Brief.










