Planetary Sensing Engine
Open-source spatiotemporal data fusion for Earth observation, climate science, and planetary intelligence.
PSE ingests data from six live sources, normalizes everything into a common spatiotemporal model, and serves the result through a single REST API. Quality-weighted fusion blends overlapping sources by reliability and recency. Full provenance metadata traces every data point back to its origin.
Built for the Pangeo scientific Python ecosystem — xarray, Dask, Zarr, and the tools that climate scientists and Earth observation researchers already use.
6 connectors · Quality-weighted fusion · REST API · xarray · Apache 2.0
THE ENGINE
What PSE does
Earth observation data is fragmented. Temperature comes from ECMWF. Solar irradiance comes from the Global Solar Atlas. Satellite imagery comes from Copernicus. Infrastructure maps come from OpenStreetMap. Socioeconomic indicators come from the World Bank. Each source has its own format, coordinate system, temporal resolution, and access method.
PSE unifies them. It ingests data from multiple sources, normalizes everything into a common spatiotemporal model, and serves the result through a single REST API. When multiple sources provide the same variable, PSE uses quality-weighted fusion — blending them based on reliability and recency rather than picking one. Every response carries full provenance metadata, tracing each data point back to its original source and transformation history.
The output is a standard xarray Dataset, fully compatible with the Pangeo scientific Python ecosystem: Dask, Zarr, and the other tools that climate scientists and Earth observation researchers already use.
ARCHITECTURE
How PSE fits in the Northflow platform
PSE and HGE are two distinct engines that serve different functions in the Northflow architecture.
PSE senses.
It ingests, fuses, and serves observational data about the physical world.
HGE thinks.
It generates, ranks, and evaluates scientific hypotheses from data.
Domain adapters connect to whichever engines they need. FLUX (renewable energy intelligence) is built on PSE. CERES (famine early warning) is built on HGE. As the platform matures, adapters will draw from both engines simultaneously — PSE providing the observations, HGE generating the hypotheses.
[Diagram: Domain Adapters (FLUX · CERES · MARVIS · AION · Laplace · ...) sit on top of two engines, PSE (senses) and HGE (thinks). Arrow labels: Earth Data; Data & Hypotheses.]
DATA SOURCES
Six live data connectors
Each connector implements a common interface, making it straightforward to add new sources.
ERA5
ECMWF Climate Reanalysis
Global atmospheric reanalysis produced by the European Centre for Medium-Range Weather Forecasts. Covers temperature, wind components, solar radiation, precipitation, humidity, surface pressure, and more. PSE handles async job submission, polling, and download through the Copernicus Climate Data Store API.
Resolution: 0.25° (~28 km)
Update: Daily, 5-day lag
Open-Meteo
Weather Archive and Forecast
Open-access weather data providing historical archive and short-range forecasts. 24 variables including temperature, wind, humidity, solar irradiance, cloud cover, soil moisture, and evapotranspiration. Hourly resolution. PSE automatically routes requests between the archive and forecast APIs based on a 5-day cutoff and fetches grid points in parallel.
Resolution: Hourly
Update: Continuous
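The archive/forecast routing described above can be sketched as a date-range split. This is a minimal illustration, not PSE's actual code: the function name `split_request` is hypothetical, the two URLs are Open-Meteo's public archive and forecast endpoints, and treating the cutoff day itself as archive data is an assumption.

```python
from datetime import date, timedelta

ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"
FORECAST_URL = "https://api.open-meteo.com/v1/forecast"
CUTOFF_DAYS = 5  # the 5-day cutoff mentioned in the text


def split_request(start: date, end: date, today: date) -> list[tuple[str, date, date]]:
    """Split the inclusive range [start, end] into archive and forecast parts.

    Assumption: dates on or before (today - 5 days) go to the archive API,
    later dates go to the forecast API.
    """
    if end < start:
        raise ValueError("end must not be before start")
    cutoff = today - timedelta(days=CUTOFF_DAYS)
    parts = []
    if start <= cutoff:
        parts.append((ARCHIVE_URL, start, min(end, cutoff)))
    if end > cutoff:
        parts.append((FORECAST_URL, max(start, cutoff + timedelta(days=1)), end))
    return parts
```

A range that straddles the cutoff yields two sub-requests, which a connector could then fetch in parallel and concatenate along the time axis.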
Global Solar Atlas
Solar Resource Data
Long-term solar resource data from the World Bank. Provides Global Horizontal Irradiance (GHI), Direct Normal Irradiance (DNI), Diffuse Irradiance (DIF), Global Tilted Irradiance (GTI), and photovoltaic power output potential. Delivered as monthly climatology with annual totals.
Resolution: ~1 km
Update: Monthly climatology
Sentinel-2
Optical Satellite Imagery
Multispectral satellite imagery from ESA’s Sentinel-2 constellation via the Copernicus Data Space Ecosystem. PSE derives NDVI (vegetation index), NDWI (water index), and a 4-class land use classification from the Scene Classification Layer. OAuth2 authentication with automatic cloud filtering.
Resolution: 10–60 m
Update: 5-day revisit
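Both indices named above are normalized band differences. The sketch below shows the arithmetic on single reflectance values, assuming the common Sentinel-2 band choices (B08 near-infrared, B04 red, B03 green) and the green/NIR definition of NDWI. Which bands and which NDWI variant PSE uses is not stated here, so treat these as assumptions.

```python
def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    denom = a + b
    if denom == 0:
        return float("nan")
    return (a - b) / denom


def ndvi(nir: float, red: float) -> float:
    """Vegetation index: high for dense green vegetation."""
    return normalized_difference(nir, red)


def ndwi(green: float, nir: float) -> float:
    """Water index (green/NIR variant, assumed): positive over open water."""
    return normalized_difference(green, nir)
```

In practice the same expressions would be applied to whole band arrays rather than scalars.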
OpenStreetMap
Infrastructure and Land Use
Global infrastructure data from the OpenStreetMap community via the Overpass API. PSE extracts power substations, transmission lines, road networks, settlement boundaries, and waterway density. Returns aggregate metrics per grid cell with full GeoJSON geometries stored in dataset attributes.
Resolution: Vector
Update: Community-maintained
World Bank
Socioeconomic Indicators
Country-level development indicators from the World Bank Open Data API. Covers electricity access rates, GDP per capita, renewable energy share, CO₂ emissions, population, and rural/urban electrification disparities. Annual values broadcast to spatial grids for integration with other sources.
Resolution: Country-level
Update: Annual
FUSION
The fusion engine
When a query spans multiple data sources, PSE does not simply pick one. It retrieves data from all relevant connectors in parallel, then fuses the results through a multi-step pipeline.
Spatial alignment
Sources arrive at different resolutions — ERA5 at 0.25°, Sentinel-2 at 10 m, World Bank at country level. PSE regrids everything to a common spatial grid using linear interpolation, with fallback broadcasting for sources with coarser resolution.
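To make the regridding step concrete, here is a dependency-free sketch of bilinear interpolation on a regular latitude/longitude grid, done as two passes of 1-D linear interpolation. The function names are hypothetical and edge values are clamped, which is an assumption. With xarray, the equivalent operation is typically `Dataset.interp(...)` with `method="linear"`.

```python
from bisect import bisect_right


def interp_1d(xs: list[float], ys: list[float], x: float) -> float:
    """Linearly interpolate samples (xs ascending) at x, clamping at the ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


def regrid(lats, lons, values, new_lats, new_lons):
    """Bilinear regrid of values[lat][lon] onto new_lats x new_lons."""
    # Pass 1: interpolate every source row along longitude.
    rows = [[interp_1d(lons, row, lon) for lon in new_lons] for row in values]
    # Pass 2: interpolate each new column along latitude.
    return [
        [interp_1d(lats, [r[j] for r in rows], lat) for j in range(len(new_lons))]
        for lat in new_lats
    ]
```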
Temporal alignment
Sources have different time axes — hourly weather, daily reanalysis, monthly climatology, annual statistics, static snapshots. PSE classifies each source’s temporal structure and aligns them to a common time axis appropriate for the query.
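One simple way to classify a source's temporal structure is by the median spacing of its timestamps. The sketch below is illustrative only: the function name, the category labels, and the thresholds are assumptions, not PSE's actual rules.

```python
from datetime import datetime
from statistics import median

HOUR = 3600.0
DAY = 86400.0


def classify_time_axis(times: list[datetime]) -> str:
    """Label a sorted time axis by its median step (thresholds are assumed)."""
    if len(times) < 2:
        return "static"  # a single snapshot has no temporal structure
    steps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    step = median(steps)
    if step <= 1.5 * HOUR:
        return "hourly"
    if step <= 1.5 * DAY:
        return "daily"
    if step <= 45 * DAY:
        return "monthly"
    return "annual"
```

Using the median rather than the mean keeps the label stable when a few timestamps are missing.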
Quality-weighted merge
When multiple sources provide the same variable (for example, both ERA5 and Open-Meteo provide temperature), PSE blends them using a per-source quality score computed as 0.7 × source reliability + 0.3 × data recency. Sources with higher scores contribute more to the fused value, so the result does not depend on any single source.
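The score formula can be written out directly. The sketch below assumes that reliability and recency are both scaled to [0, 1] and that the blend is a score-weighted mean. The function names and the normalization are assumptions; only the 0.7/0.3 weights come from the text.

```python
def quality_score(reliability: float, recency: float) -> float:
    """Quality score as stated: 0.7 x reliability + 0.3 x recency."""
    return 0.7 * reliability + 0.3 * recency


def fuse(values: dict[str, float], scores: dict[str, float]) -> float:
    """Score-weighted mean of one variable across sources (assumed blend rule)."""
    total = sum(scores[source] for source in values)
    if total == 0:
        raise ValueError("all quality scores are zero")
    return sum(values[source] * scores[source] for source in values) / total


# Illustrative inputs only; the reliability and recency numbers are made up.
scores = {
    "era5": quality_score(reliability=0.9, recency=0.5),
    "open_meteo": quality_score(reliability=0.8, recency=1.0),
}
blended = fuse({"era5": 20.0, "open_meteo": 22.0}, scores)
```

Because the weights are normalized, the blended value always lies between the lowest and highest source value.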
Conflict detection
PSE monitors the coefficient of variation across sources for each variable. When sources disagree significantly, the conflict is flagged in the response metadata so downstream applications can handle it appropriately.
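The coefficient of variation is the standard deviation divided by the absolute mean. A minimal sketch of the check follows; the 0.1 threshold and the shape of the returned metadata are assumptions for illustration.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values: list[float]) -> float:
    """Population standard deviation divided by the absolute mean."""
    m = mean(values)
    spread = pstdev(values)
    if m == 0:
        return float("inf") if spread > 0 else 0.0
    return spread / abs(m)


def detect_conflict(values: list[float], threshold: float = 0.1) -> dict:
    """Flag a variable when sources disagree beyond the (assumed) threshold."""
    cv = coefficient_of_variation(values)
    return {"cv": cv, "conflict": cv > threshold}
```

The coefficient of variation becomes unstable when the mean is near zero, so variables such as temperature are better compared on a scale without a zero crossing (for example Kelvin).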
Provenance tracking
Every fused dataset includes complete provenance metadata: which sources contributed, what transformations were applied, quality scores for each source, and the timestamp of the fusion operation. Any result can be traced back to its origins.
API
The API
PSE exposes all capabilities through a REST API built on FastAPI.
Health and Status
- /api/v1/health: Returns the operational status of every connector, including data freshness, lag time, available variables, and update frequency. Also reports cache statistics and database connectivity.
- /api/v1/sources: Lists all registered data connectors with their current status.
- /api/v1/sources/{id}/status: Detailed freshness and last-ingest information for a specific connector.
Data Queries
- /api/v1/point: Point query for a single latitude/longitude coordinate. Returns time-series values for requested variables from the best available sources.
- /api/v1/query: Bounding box query returning a gridded dataset as JSON. Specify the spatial extent, time range, variables, and target resolution.
- /api/v1/fuse: Explicit multi-source fusion request. Submit a structured request body specifying exactly which sources to fuse and how to weight them.
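As a rough illustration of a point query, the sketch below only builds a request URL and does not send it. The endpoint path and host come from this page. The `https` scheme, the HTTP method, and every parameter name (`lat`, `lon`, `variables`, `start`, `end`) are assumptions, so check the API documentation for the real schema.

```python
from urllib.parse import urlencode

BASE_URL = "https://pse.northflow.no"  # host from the Links section; scheme assumed


def point_query_url(lat: float, lon: float, variables: list[str],
                    start: str, end: str) -> str:
    """Build a URL for /api/v1/point (all parameter names are hypothetical)."""
    params = {
        "lat": lat,
        "lon": lon,
        "variables": ",".join(variables),
        "start": start,
        "end": end,
    }
    return f"{BASE_URL}/api/v1/point?{urlencode(params)}"


url = point_query_url(-1.29, 36.82, ["temperature", "wind_speed"],
                      "2024-01-01", "2024-01-07")
```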
Response format
All data responses return xarray-compatible JSON structures with coordinates (latitude, longitude, time), data variables, and attributes including full provenance metadata. Export to Zarr, NetCDF, GeoJSON, and CSV is supported.
STORAGE
Storage and caching
Zarr Store
All gridded array data is stored in Zarr format with daily partitioning, organized as {root}/{source_id}/{YYYY-MM-DD}/. Storage works on local disk or Amazon S3 via fsspec, making PSE cloud-ready without code changes.
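The partition layout maps directly onto a small path helper. This is a sketch with a hypothetical function name; the root can be a local directory or an S3-style URL, since both use forward slashes.

```python
import posixpath
from datetime import date


def partition_path(root: str, source_id: str, day: date) -> str:
    """Return {root}/{source_id}/{YYYY-MM-DD}/ for one daily partition."""
    return posixpath.join(root, source_id, day.isoformat()) + "/"
```

For example, `partition_path("s3://bucket/pse", "era5", date(2024, 3, 1))` gives `s3://bucket/pse/era5/2024-03-01/`, where the bucket name is made up for illustration.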
PostgreSQL/PostGIS
Point observations, ingest records, and data source status metadata are stored in PostgreSQL with the PostGIS spatial extension, enabling fast spatial queries and metadata lookups.
Intelligent Cache
PSE maintains an in-process TTL + LRU cache keyed by SHA-256 hashes of query parameters. Cache entries are source-aware, supporting prefix-sweep invalidation when a specific connector’s data is refreshed.
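The combination of TTL expiry, LRU eviction, hashed keys, and per-source invalidation can be sketched in a few dozen lines. The class below is an illustration under assumptions, not PSE's cache: the class and method names are hypothetical, and prefixing the SHA-256 digest with the source id is just one way to make a prefix sweep possible.

```python
import hashlib
import json
import time
from collections import OrderedDict


class QueryCache:
    """In-process cache with TTL expiry and LRU eviction (illustrative)."""

    def __init__(self, max_entries=256, ttl_seconds=3600.0, clock=time.monotonic):
        self._entries = OrderedDict()  # key -> (expires_at, value), oldest first
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock

    @staticmethod
    def make_key(source_id: str, params: dict) -> str:
        """Key = source prefix + SHA-256 of the canonical JSON of the params."""
        canonical = json.dumps(params, sort_keys=True).encode("utf-8")
        return f"{source_id}:{hashlib.sha256(canonical).hexdigest()}"

    def get(self, key: str):
        item = self._entries.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:  # TTL expired: drop the entry
            del self._entries[key]
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: str, value) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict least recently used

    def invalidate_source(self, source_id: str) -> int:
        """Prefix sweep: remove every entry that belongs to one connector."""
        prefix = f"{source_id}:"
        stale = [k for k in self._entries if k.startswith(prefix)]
        for k in stale:
            del self._entries[k]
        return len(stale)
```

Sorting the JSON keys before hashing means that two queries with the same parameters in a different order share one cache entry.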
COMMERCIAL ADAPTER
FLUX — First commercial adapter
FLUX is the first commercial application built on PSE. It provides AI-powered intelligence for renewable energy siting and development in developing countries.
FLUX queries PSE for solar irradiance, wind speed, temperature, terrain, land use, and grid infrastructure data, then runs the results through a pipeline of domain-specific models:
Solar PV yield modeling
Uses pvlib to model the full chain: solar position calculation, irradiance decomposition, plane-of-array transposition, cell temperature modeling, and system losses, ending in hourly energy output. Five panel types with P50/P90/P10 uncertainty bounds.
Wind energy yield modeling
Log-law hub-height extrapolation, Weibull distribution fitting, and cubic power curve modeling with cut-in, rated, and cut-out wind speeds. Four turbine models.
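Two of these steps reduce to short formulas. The log law scales a reference wind speed to hub height as v_hub = v_ref × ln(h_hub / z0) / ln(h_ref / z0), and a cubic power curve ramps output between the cut-in and rated speeds. The sketch below uses hypothetical function names and default values (roughness length, cut-in, rated, and cut-out speeds). FLUX is proprietary, so this is not its implementation.

```python
import math


def extrapolate_log_law(v_ref: float, h_ref: float, h_hub: float,
                        z0: float = 0.03) -> float:
    """Scale wind speed from h_ref to h_hub with roughness length z0 (metres)."""
    return v_ref * math.log(h_hub / z0) / math.log(h_ref / z0)


def power_output_kw(v: float, rated_kw: float, cut_in: float = 3.0,
                    rated: float = 12.0, cut_out: float = 25.0) -> float:
    """Cubic power curve: zero outside [cut_in, cut_out), flat above rated."""
    if v < cut_in or v >= cut_out:
        return 0.0
    if v >= rated:
        return rated_kw
    # Cubic ramp between cut-in and rated wind speed.
    return rated_kw * (v**3 - cut_in**3) / (rated**3 - cut_in**3)
```

Annual yield would then come from integrating this curve over a wind speed distribution such as the fitted Weibull.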
Grid infrastructure analysis
Identifies nearest substations from OpenStreetMap data, calculates connection distances, estimates required voltage levels based on project capacity, and models connection costs.
Financial modeling
Levelized Cost of Energy (LCOE), Internal Rate of Return (IRR) via Newton-Raphson method, Net Present Value (NPV), simple payback period, and debt service calculations with country-specific benchmarks.
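The core formulas are compact enough to sketch. NPV discounts each year's cash flow, IRR is the rate at which NPV is zero (found here by Newton-Raphson on the NPV function), and LCOE divides discounted lifetime costs by discounted lifetime energy. Function names, defaults, and the simplified constant-annual-values LCOE are assumptions, not FLUX's proprietary model.

```python
def npv(rate: float, cashflows: list[float]) -> float:
    """Net present value; cashflows[0] occurs at t = 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))


def irr(cashflows: list[float], guess: float = 0.1,
        tol: float = 1e-9, max_iter: int = 100) -> float:
    """Internal rate of return via Newton-Raphson on npv(rate) = 0."""
    rate = guess
    for _ in range(max_iter):
        value = npv(rate, cashflows)
        # Derivative of NPV with respect to the rate.
        slope = sum(-t * cf / (1 + rate) ** (t + 1) for t, cf in enumerate(cashflows))
        if slope == 0:
            raise ValueError("zero derivative; try a different guess")
        new_rate = rate - value / slope
        if abs(new_rate - rate) < tol:
            return new_rate
        rate = new_rate
    raise ValueError("IRR did not converge")


def lcoe(capex: float, annual_opex: float, annual_energy_mwh: float,
         rate: float, years: int) -> float:
    """Discounted lifetime cost divided by discounted lifetime energy."""
    discount = [(1 + rate) ** -t for t in range(1, years + 1)]
    costs = capex + sum(annual_opex * d for d in discount)
    energy = sum(annual_energy_mwh * d for d in discount)
    return costs / energy
```

Newton-Raphson can fail for cash flows with several sign changes, where more than one IRR exists, so a production implementation would usually add a bracketing fallback.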
Climate risk assessment
Evaluates six hazards (flood, cyclone, extreme heat, wind storm, drought, sea level rise) using elevation, latitude, and climate zone heuristics, producing a weighted composite risk score with mitigation recommendations.
Natural language explanation
Generates a structured narrative interpreting all scores, identifying the strongest and weakest dimensions, and producing a prioritized list of actionable recommendations.
Country configurations are live for Indonesia, Kenya, and Vietnam, with country-specific grid data, electricity tariffs, regulatory exclusion zones, and financial parameters.
FLUX is proprietary. PSE is open source. This is the model: the data infrastructure is free, the intelligence built on top of it is the product.
EXTENSIBILITY
Extending PSE
PSE is designed to be extended. Adding a new data source means implementing the BaseConnector abstract class — approximately 100 lines of Python defining how to fetch data, assess quality, and report freshness for your source.
The connector interface ensures that every new source automatically works with the fusion engine, the cache, the API, and every adapter built on PSE. A connector for ocean temperature data, once written, immediately becomes available to FLUX for renewable energy intelligence and to any future adapter built on PSE.
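A connector along these lines might look like the sketch below. Only the class name `BaseConnector` and the three responsibilities (fetch data, assess quality, report freshness) come from the text. The method names, signatures, and return types are hypothetical, and a plain dict stands in for the xarray Dataset a real connector would return.

```python
from abc import ABC, abstractmethod
from datetime import datetime, timedelta


class BaseConnector(ABC):
    """Hypothetical shape of the connector interface (signatures assumed)."""

    source_id: str = "base"

    @abstractmethod
    def fetch(self, bbox: tuple[float, float, float, float],
              start: datetime, end: datetime, variables: list[str]) -> dict:
        """Return data for the bounding box and time range."""

    @abstractmethod
    def assess_quality(self, data: dict) -> float:
        """Return a reliability score in [0, 1] for the fetched data."""

    @abstractmethod
    def freshness(self) -> timedelta:
        """Return the age of the most recent available observation."""


class OceanTempConnector(BaseConnector):
    """Stub for the ocean temperature example; values are placeholders."""

    source_id = "ocean_temp"

    def fetch(self, bbox, start, end, variables):
        # A real connector would call the upstream API here.
        return {name: [] for name in variables}

    def assess_quality(self, data):
        return 1.0 if data else 0.0

    def freshness(self):
        return timedelta(days=1)
```

Declaring the methods abstract means an incomplete connector fails at instantiation rather than midway through a fusion request.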
Priority connectors for future development
- AIS vessel tracking data (BarentsWatch, AISstream)
- CMIP6 climate model projections
- Air quality observations
- Ocean sensor networks
- Agricultural yield statistics
Contributions are welcome. The source code is available at github.com/northflowlabs/pse under the Apache 2.0 license. See the contributing guide for instructions on submitting new connectors.
SPECIFICATIONS
Technical summary
| Attribute | Detail |
|---|---|
| Language | Python 3.11+ |
| License | Apache 2.0 |
| Codebase | ~12,500 lines across 69 files |
| Tests | 200+ passing (unit + integration) |
| Data format | xarray Dataset (CF-conventions) |
| Storage | Zarr (local/S3) + PostgreSQL/PostGIS |
| API | FastAPI (async) |
| Connectors | ERA5, Open-Meteo, Global Solar Atlas, Sentinel-2, OpenStreetMap, World Bank |
| Fusion | Quality-weighted (0.7×reliability + 0.3×recency) |
| Ecosystem | Pangeo (Xarray, Dask, Zarr, fsspec) |
| Deployment | Docker Compose, Railway, or any cloud provider |
LINKS
Access PSE
Live API
The running PSE instance — health, sources, queries, and fusion.
pse.northflow.no
Source Code
Full source under Apache 2.0. Connectors, fusion engine, API, tests.
github.com/northflowlabs/pse
Documentation
Architecture guide, connector interface reference, deployment instructions.
github.com/northflowlabs/pse/docs