Franzen · Working Console

Geographic Intelligence Platform

A governed geographic data platform — not a database. Nothing enters the trusted system without source, date, owner, validity, currency, and use restrictions, enforced by the schema itself.

Scaffold Built AI Gate Proven 3,149 Counties (incl. CT) Live On Supabase
← Executive Summary Knowledge Base (Searchable) Project Log White Paper (PDF)
3,149
Counties In The Spine
84,412
Tracts Crosswalked
5
Reference Topics Loaded
11 / 11
Validation Checks Pass
5 / 5
AI-Gate Tests Pass
01

What We're Building

A trusted geographic intelligence platform that collects, validates, documents, and serves geographic data — built so trust is structural, not a matter of discipline.

Nothing is considered trusted merely because it loaded successfully.

The platform exists to answer geographic questions with data you can stand behind: every number carries its source, its date, its owner, and its rules of use. It serves chapters, disaster operations, field intelligence, county and census-tract analysis, dashboards, maps, AI assistants, and executive reports — all reading from one governed spine instead of a pile of disconnected spreadsheets.

Priority 1

Accuracy & Validity

Is the data correct? Bad rows are rejected at the database, not hoped against.

Priority 2

Currency & Freshness

How current is it? Every source has a refresh schedule and a last-checked date.

Priority 3

Latency & Speed

How fast can we use it? Speed matters — but trust matters more.

Who It Serves

Chapters · disaster operations · field intelligence · county intelligence · census-tract intelligence · dashboards · maps · AI assistants · executive reports.

The Big Insight

We are not trying to collect everything. We are building a system that knows what data exists, where it came from, how current it is, how trustworthy it is, what geography it applies to, who owns it, how it can be used, whether AI can touch it, how to update it, and how to explain it.

02

How We're Building It

A layered architecture where raw files flow toward trusted answers, and only an approved, AI-safe layer is ever exposed.

Diagram A · The Four-Layer Pipeline
Layer 1
Raw Archive
Original files exactly as received. Never edited. The audit trail.
Layer 2
Clean Data
Standardized Parquet / GeoParquet. Typed, deduped, documented.
Layer 3
Database
Postgres + PostGIS. The canonical, constraint-enforced store.
Layer 4 · Gated
AI / API / Map
Claude, Codex, dashboards, Vercel, ArcGIS, reports — read-only, approved fields only.
The dashed box is a gate, not a layer. The AI/API/map layer can only read what governance has approved for it — it cannot reach the raw or canonical tables.
Diagram B · Three Roles, One Rule
Builds
Claude Code
Creates schema, ingestion scripts, tests, docs, maps, field catalogs.
Challenges
Codex
Adversarial reviewer: bad joins, stale data, exposure risks, silent failures.
Approves
You · Gatekeeper
Source authority, sensitivity, public/internal use, final publishing.
Core rule both agents obey: Nothing enters canonical without provenance · nothing changes without validation · nothing is deleted without history · nothing internal is exposed without approval.
Diagram C · The Governance Lifecycle Of One Source
Step 1
Intake Form
Owner, sensitivity, join key, limitations captured before any load.
Step 2
Source Registry
A permanent provenance record. Every dataset gets one row.
Step 3
Ingest + Log
Load logged: found / loaded / changed / rejected, raw & clean paths.
Step 4
Validate
Tests run; results written. Failures flagged for review.
Step 5
Serve
Only fields marked approved-for-AI reach the serving layer.
03

Where The Data Lives

Four Postgres schemas separate concerns by trust level. Local for development and archive; one cloud Postgres as canonical.

Diagram D · Four Schemas, One Trust Boundary
gov
  • source_registry
  • field_catalog
  • ingestion_log
  • validation_results
  • dataset_inventory
Provenance, catalog, logs, validation.
geo
  • county_boundary
  • geography_assignment
  • tract_county_crosswalk
  • geography_spine (view)
  • geography_at() (fn)
The spine + crosswalks.
topic
  • fema_mobile_homes_county
  • census_demographics…
  • nri_risk_county…
  • svi_tract…
  • (joined by FIPS/GEOID)
Subject tables. No duplicated geography.
serving · AI-SAFE
  • geography_current
  • fema_mobile_homes_county
  • mobile_homes_with_chapter
  • approved views only
The only thing the AI/API role may read.
The serving schema (red) is the wall — and it's wired. The ai_reader role is granted serving only and revoked from the rest, so approved_for_ai is now an enforced gate, not a label. Proven: ai_reader reads serving.* and is denied every base table (make secure, 5/5 pass).
Diagram E · The Geography Spine — Why It's Two Tables
Stable
geo.county_boundary
One row per county. The polygon. Shapes don't change on realignment.
+
Temporal · SCD-2
geo.geography_assignment
Which chapter / region / division, with effective dates. Changes with Project Keystone.
Rebuilt As
geo.geography_spine (view)
The manual's one spine — with no duplicated polygons and no flag that can lie.
Historical correctness by construction: geography_at(fips, date) returns the assignment valid on the event date. A 2023 disaster resolves to the chapter that owned the county then, not now — validation concern #15, fixed in the schema rather than left to memory.

Local vs. Cloud

Local Machine

Development & Archive

Raw archives, DuckDB analytics, testing, experiments, backup copy. Originals are preserved exactly as received and kept out of git.

Cloud · Recommended

Supabase Postgres + PostGIS

App data, APIs, Vercel apps, shared query access, AI access. Matches the existing stack (Keystone, Final Mile, Ask Clara). Canonical store; DuckDB stays local for analytics. Awaiting your sign-off — see Decisions.

04

Progress

The scaffold is built and proven against a live database. Every architecture choice is logged and reversible.

Proven

Schema Applies Clean

All five SQL files run end-to-end on Postgres 16 + PostGIS 3.4.

Proven

Guards Reject Bad Data

Integer-coerced FIPS, a second "current" assignment, and orphan topic rows are all refused by the database.

Proven

History Resolves Right

A 2023 event joins to the old chapter; current joins to the new one.

Proven · New

The AI Gate Holds

ai_reader reads the serving layer and is denied every base table. 5/5 pass via make secure.

+ D-001 · Split geography spine into boundary + assignment ACCEPTED

The manual specifies one spine table holding polygon, assignment, dates, and a flag. We split it into geo.county_boundary (stable polygon) and geo.geography_assignment (temporal, SCD-2), and rebuilt the manual's spine as a view. Why: county shape is stable while chapter assignment changes — one table duplicates the full polygon on every realignment. Reversible on request.

+ D-002 · No stored current_flag ACCEPTED

is_current is derived (effective_end IS NULL) in the view; a unique partial index enforces one open assignment per county. Why: a stored flag can silently contradict the dates.

+ D-003 · Tract crosswalk + geography_at() ACCEPTED

Added geo.tract_county_crosswalk and geo.geography_at(fips, date). Why: the spine is county-grain; tract topic tables had no bridge. geography_at() returns the assignment valid at an event date — fixing the "current geography applied to historical events" concern by construction.

+ D-004 · FIPS and GEOID stored as text ACCEPTED

county_fips char(5), tract_geoid char(11), with regex + prefix CHECK constraints. Why: integer storage drops the leading zero (01001 → 1001). Verified: a malformed FIPS insert is rejected.

+ D-005 · One canonical cloud Postgres ACCEPTED

Decision: Supabase canonical (PostGIS, matches the existing stack), DuckDB local for analytics, GitHub for schema/metadata. Neon and Railway dropped. The schema runs unchanged on any of them.

+ D-006 · ai_reader role isolation ACCEPTED · PROVEN

Built and proven. The ai_reader role (NOLOGIN group) is granted USAGE + SELECT on serving only and revoked from gov/geo/topic; serving views expose data via owner's rights. tests/ai_reader_isolation.sql confirms on Postgres 16 / PostGIS 3.4 that ai_reader reads serving.* and is denied all four base tables — 5/5 pass. On Supabase: GRANT ai_reader TO anon, authenticated; (never service_role).

+ D-007 · Canonical geometry source ACCEPTED

Decision: PostGIS canonical, GeoParquet archive-only. No dual-write validation test needed, since Parquet is not a competing source of truth.

+ D-008 · Owner's manual is the in-repo canonical spec ACCEPTED

The specification now lives in the repo at docs/OWNERS_MANUAL.md with a provenance header, sourced from the "Data Project" Google Doc. The build narrative lives in docs/origins/. Why: a governed platform keeps its own spec under version control.

+ D-009 · Universal geography vs. optional Red Cross overlay ACCEPTED

The universal layer (county_boundary, tracts, topic.*, geometry) carries no chapter/region/division and serves any project. The Red Cross hierarchy lives only in geo.geography_assignment (temporal), joined on demand. Why: chapter assignment is RC-specific, has quirks (Fairfield County → ARC of Metro NY North), and changes over time — non-RC projects shouldn't see or pay for it. Result: two serving views — serving.county_reference (chapter-free) and serving.county_intelligence (with chapter).

+ D-010 · Hierarchy vintages dated to the RC fiscal-year boundary (July 1) ACCEPTED

Red Cross geography changes take effect on the fiscal-year boundary, July 1 (FY27 = 2026-07-01). The ingestion dates each hierarchy vintage to that boundary, not the upstream file's build date. Why: the CA realignment + SW Florida rename are effective when the fiscal year turns, not when the CSV was generated.

+ D-011 · Never trust a source's embedded chapter; re-enrich by FIPS ACCEPTED

Red Cross ArcGIS / operational sources are often behind and carry stale chapter/region/division values. The platform never stores a source's chapter — it derives chapter from the authoritative spine by county_fips (geography_at() for historical correctness). Topic tables structurally forbid a chapter column (D-001). Why: the source of truth for hierarchy is the governed spine, not whatever a layer embedded when it was last updated.

+ D-012 · Connecticut representation ACCEPTED · BUILT

CT abolished county government; the Census adopted 9 planning regions as county-equivalents (FIPS 09110–09190). Decision (revised by research): represent CT as the 9 planning regions, not the legacy 8 counties — 2022+ CT tract GEOIDs already embed the region code, so adding 9 rows to the spine rolls up all 884 CT tracts with no crosswalk. Geometry from Census 2024. Chapter mapping is approximate (regions don't align to RC chapter lines): Greater Bridgeport + Western Connecticut → ARC of Metro NY North (preserving the Fairfield→NY rule), other 7 → ARC of Connecticut. Legacy 8 kept only for pre-2022 data (OpenFEMA declarations). Built & verified: 884 tracts roll up, rejects 1,868 → 984.

+ D-013 · First-load effective-dating uses a baseline sentinel ACCEPTED · BUILT

A naive first load stamped every county effective_start = 2026-07-01, which made geography_at() return nothing for pre-FY27 dates — silently breaking the historical join the split spine exists to protect. Fix: unchanged counties get 1900-01-01 (baseline_sentinel, honest "unknown start"); the ~64 FY27-changed counties + 9 CT regions get 2026-07-01 (fy_boundary). New effective_start_basis column + a btree_gist EXCLUDE constraint banning overlapping spans. Verified: 3,076 baseline + 73 fy_boundary; a 2020 lookup resolves correctly.

+ D-014 · Source staleness is quarantined + reported, not trusted ACCEPTED · BUILT

RC ArcGIS/operational sources lag and ship stale embedded chapter/region/division. A source's claimed hierarchy may land ONLY in gov.source_assignment_observation (audit-only) — never a topic/serving table; chapter is always re-derived from the spine by county_fips. gov.source_vs_canonical_mismatch compares each claim to the assignment valid at the source's own vintage (stale vs. genuine conflict). Validation check #11 fails the load if any topic.* table grows a chapter/region/division column.

05

The Path

The first build sequence from the owner's manual, with the deviations baked in. Green is done; red is where we are now.

Create repo structure & ordered schemas

Four schemas, five SQL files, validation suite, intake form, Makefile.

DONE

Prove it on a live database

Applied clean on Postgres 16 + PostGIS 3.4; guards reject bad data; history resolves correctly.

DONE

Resolve decisions & wire the ai_reader gate

D-005 Supabase · D-007 PostGIS · D-006 built & proven — the serving wall is now enforced, not advisory.

DONE

Ingest the master geography from red-cross-data

3,140 county boundaries + a dated hierarchy vintage (FY27, eff. 2026-07-01) loaded from the canonical upstream repo. Leading-zero FIPS + chapter codes preserved.

DONE

Load the tract → county crosswalk

83,528 tracts loaded; 1,868 correctly rejected (CT planning regions, AK boroughs, PR) and logged — not force-loaded.

DONE

Load reference topics + serving views

ALICE · SVI · NRI · FEMA declarations + mobile homes, each per-source with provenance. serving.county_intelligence + chapter-free serving.county_reference. 10/10 validation, AI gate 5/5.

DONE

Deploy the governed schema to Supabase

Schema + AI gate applied to the "Geographic Intelligence Platform" project (additive; existing tables untouched). Gate verified holding.

DONE

Connecticut + effective-dating + staleness governance

9 CT planning regions added (884 tracts roll up); first-load baseline-sentinel fix; source-staleness quarantine + checks. Supabase schema synced to match.

DONE

Load the data into Supabase

Full platform loaded into the "Geographic Intelligence Platform" project — 3,149 counties, 84,412 tracts, 5 topics. Re-verified on Supabase: validation assertions all clean, AI gate holds.

DONE
10

Hand the repo to Codex for adversarial review

Against a committed schema + a real load (now satisfied) — only then add the next source.

YOU ARE HERE
06

Decisions Needed

All three gatekeeper calls are resolved (2026-06-25). The next move is the first real data load.

IDDecisionResolutionStatus
D-005Which cloud Postgres is canonical?Supabase canonical + DuckDB local; Neon/Railway dropped.RESOLVED
D-006Enforce the serving wall with ai_reader?Built + proven — grants serving only, revoked from the rest.RESOLVED
D-007Which store is canonical for geometry?PostGIS canonical, GeoParquet archive-only.RESOLVED

Next up: add Connecticut to the spine and load the data into Supabase (needs the project connection string). See Working Notes below.

07

White Paper

The full thinking behind the platform — the trust model, the architecture, and the governance contract — in one printable document.

A Governed Geographic Intelligence Platform

Why trust must be structural, how the four-layer architecture enforces it, and the path from raw files to AI-safe answers.

Read The White Paper (PDF) View The Repository