Data Status & Transparency
Data sources, freshness, pipeline status, and known limitations
2 Critical Limitations
FEMA FIMA NFIP Redacted Claims
National Flood Insurance Program claims data, redacted for public release. Derived from NFIP system of record.
NOAA Billion-Dollar Weather Disasters
Curated NOAA/NCEI research product. Annual versioned snapshot, not a real-time feed.
Market Analytics Data
Synthetic demonstration data generated at runtime. Does not represent real insurance market statistics.
Mixed Data Sources Warning
Some views may include synthetic market data for demonstration. Do not treat synthetic values as real market statistics.
When data was last ingested and expected update schedules
Public source data is redacted before ingestion. No policyholder names, addresses, or policy numbers are ingested from FEMA claims data. Per FEMA documentation, the FEMA FIMA NFIP dataset is derived from the NFIP system of record and is redacted at the source.
Ingest Controls Not Authenticated
These endpoints are currently accessible without authentication. Treat them as admin/dev only until authentication and rate limiting are implemented.
Stages grayed out are defined but not yet active in the current build
Shows which tables are populated by current pipelines
| Table | Source | Status | Notes |
|---|---|---|---|
| claims | FEMA NFIP | Populated | Core claim fields populated from FEMA API |
| disaster_events | NOAA | Partial | Most fields populated. Missing: states (always empty), insured_losses, fema_declaration_id |
| market_data | Generated | Synthetic | All values are synthetically generated for demonstration |
| pipeline_runs | System | Populated | Tracks all ingestion operations |
| claim_payments | None | Unused | Schema exists but no ingestion populates it |
The parseStates() function exists but is not called during ingestion. Disaster events cannot be correlated to specific states.
Affects: Disaster correlation
The enrichClaimsWithDisasterCorrelation() function exists but has no API endpoint to trigger it.
Affects: Claims-to-disaster linking
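To illustrate why the empty states field blocks claims-to-disaster linking, here is a minimal sketch of one plausible matching rule: a claim matches an event when its date of loss falls inside the event window and its state is listed for the event. This is not the project's correlateClaimToDisaster() or enrichClaimsWithDisasterCorrelation(); the function name, the record shapes, and the rule itself are assumptions made for illustration.

```typescript
// Hypothetical shapes; dates are assumed to be YYYY-MM-DD strings.
interface ClaimRef {
  state: string;
  dateOfLoss: string;
}

interface DisasterRef {
  id: string;
  startDate: string;
  endDate: string;
  states: string[];
}

// Hypothetical matching rule, NOT the project's actual correlation logic.
function findMatchingDisaster(
  claim: ClaimRef,
  events: DisasterRef[]
): string | null {
  for (const event of events) {
    // YYYY-MM-DD strings sort chronologically, so string comparison is enough.
    const inWindow =
      claim.dateOfLoss >= event.startDate && claim.dateOfLoss <= event.endDate;
    const inState = event.states.indexOf(claim.state) !== -1;
    if (inWindow && inState) {
      return event.id;
    }
  }
  return null;
}
```

Under a rule like this, an event whose states list is empty can never match a claim, which is consistent with the limitation described above.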
The claim_payments schema exists but no ingestion pipeline populates it.
Affects: Payment tracking
Iowa market data is generated using Math.random() and does not represent real insurance statistics.
Affects: Market analytics
Validation and normalization utilities in lib/transforms/ exist but may not be applied in all ingestion flows.
Affects: Data quality
The /api/ingest/* routes do not require authentication. Access should be restricted.
Affects: Security
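As a rough sketch of how access to the /api/ingest/* routes could be restricted, the guard below checks a bearer token before an ingest handler runs. Everything here is an assumption for illustration: the function names, the use of the Authorization header, the idea of a single shared token, and the status codes. It does not cover rate limiting.

```typescript
// Hypothetical guard; not part of the current build.
type HeaderMap = Record<string, string | undefined>;

interface GuardResult {
  allowed: boolean;
  status: number;
}

// Compares two strings without exiting early on the first mismatch.
function constantTimeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const n = Math.max(a.length, b.length);
  for (let i = 0; i < n; i++) {
    // charCodeAt returns NaN past the end of a string; treat that as 0.
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}

function checkIngestAuth(
  headers: HeaderMap,
  expectedToken: string | undefined
): GuardResult {
  // Fail closed: with no token configured, every request is rejected.
  if (!expectedToken) {
    return { allowed: false, status: 503 };
  }
  const header = headers["authorization"] ?? "";
  const prefix = "Bearer ";
  if (header.slice(0, prefix.length) !== prefix) {
    return { allowed: false, status: 401 };
  }
  const presented = header.slice(prefix.length);
  return constantTimeEqual(presented, expectedToken)
    ? { allowed: true, status: 200 }
    : { allowed: false, status: 403 };
}
```

A route handler would call checkIngestAuth first and return the given status when allowed is false. Failing closed when no token is configured keeps a misconfigured deployment from exposing the endpoints.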
Source: FEMA NFIP API
| Field | Origin | Source / Transform |
|---|---|---|
| state (State) | Normalized | Source: state; Transform: substring(0,2).toUpperCase() |
| county (County) | Raw | Source: reportedCity |
| dateOfLoss (Date of Loss) | Normalized | Source: dateOfLoss; Transform: split('T')[0] |
| floodZone (Flood Zone) | Raw | Source: floodZone |
| buildingDamage (Building Damage) | Raw | Source: amountPaidOnBuildingClaim |
| contentsDamage (Contents Damage) | Raw | Source: amountPaidOnContentsClaim |
| totalPayout (Total Payout) | Calculated | Calc: amountPaidOnBuildingClaim + amountPaidOnContentsClaim |
| occupancyType (Occupancy Type) | Normalized | Source: occupancyType (numeric); Transform: parseOccupancyType() → label |
| yearBuilt (Year Built) | Normalized | Source: originalConstructionDate; Transform: parseYear() → first 4-digit year |
| disasterEventId (Linked Disaster) | Enriched | Transform: correlateClaimToDisaster() (NOT ACTIVE) |
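The three transforms in the table that are spelled out in full (state, dateOfLoss, totalPayout) can be sketched as below. The raw field names come from the Source column. The types, the function name normalizeClaim, and the handling of missing payment amounts as 0 are assumptions, and the project's actual transform code may differ.

```typescript
// Raw record: field names follow the table; the types are assumed.
interface RawNfipClaim {
  state: string;
  dateOfLoss: string; // assumed ISO timestamp such as "2019-06-01T00:00:00.000Z"
  amountPaidOnBuildingClaim: number | null;
  amountPaidOnContentsClaim: number | null;
}

interface NormalizedClaim {
  state: string;
  dateOfLoss: string;
  buildingDamage: number;
  contentsDamage: number;
  totalPayout: number;
}

// Hypothetical helper for illustration only.
function normalizeClaim(raw: RawNfipClaim): NormalizedClaim {
  // Assumption: a missing payment amount counts as 0.
  const buildingDamage = raw.amountPaidOnBuildingClaim ?? 0;
  const contentsDamage = raw.amountPaidOnContentsClaim ?? 0;
  return {
    // First two characters, upper-cased.
    state: raw.state.substring(0, 2).toUpperCase(),
    // Keep only the date part before the "T".
    dateOfLoss: raw.dateOfLoss.split("T")[0] ?? "",
    buildingDamage,
    contentsDamage,
    totalPayout: buildingDamage + contentsDamage,
  };
}
```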
Source: NOAA CSV
| Field | Origin | Source / Transform |
|---|---|---|
| eventType (Event Type) | Normalized | Source: Disaster; Transform: normalizeEventType() |
| name (Event Name) | Raw | Source: Name |
| startDate (Start Date) | Normalized | Source: Begin Date; Transform: parseDate() (accepts YYYYMMDD or MM/DD/YYYY) |
| endDate (End Date) | Normalized | Source: End Date; Transform: parseDate() |
| states (Affected States) | Raw | Source: States column; Transform: parseStates() (NOT CALLED, always empty) |
| totalCostBillions (Total Cost, Billions) | Calculated | Source: CPI-Adjusted Cost (millions); Calc: value / 1000 |
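The date handling and the cost conversion in this table can be sketched as follows. This is a hypothetical re-implementation, not the project's parseDate(): the function names, the YYYY-MM-DD output format, and the null return for unrecognized input are all assumptions.

```typescript
// Left-pads a one-digit month or day with a zero.
function twoDigits(value: string): string {
  return ("0" + value).slice(-2);
}

// Hypothetical sketch: accepts YYYYMMDD or MM/DD/YYYY, returns YYYY-MM-DD.
function parseDateSketch(value: string): string | null {
  const trimmed = value.trim();

  // Compact form: YYYYMMDD
  const compact = /^(\d{4})(\d{2})(\d{2})$/.exec(trimmed);
  if (compact) {
    const [, year = "", month = "", day = ""] = compact;
    return year + "-" + month + "-" + day;
  }

  // US form: MM/DD/YYYY (one- or two-digit month and day)
  const us = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/.exec(trimmed);
  if (us) {
    const [, month = "", day = "", year = ""] = us;
    return year + "-" + twoDigits(month) + "-" + twoDigits(day);
  }

  return null; // unrecognized format (assumed behavior)
}

// CPI-adjusted cost arrives in millions; the table stores billions.
function millionsToBillions(costMillions: number): number {
  return costMillions / 1000;
}
```

The sketch only checks the shape of the input. It does not validate that the month and day form a real calendar date.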
Source: Generated (not real data)
| Field | Origin | Source / Transform |
|---|---|---|
| company (Company) | Generated | Transform: Random selection from SAMPLE_COMPANIES |
| state (State) | Generated | Transform: Random selection from STATES |
| premiumsWritten (Premiums Written) | Generated | Calc: basePremium × yearFactor × stateFactor (Math.random()) |
| lossesPaid (Losses Paid) | Generated | Calc: premiumsWritten × lossRatioBase (Math.random()) |
| lossRatio (Loss Ratio) | Generated | Calc: 0.5 + Math.random() × 0.3 |
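To make clear why these values cannot be read as market statistics, here is a minimal sketch of a generator that follows the formulas in the table. Only the lossRatio formula is taken directly from the table. The list contents, the basePremium value, the ranges for yearFactor and stateFactor, and the choice to reuse lossRatio as lossRatioBase are assumptions, as is the injectable random source (added so the sketch can be tested deterministically).

```typescript
type Rng = () => number; // returns a value in [0, 1), like Math.random

interface SyntheticMarketRow {
  company: string;
  state: string;
  premiumsWritten: number;
  lossesPaid: number;
  lossRatio: number;
}

// Placeholder lists; the real SAMPLE_COMPANIES and STATES are not shown here.
const SAMPLE_COMPANIES: readonly string[] = ["Example Mutual", "Sample Insurance Co"];
const STATES: readonly string[] = ["IA", "NE"];

function pick<T>(items: readonly T[], rng: Rng): T {
  return items[Math.floor(rng() * items.length)] as T;
}

// Hypothetical generator for illustration only.
function generateRow(rng: Rng = Math.random): SyntheticMarketRow {
  const company = pick(SAMPLE_COMPANIES, rng);
  const state = pick(STATES, rng);
  const basePremium = 1000000; // assumed constant
  const yearFactor = 1 + rng() * 0.1; // assumed range
  const stateFactor = 0.8 + rng() * 0.4; // assumed range
  const premiumsWritten = basePremium * yearFactor * stateFactor;
  const lossRatio = 0.5 + rng() * 0.3; // from the table: always in [0.5, 0.8)
  const lossesPaid = premiumsWritten * lossRatio; // assumes lossRatioBase equals lossRatio
  return { company, state, premiumsWritten, lossesPaid, lossRatio };
}
```

Whatever the exact factors are, every loss ratio lands between 0.5 and 0.8 by construction, so any apparent pattern in the market views reflects the generator and not the insurance market.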