WeatherBlend

Multi-model forecast blending for Bonehill Rocks, Dartmoor

Rain models

Per-station P(wet ≥ 0.1 mm/h) classifiers in two variants: Phase 3a (lean, 27 features) and Phase 3c (rich, ~55 features). Both are trained against EA Hydrology gauges and scored with the Brier score, where lower is better. The Δ column compares the blend with the best single NWP on the same test slice; a negative value means the blend wins.
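As a minimal sketch of how the figures in the tables below relate to each other, the following computes a Brier score, picks the best single model, and derives Δ. It assumes Δ is the relative difference (blend − best) / best, which reproduces the listed percentages to within rounding of the three-decimal scores. The function names and the toy data are hypothetical and are not taken from the WeatherBlend code.

```python
from typing import Mapping, Sequence, Tuple


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def best_single(scores: Mapping[str, float]) -> Tuple[str, float]:
    """Return the (name, Brier) pair of the lowest-scoring single model."""
    name = min(scores, key=lambda k: scores[k])
    return name, scores[name]


def delta_vs_best(blend: float, best: float) -> float:
    """Relative change of the blend against the best single model, in percent.

    Assumed definition: 100 * (blend - best) / best. Negative means the blend wins.
    """
    return 100.0 * (blend - best) / best


# Toy data (hypothetical): four forecasts and whether the gauge was wet (1) or dry (0).
outcomes = [1, 0, 0, 1]
blend_probs = [0.8, 0.1, 0.2, 0.7]
single_probs = {
    "precip_a": [0.6, 0.4, 0.5, 0.5],
    "precip_b": [0.9, 0.6, 0.1, 0.3],
}

blend = brier_score(blend_probs, outcomes)
singles = {name: brier_score(p, outcomes) for name, p in single_probs.items()}
best_name, best = best_single(singles)
print(f"blend={blend:.3f}  best={best_name} ({best:.3f})  "
      f"delta={delta_vs_best(blend, best):+.1f}%")
```

Applied to the rounded +24h Bellever Phase 3a values, `delta_vs_best(0.127, 0.212)` gives roughly -40.1%, against the listed -40.2%. The small gap is what one would expect if the dashboard computes Δ from unrounded scores.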

Precipitation — Bellever Dartmoor

Phase 3a · v2026-04-28_232709

Lean P(wet ≥ 0.1 mm/h) classifier — six per-NWP precipitation rates and ensemble agreement. Trained 2026-04-28. Metric: Test Brier.
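The card names two kinds of input: per-NWP precipitation rates and an ensemble-agreement term. The sketch below shows one plausible way to build them. It rests on two assumptions: that the six NWPs are the six `precip_*` names appearing in the "Best single" columns on this page, and that agreement means the fraction of models at or above the 0.1 mm/h wet threshold. Neither is confirmed by the source, and the full 27-feature set is not reproduced here.

```python
from typing import Dict, List

WET_THRESHOLD_MM_H = 0.1  # wet threshold quoted in the card text

# Assumption: the six NWP inputs are the six names seen in the tables on this page.
NWP_COLUMNS = [
    "precip_ecmwf", "precip_gem", "precip_gfs",
    "precip_icon", "precip_jma", "precip_mf",
]


def ensemble_agreement(rates: Dict[str, float],
                       threshold: float = WET_THRESHOLD_MM_H) -> float:
    """Fraction of supplied NWP rates at or above the wet threshold (assumed definition)."""
    available = [rates[name] for name in NWP_COLUMNS if rates.get(name) is not None]
    if not available:
        raise ValueError("no NWP precipitation rates supplied")
    return sum(r >= threshold for r in available) / len(available)


def lean_features(rates: Dict[str, float]) -> List[float]:
    """Six raw rates followed by the agreement fraction (hypothetical layout)."""
    return [rates[name] for name in NWP_COLUMNS] + [ensemble_agreement(rates)]


# Hypothetical rates in mm/h for one station and one valid hour.
example = {
    "precip_ecmwf": 0.3, "precip_gem": 0.0, "precip_gfs": 0.15,
    "precip_icon": 0.05, "precip_jma": 0.1, "precip_mf": 0.0,
}
print(lean_features(example))
```

In this example three of the six models (ECMWF, GFS, JMA) reach the threshold, so the agreement term is 0.5.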

Lead     Blend    Best single             Δ vs best
+24h     0.127    precip_mf (0.212)       -40.2%
+48h     0.139    precip_ecmwf (0.246)    -43.2%
+72h     0.155    precip_ecmwf (0.266)    -41.8%
+96h     0.164    precip_jma (0.279)      -41.1%
+120h    0.181    precip_icon (0.319)     -43.3%

Verify history (1 run)

Twice-weekly Brier/MAE on the held-out rolling window: one row per verify run, with a drift flag in the last column. Metric: Brier. The Version column names the trained model that produced each row's numbers. A freshly retrained champion shows zero rows here for about 5 to 9 days (one verify cycle plus 5 days of ERA5 latency), so a row labelled with an older version is the previous lineage's history under the same phase.

Run (UTC)    Version                                   +24h     +48h     +72h     Drift
             v2026-04-23_071842, v2026-04-27_175435    0.000    0.017    0.002

Phase 3c · v2026-04-28_232840_phase3c

Rich P(wet) classifier — adds per-NWP cloud, humidity, CAPE, dew-point depression with feature-importance pruning (~55 features). Trained 2026-04-28. Metric: Test Brier.

Lead     Blend    Best single             Δ vs best
+24h     0.120    precip_mf (0.212)       -43.6%
+48h     0.137    precip_ecmwf (0.246)    -44.0%
+72h     0.149    precip_ecmwf (0.266)    -43.8%
+96h     0.159    precip_jma (0.279)      -42.8%
+120h    0.173    precip_icon (0.319)     -45.8%

Verify history (1 run)

Run (UTC)    Version                       +24h     +48h     +72h     Drift
             v2026-04-23_154405_phase3c    0.000    0.000    0.001

Precipitation — Bovey Tracey

Phase 3a · v2026-05-03_233357

Lean P(wet ≥ 0.1 mm/h) classifier — six per-NWP precipitation rates and ensemble agreement. Trained 2026-05-03. Metric: Test Brier.

Lead     Blend    Best single             Δ vs best
+24h     0.102    precip_icon (0.176)     -42.3%
+48h     0.117    precip_mf (0.194)       -39.4%
+72h     0.130    precip_gfs (0.224)      -42.1%
+96h     0.136    precip_icon (0.236)     -42.4%
+120h    0.153    precip_icon (0.297)     -48.3%

Verify history (no runs yet)

No verify rows on disk match this card's phase (3a). Either the next verify (twice-weekly Mon + Thu 09:30 UTC, then 5d ERA5 latency) hasn't yet scored predictions made by this version, or older verify files used a different phase tag for this lineage. Re-check after the next Mon/Thu cycle.
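To estimate when a freshly trained version should first appear in the verify history, one can combine the Mon/Thu 09:30 UTC schedule with the 5-day ERA5 latency. The sketch below assumes the first scoring run is the earliest scheduled run that falls at least five days after training, and that the timestamp in the version string is the UTC training time. Both are interpretations rather than confirmed behaviour, and `first_scoring_run` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

VERIFY_WEEKDAYS = {0, 3}             # Monday, Thursday (datetime.weekday() numbering)
VERIFY_HOUR, VERIFY_MINUTE = 9, 30   # 09:30 UTC, as stated in the card text
ERA5_LATENCY = timedelta(days=5)     # latency quoted in the card text


def first_scoring_run(trained_at: datetime) -> datetime:
    """Earliest Mon/Thu 09:30 UTC run at least ERA5_LATENCY after training.

    `trained_at` must be timezone-aware. The rule itself is an assumption.
    """
    earliest = trained_at.astimezone(timezone.utc) + ERA5_LATENCY
    candidate = earliest.replace(
        hour=VERIFY_HOUR, minute=VERIFY_MINUTE, second=0, microsecond=0
    )
    if candidate < earliest:
        candidate += timedelta(days=1)      # 09:30 already passed on that day
    while candidate.weekday() not in VERIFY_WEEKDAYS:
        candidate += timedelta(days=1)      # roll forward to the next Mon or Thu
    return candidate


# Version v2026-05-03_233357, read as 2026-05-03 23:33:57 UTC (assumed).
trained = datetime(2026, 5, 3, 23, 33, 57, tzinfo=timezone.utc)
run = first_scoring_run(trained)
print(run.isoformat(), "wait:", run - trained)
# -> 2026-05-11T09:30:00+00:00 wait: 7 days, 9:56:03
```

Under this rule the wait is always at least 5 days and less than 9, since two consecutive scheduled runs are never more than 4 days apart. That matches the 5 to 9 day window quoted elsewhere on the page.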

Phase 3c · v2026-05-03_233604_phase3c

Rich P(wet) classifier — adds per-NWP cloud, humidity, CAPE, dew-point depression with feature-importance pruning (~55 features). Trained 2026-05-03. Metric: Test Brier.

Lead     Blend    Best single             Δ vs best
+24h     0.097    precip_icon (0.176)     -45.0%
+48h     0.115    precip_mf (0.194)       -40.7%
+72h     0.127    precip_gfs (0.224)      -43.5%
+96h     0.132    precip_icon (0.236)     -43.9%
+120h    0.143    precip_icon (0.297)     -51.8%

Verify history (no runs yet)

No verify rows on disk match this card's phase (3c). Either the next verify (twice-weekly Mon + Thu 09:30 UTC, then 5d ERA5 latency) hasn't yet scored predictions made by this version, or older verify files used a different phase tag for this lineage. Re-check after the next Mon/Thu cycle.

Precipitation — Dartmoor Nr Hexworthy

Phase 3a · v2026-04-28_232809

Lean P(wet ≥ 0.1 mm/h) classifier — six per-NWP precipitation rates and ensemble agreement. Trained 2026-04-28. Metric: Test Brier.

Lead     Blend    Best single             Δ vs best
+24h     0.141    precip_ecmwf (0.227)    -37.8%
+48h     0.156    precip_ecmwf (0.257)    -39.4%
+72h     0.166    precip_ecmwf (0.277)    -40.2%
+96h     0.175    precip_icon (0.290)     -39.9%
+120h    0.189    precip_gem (0.346)      -45.3%

Verify history (1 run)

Run (UTC)    Version                                                       +24h     +48h     +72h     Drift
             v2026-04-23_163848, v2026-04-26_085202, v2026-04-26_184501    0.002    0.007    0.000

Phase 3c · v2026-04-28_232945_phase3c

Rich P(wet) classifier — adds per-NWP cloud, humidity, CAPE, dew-point depression with feature-importance pruning (~55 features). Trained 2026-04-28. Metric: Test Brier.

Lead     Blend    Best single             Δ vs best
+24h     0.136    precip_ecmwf (0.227)    -40.3%
+48h     0.152    precip_ecmwf (0.257)    -40.6%
+72h     0.165    precip_ecmwf (0.277)    -40.5%
+96h     0.175    precip_icon (0.290)     -39.8%
+120h    0.190    precip_gem (0.346)      -45.1%

Verify history (1 run)

Run (UTC)    Version                       +24h     +48h     +72h     Drift
             v2026-04-23_154459_phase3c    0.000    0.001    0.001