Phase 2b · v2026-04-28_232613
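Lean blender: inputs are six per-NWP temperatures, their mean, standard deviation, and range, plus cyclical hour and day-of-year encodings (~13 features). Trained 2026-04-28. Metric: Test MAE (°C). In the Δ column, a negative value means the blend beats the best single model.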
| Lead | Blend | Best single | Δ vs best |
|---|---|---|---|
| +24h | 0.533 | temp_aifs (0.563) | -5.3% |
| +48h | 0.642 | temp_aifs (0.661) | -2.8% |
| +72h | 0.792 | temp_aifs (0.829) | -4.4% |
| +96h | 0.963 | temp_aifs (0.943) | +2.1% |
| +120h | 1.129 | temp_aifs (1.140) | -1.0% |
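The feature set and the Δ column described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code. It assumes sin/cos pairs for the cyclical encodings (which gives 6 + 3 + 4 = 13 features), a population standard deviation, and a 365.25-day year. Only `temp_aifs` is a name taken from the table; the other five model names, the feature names, the function names, and the temperature values are hypothetical placeholders.

```python
import math
from datetime import datetime, timezone
from statistics import fmean, pstdev


def blender_features(temps: dict[str, float], valid_time: datetime) -> dict[str, float]:
    """Build the blender inputs for one forecast valid time (illustrative sketch)."""
    values = list(temps.values())
    features = dict(temps)  # per-NWP temperatures, passed through unchanged

    # Ensemble summary statistics across the NWP models.
    features["temp_mean"] = fmean(values)
    features["temp_std"] = pstdev(values)  # assumption: population std, not sample
    features["temp_range"] = max(values) - min(values)

    # Cyclical encodings: map hour and day of year onto the unit circle so that
    # 23:00 sits next to 00:00 and 31 December sits next to 1 January.
    hour_angle = 2 * math.pi * valid_time.hour / 24
    doy_angle = 2 * math.pi * valid_time.timetuple().tm_yday / 365.25  # assumed period
    features["hour_sin"] = math.sin(hour_angle)
    features["hour_cos"] = math.cos(hour_angle)
    features["doy_sin"] = math.sin(doy_angle)
    features["doy_cos"] = math.cos(doy_angle)
    return features


def delta_vs_best(blend_mae: float, best_single_mae: float) -> float:
    """Relative MAE change in percent; negative means the blend is better."""
    return 100 * (blend_mae - best_single_mae) / best_single_mae


example = blender_features(
    {
        "temp_aifs": 11.8,     # column name from the table above; value is made up
        "temp_model_b": 12.4,  # hypothetical model names and made-up values
        "temp_model_c": 12.1,
        "temp_model_d": 11.5,
        "temp_model_e": 12.9,
        "temp_model_f": 12.0,
    },
    datetime(2026, 4, 28, 12, tzinfo=timezone.utc),
)
print(len(example), "features")
print(f"+24h delta: {delta_vs_best(0.533, 0.563):+.1f}%")
```

Applied to the rounded MAEs in the table, `delta_vs_best` reproduces the +24h and +96h entries. Some other rows can differ in the last digit, presumably because the dashboard computes Δ from unrounded values.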
Verify history (1 run)
Twice-weekly Brier/MAE on the held-out rolling window: one row per verify run, with the drift flag in the last column. Metric: MAE (°C). The Version column names the trained model that produced the row's numbers. A freshly retrained champion shows zero rows here for ~5-9d (one verify cycle plus 5d ERA5 latency), so a row labelled with an older version is the previous lineage's history under the same phase.
| Run (UTC) | Version | +24h | +48h | +72h | Drift |
|---|---|---|---|---|---|
| | v2026-04-21_201231_phase2redo | 0.959 | 0.943 | 0.888 | ⚠ |