WeatherBlend

Multi-model forecast blending for Bonehill Rocks, Dartmoor

Temperature models

2b (lean) + 2c (rich). MAE in °C, lower is better. Δ is measured against the best single NWP model; negative means the blend wins.

Temperature — Bonehill Rocks

Phase 2c · v2026-06-15_083940_phase2c Δ +0.000 vs prev train

Rich blender, 88 features (adds dew/RH/cloud/wind/pressure). Trained 2026-06-15. Metric: Test MAE (°C).

Lead   Blend  Best single        Δ vs best
+24h   0.487  temp_aifs (0.585)  -16.8%
+48h   0.582  temp_aifs (0.658)  -11.6%
+72h   0.686  temp_aifs (0.787)  -12.8%
+96h   0.843  temp_aifs (0.913)  -7.6%
+120h  1.016  temp_aifs (1.052)  -3.4%
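
The Δ vs best column is consistent with a simple relative difference between the blend MAE and the best single-model MAE. A minimal sketch, assuming that definition (the function name `delta_vs_best` is hypothetical, and the dashboard's own calculation is not shown in this page):

```python
def delta_vs_best(blend_mae: float, best_single_mae: float) -> float:
    """Percent change of the blend MAE relative to the best single model.

    Negative values mean the blend has the lower (better) MAE.
    Assumed definition; not confirmed by the dashboard source.
    """
    if best_single_mae <= 0:
        raise ValueError("best_single_mae must be positive")
    return 100.0 * (blend_mae - best_single_mae) / best_single_mae


# (lead, blend MAE, best single MAE) copied from the Phase 2c summary table.
PHASE_2C = [
    ("+24h", 0.487, 0.585),
    ("+48h", 0.582, 0.658),
    ("+72h", 0.686, 0.787),
    ("+96h", 0.843, 0.913),
    ("+120h", 1.016, 1.052),
]

if __name__ == "__main__":
    for lead, blend, best in PHASE_2C:
        print(f"{lead:>5}  {delta_vs_best(blend, best):+.1f}%")
```

Recomputing from the rounded MAEs can differ from the displayed figure in the last digit (for example at +96h), presumably because the displayed percentages come from unrounded values.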
Verify history (16 runs)

Mon + Thu rolling MAE (°C). Per-lead cells turn red when the rolling metric breaches the lead-specific drift threshold; check the verify report for the per-cell breakdown. The Version column names the trained model; a fresh champion takes roughly 5-9 days to show up.

Run (UTC)  Version  N  +24h  +48h  +72h  +96h  +120h
v2026-06-01_233346_phase2c, v2026-06-07_131622_phase2c  528  0.597 (temp_icon: 0.524)  0.382 (temp_aifs: 0.468)  0.337 (temp_mf: 0.578)  0.663 (temp_aifs: 0.590)  0.499 (temp_aifs: 0.542)
v2026-06-01_233346_phase2c, v2026-06-07_131622_phase2c  528  0.692 (temp_aifs: 0.558)  0.426 (temp_aifs: 0.490)  0.478 (temp_ecmwf: 0.648)  0.621 (temp_ecmwf: 0.750)  0.534 (temp_aifs: 0.725)
v2026-06-01_233346_phase2c  432  0.468 (temp_aifs: 0.403)  0.435 (temp_aifs: 0.503)  0.566 (temp_ecmwf: 0.723)  0.642 (temp_mf: 0.720)  0.949 (temp_gfs: 0.871)
v2026-05-26_104346_phase2c, v2026-06-01_233346_phase2c  528  0.485 (temp_aifs: 0.354)  0.325 (temp_aifs: 0.401)  0.417 (temp_aifs: 0.238)  0.614 (temp_aifs: 0.724)  0.490 (temp_aifs: 0.745)
v2026-05-26_104346_phase2c, v2026-05-31_201130_phase2c  528  0.284 (temp_aifs: 0.338)  0.521 (temp_ecmwf: 0.753)  0.580 (temp_aifs: 0.746)  0.734 (temp_aifs: 0.718)  0.646 (temp_aifs: 0.711)
v2026-05-17_173626_phase2c, v2026-05-24_130333_phase2c, v2026-05-26_104346_phase2c  744  0.736 (temp_aifs: 0.927)  0.707 (temp_gfs: 0.994)  0.850 (temp_aifs: 1.137)  1.021 (temp_ecmwf: 1.252)  1.633 (temp_aifs: 1.749)
v2026-05-17_173626_phase2c, v2026-05-24_130333_phase2c  744  0.835 (temp_ecmwf: 1.164)  0.919 (temp_ecmwf: 1.035)  1.108 (temp_ecmwf: 1.401)  1.320 (temp_ukmo: 1.773)  1.604 (temp_aifs: 1.876)
v2026-04-28_232637_phase2c, v2026-05-17_173626_phase2c  1600  0.349 (temp_aifs: 0.429)  0.367 (temp_aifs: 0.477)  0.465 (temp_icon: 0.413)  0.558 (temp_icon: 0.575)  0.741 (temp_ecmwf: 0.767)
v2026-04-28_232637_phase2c, v2026-05-17_173626_phase2c  2012  0.550 (temp_ukmo: 0.396)  0.477 (temp_icon: 0.660)  0.744 (temp_ecmwf: 0.767)  0.796 (temp_ecmwf: 0.891)  0.769 (temp_ecmwf: 0.795)
v2026-04-28_232637_phase2c  2012  0.433 (temp_icon: 0.566)  0.565 (temp_ecmwf: 0.775)  0.780 (temp_aifs: 0.899)  0.836 (temp_aifs: 0.945)  0.958 (temp_ecmwf: 1.069)
v2026-04-28_232637_phase2c  1694  0.455 (temp_aifs: 0.586)  0.635 (temp_ecmwf: 0.772)  0.696 (temp_aifs: 0.782)  1.029 (temp_aifs: 1.098)  1.290 (temp_aifs: 1.147)
v2026-04-28_232637_phase2c  388  0.509 (temp_aifs: 0.598)  0.597 (temp_ecmwf: 0.689)  0.615 (temp_aifs: 0.703)  0.908 (temp_aifs: 0.764)  0.630 (temp_ecmwf: 0.778)
v2026-04-28_232637_phase2c  173  0.462 (temp_aifs: 0.558)  0.789 (temp_aifs: 0.798)  0.940 (temp_ecmwf: 0.805)  1.110 (temp_aifs: 0.814)  0.623 (temp_ecmwf: 0.944)
v2026-04-27_191724_phase2c, v2026-04-27_225213_phase2c, v2026-04-28_232637_phase2c  9  0.800 (temp_aifs: 0.556)  0.823 (temp_gem: 0.717)  1.104 (temp_gem: 0.000)
v2026-04-23_132453_phase2c, v2026-04-27_181257_phase2c, v2026-04-27_191724_phase2c  9  0.735 (temp_mf: 0.000)  1.616 (temp_icon: 0.000)  1.047 (temp_ecmwf: 1.189)
v2026-04-23_132453_phase2c  10  0.834 (temp_mf: 0.740)  0.904 (temp_gfs: 0.580)
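
The drift check described in the verify history caption (a per-lead cell is flagged when its rolling MAE breaches a lead-specific threshold) can be sketched as follows. This is an illustration only: the function names and the threshold values are hypothetical, since the page does not state the real thresholds or the rolling window.

```python
from statistics import fmean
from typing import Dict, Iterable, Tuple


def mae(pairs: Iterable[Tuple[float, float]]) -> float:
    """Mean absolute error over (forecast, observed) pairs, in °C."""
    return fmean(abs(forecast - observed) for forecast, observed in pairs)


def drift_flags(
    rolling_mae: Dict[int, float], thresholds: Dict[int, float]
) -> Dict[int, bool]:
    """Map each lead (hours) to True when its rolling MAE exceeds its threshold."""
    return {lead: value > thresholds[lead] for lead, value in rolling_mae.items()}


# Hypothetical per-lead thresholds in °C; the real values are not given here.
THRESHOLDS = {24: 0.8, 48: 0.9, 72: 1.0, 96: 1.2, 120: 1.4}

# Rolling MAE per lead, copied from one Phase 2c verify row (N = 744).
ROW = {24: 0.736, 48: 0.707, 72: 0.850, 96: 1.021, 120: 1.633}

flags = drift_flags(ROW, THRESHOLDS)
flagged_leads = [lead for lead, breached in flags.items() if breached]
```

With these made-up thresholds only the +120h cell would be flagged; a real deployment would take the thresholds from the verify configuration.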
By actual NWP forecast lead (6h buckets)

Same data, grouped by actual lead (ValidTime minus the freshest contributing NWP cycle, in 6h buckets) instead of by trained-lead label. This reveals the MAE structure within a trained bucket, which became visible once predict spread to hourly outputs (2026-05-04 onward). Buckets start at the trained lead; earlier figures were measured from the cron-fire time, which made offset-day models look like sub-lead forecasts.

Run (UTC)  -12--7h  0-5h  6-11h  12-17h  18-23h  24-29h  30-35h  36-41h  42-47h  48-53h  54-59h  60-65h  66-71h  72-77h  78-83h  84-89h  90-95h  96-101h  102-107h  108-113h  114-119h  120-125h  126-131h  132-137h  138-143h
0.607  0.538  0.596  0.688  0.384  0.426  0.346  0.210  0.288  0.526  0.422  0.499  0.642  0.890  0.601  0.497  0.471  0.470  0.514  1.091
0.724  0.339  0.432  0.344  0.395  0.506  0.497  0.450  0.435  0.584  0.606  0.484  0.580  0.714  0.780  0.615  0.502  0.549  0.654  0.792
0.491  0.441  0.391  0.346  0.428  0.464  0.401  0.544  0.506  0.727  0.694  0.783  0.547  0.903  0.916  1.059  0.949  0.426  0.408  0.562
0.530  0.447  0.297  0.232  0.320  0.259  0.326  0.705  0.417  0.401  0.500  0.350  0.667  0.451  0.548  0.413  0.511  0.426  0.408  0.562
0.284  0.550  0.492  0.417  0.557  0.427  0.463  0.310  0.645  0.356  0.524  0.310  0.813  0.469  0.652  0.308  0.699  0.568  0.380  0.433
0.731  0.782  0.656  0.879  0.683  0.723  1.022  1.184  0.852  0.773  1.176  0.385  1.078  0.395  1.089  0.713  1.595  1.940  1.837  0.688
0.734  0.717  1.008  0.854  0.818  0.910  0.893  1.099  0.437  0.176  0.971  1.010  1.309  1.454  1.275  1.296  1.039  1.169  1.505  1.584  1.713  2.080  1.191  0.851
0.241  0.298  0.376  0.306  0.324  0.389  0.368  0.310  0.404  0.510  0.471  0.496  0.551  0.452  0.528  0.526  0.133  0.056  0.712  0.639  0.772  0.995  0.761  0.698
0.214  0.396  1.136  0.427  0.380  0.267  0.379  0.463  0.509  0.668  0.731  0.719  0.761  0.797  0.713  0.771  0.858  0.682  0.663  0.695  0.772  1.038  0.855  0.988
0.310  0.391  0.384  0.454  0.500  0.467  0.538  0.547  0.614  0.700  0.785  0.677  0.794  0.894  0.785  0.792  0.904  0.929  0.748  0.870  0.905  1.055  1.155  1.291
0.428  0.433  0.404  0.455  0.491  0.523  0.626  0.570  0.640  0.636  0.680  0.607  0.878  0.974  0.914  0.920  1.102  1.281  1.090  1.067  1.341  1.483  1.582  1.559
0.711  0.462  0.491  0.591  0.460  0.492  0.503  0.669  0.426  0.590  0.481  0.639  0.631  0.621  0.724  0.894  0.630
0.201  0.187  0.257  0.662  0.311  0.365  0.386  0.732  2.252  0.940  1.110  0.623
1.771  0.109  0.800  0.289  1.090  1.104
1.771  1.616  0.735  0.851  1.047
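
The grouping behind this table (actual lead = ValidTime minus the freshest contributing NWP cycle, binned into 6h buckets) can be sketched as below. The function names and the row layout are hypothetical; only the bucket labels are taken from the table header. Negative leads fall into buckets such as -12--7h because the bucket start is found with a floor division.

```python
import math
from collections import defaultdict
from datetime import datetime
from statistics import fmean
from typing import Dict, Iterable, List, Sequence, Tuple

# (valid_time, contributing cycle times, forecast °C, observed °C); assumed layout.
Row = Tuple[datetime, Sequence[datetime], float, float]


def lead_bucket(
    valid_time: datetime, cycle_times: Sequence[datetime], width_h: int = 6
) -> str:
    """Label the bucket for ValidTime minus the freshest contributing cycle."""
    freshest = max(cycle_times)
    lead_h = (valid_time - freshest).total_seconds() / 3600.0
    start = math.floor(lead_h / width_h) * width_h  # floor also handles negatives
    return f"{start}-{start + width_h - 1}h"


def mae_by_bucket(rows: Iterable[Row]) -> Dict[str, float]:
    """Mean absolute error per actual-lead bucket."""
    errors: Dict[str, List[float]] = defaultdict(list)
    for valid_time, cycle_times, forecast, observed in rows:
        errors[lead_bucket(valid_time, cycle_times)].append(abs(forecast - observed))
    return {label: fmean(values) for label, values in errors.items()}


# Two made-up rows whose actual lead is 30h and 31h, so both land in "30-35h".
example_rows = [
    (datetime(2026, 5, 5, 12), [datetime(2026, 5, 4, 0), datetime(2026, 5, 4, 6)], 11.0, 10.0),
    (datetime(2026, 5, 5, 13), [datetime(2026, 5, 4, 6)], 10.5, 10.0),
]
example_mae = mae_by_bucket(example_rows)
```

Using the freshest cycle rather than the cron-fire time is what keeps an offset-day forecast from being counted as a shorter lead than it really is.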

Phase 2b · v2026-06-14_134518 Δ -0.014 vs prev train

Lean blender, 13 features. Trained 2026-06-14. Metric: Test MAE (°C).

Lead   Blend  Best single        Δ vs best
+24h   0.556  temp_aifs (0.585)  -5.0%
+48h   0.641  temp_aifs (0.658)  -2.6%
+72h   0.733  temp_aifs (0.787)  -6.9%
+96h   0.865  temp_aifs (0.913)  -5.3%
+120h  1.020  temp_aifs (1.052)  -3.0%
Verify history (16 runs)

Mon + Thu rolling MAE (°C). Per-lead cells turn red when the rolling metric breaches the lead-specific drift threshold; check the verify report for the per-cell breakdown. The Version column names the trained model; a fresh champion takes roughly 5-9 days to show up.

Run (UTC)  Version  N  +24h  +48h  +72h  +96h  +120h
v2026-06-05_091025, v2026-06-07_131530  384  0.564 (temp_icon: 0.524)  0.443 (temp_aifs: 0.468)  0.466 (temp_mf: 0.578)  0.696 (temp_aifs: 0.590)  0.516 (temp_aifs: 0.415)
v2026-06-01_214702, v2026-06-05_091025, v2026-06-07_131530  336  0.694 (temp_aifs: 0.558)  0.442 (temp_aifs: 0.472)  0.545 (temp_aifs: 0.547)  0.592 (temp_ecmwf: 0.817)  0.540 (temp_aifs: 0.767)
v2026-06-01_214702, v2026-06-05_091025  336  0.354 (temp_aifs: 0.522)  0.384 (temp_aifs: 0.502)  0.527 (temp_aifs: 0.725)  0.640 (temp_mf: 0.764)  0.835 (temp_gfs: 0.871)
v2026-05-26_104311, v2026-06-01_214702  528  0.458 (temp_aifs: 0.354)  0.295 (temp_aifs: 0.416)  0.565 (temp_aifs: 0.238)  0.617 (temp_aifs: 0.724)  0.463 (temp_aifs: 0.745)
v2026-05-26_104311, v2026-05-31_201040  528  0.285 (temp_aifs: 0.338)  0.636 (temp_ecmwf: 0.753)  0.699 (temp_aifs: 0.746)  0.776 (temp_aifs: 0.718)  0.670 (temp_aifs: 0.711)
v2026-05-17_173452, v2026-05-24_130255, v2026-05-26_104311  744  1.041 (temp_aifs: 0.927)  0.668 (temp_gfs: 0.994)  1.139 (temp_aifs: 1.137)  0.869 (temp_ecmwf: 1.252)  1.829 (temp_aifs: 1.749)
v2026-05-17_173452, v2026-05-24_130255  744  1.309 (temp_ecmwf: 1.164)  1.477 (temp_ecmwf: 1.035)  1.351 (temp_ecmwf: 1.401)  1.655 (temp_gfs: 1.812)  1.851 (temp_aifs: 1.876)
v2026-04-28_232613, v2026-05-17_173452  1600  0.424 (temp_aifs: 0.429)  0.418 (temp_aifs: 0.477)  0.491 (temp_icon: 0.413)  0.682 (temp_icon: 0.575)  0.672 (temp_ecmwf: 0.767)
v2026-04-28_232613, v2026-05-17_173452  2012  0.452 (temp_icon: 0.413)  0.489 (temp_icon: 0.660)  0.725 (temp_ecmwf: 0.767)  0.783 (temp_ecmwf: 0.891)  0.731 (temp_ecmwf: 0.795)
v2026-04-28_232613  2012  0.511 (temp_icon: 0.566)  0.609 (temp_ecmwf: 0.775)  0.823 (temp_aifs: 0.899)  0.857 (temp_aifs: 0.945)  0.976 (temp_ecmwf: 1.069)
v2026-04-28_232613  1694  0.516 (temp_aifs: 0.586)  0.688 (temp_ecmwf: 0.772)  0.790 (temp_aifs: 0.782)  1.078 (temp_aifs: 1.098)  1.360 (temp_aifs: 1.147)
v2026-04-28_232613  388  0.561 (temp_aifs: 0.598)  0.637 (temp_ecmwf: 0.689)  0.619 (temp_aifs: 0.703)  0.897 (temp_aifs: 0.764)  0.495 (temp_ecmwf: 0.778)
v2026-04-28_232613  173  0.576 (temp_aifs: 0.558)  0.813 (temp_aifs: 0.798)  0.906 (temp_ecmwf: 0.805)  1.013 (temp_aifs: 0.814)  0.527 (temp_ecmwf: 0.944)
v2026-04-27_204454, v2026-04-28_232613  9  1.202 (temp_aifs: 0.556)  1.076 (temp_gem: 0.629)  0.606 (temp_gem: 0.700)
v2026-04-21_201231_phase2redo, v2026-04-27_174346, v2026-04-27_204454  13  0.182 (temp_icon: 0.000)  1.580 (temp_icon: 0.000)  0.962 (temp_gem: 1.054)
v2026-04-21_201231_phase2redo  14  0.959 (temp_mf: 0.743)  0.943 (temp_ecmwf: 1.122)  0.888 (temp_gem: 0.375)
By actual NWP forecast lead (6h buckets)

Same data, grouped by actual lead (ValidTime minus the freshest contributing NWP cycle, in 6h buckets) instead of by trained-lead label. This reveals the MAE structure within a trained bucket, which became visible once predict spread to hourly outputs (2026-05-04 onward). Buckets start at the trained lead; earlier figures were measured from the cron-fire time, which made offset-day models look like sub-lead forecasts.

Run (UTC)  -12--7h  -6--1h  0-5h  6-11h  12-17h  18-23h  24-29h  30-35h  36-41h  42-47h  48-53h  54-59h  60-65h  66-71h  72-77h  78-83h  84-89h  90-95h  96-101h  102-107h  108-113h  114-119h  120-125h  126-131h  132-137h  138-143h
0.585  0.481  0.552  0.579  0.464  0.453  0.335  0.225  0.442  0.601  0.451  0.495  0.678  0.890  0.397  0.549  0.437  0.571  0.646  2.227
0.715  0.467  0.532  0.438  0.388  0.648  0.524  0.306  0.526  0.768  0.272  0.477  0.532  0.759  0.779  0.680  0.552  0.501  0.456  0.634
0.332  0.470  0.315  0.272  0.383  0.386  0.338  0.499  0.496  0.618  0.588  0.709  0.525  1.086  0.985  1.265  0.835  0.425  0.352  0.393
0.500  0.386  0.296  0.193  0.299  0.263  0.233  0.479  0.565  0.405  0.521  0.222  0.688  0.416  0.494  0.339  0.486  0.425  0.352  0.393
0.285  0.569  0.496  0.371  0.691  0.483  0.585  0.263  0.774  0.443  0.691  0.266  0.868  0.480  0.641  0.326  0.687  0.721  0.452  0.556
1.112  0.977  0.590  0.817  0.656  0.576  1.064  1.220  1.204  0.838  1.327  0.411  0.917  0.347  1.683  0.885  1.740  2.270  2.125  0.949
2.162  1.591  1.446  1.244  1.222  1.336  1.278  1.434  0.770  0.965  1.267  1.290  1.536  1.574  1.544  1.730  1.335  1.561  1.694  1.785  2.089  2.389  1.455  1.055
0.386  0.408  0.417  0.359  0.425  0.410  0.465  0.421  0.433  0.448  0.457  0.541  0.711  0.541  0.604  0.548  0.560  0.261  0.629  0.591  0.778  0.803  0.584  0.713
0.266  0.332  1.050  0.204  0.016  0.226  0.410  0.509  0.516  0.703  0.751  0.703  0.736  0.716  0.689  0.754  0.852  0.649  0.654  0.668  0.829  0.876  0.688  0.804
0.476  0.575  0.499  0.578  0.533  0.513  0.608  0.607  0.657  0.788  0.875  0.718  0.765  0.867  0.774  0.744  0.946  0.933  0.832  0.886  0.933  1.031  1.158  1.134
0.646  0.642  0.535  0.530  0.487  0.575  0.709  0.621  0.679  0.725  0.772  0.675  0.876  0.978  0.891  0.943  1.223  1.362  1.213  1.076  1.397  1.680  1.738  1.618
0.851  0.634  0.543  0.656  0.485  0.523  0.549  0.691  0.502  0.597  0.525  0.628  0.536  0.657  0.703  0.883  0.495
0.660  0.451  0.451  0.756  0.376  0.427  0.473  0.766  0.606  2.354  0.906  1.013  0.527
0.977  0.977  0.182  0.392  1.202  0.057  0.585  1.577  0.606  1.950
0.977  0.977  0.182  1.580  0.228  0.993  0.962

Phase 2d · v2026-06-14_134717_phase2d Δ -0.009 vs prev train

Exact-runtime blender. Trains on raw S3 cycles (GFS + AIFS required, IFS oper + MO Global + UKV optional) instead of Open-Meteo offset_day, with rigorous (RunTime, ValidTime, Lead) provenance per row. UKV pulled per-V-hour from 03Z + 15Z cycles. Trained 2026-06-14. Metric: Test MAE (°C).
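
One way to picture the per-row provenance described above is a record that carries RunTime, ValidTime and Lead together and refuses to exist when they disagree. The sketch below is an assumption-laden illustration: the class name, field names and source keys ("gfs", "aifs", "ifs", "moglobal", "ukv") are hypothetical, and only the required/optional split comes from the description.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable

# Required vs optional sources as listed in the Phase 2d description.
REQUIRED = frozenset({"gfs", "aifs"})
OPTIONAL = frozenset({"ifs", "moglobal", "ukv"})


@dataclass(frozen=True)
class ProvenanceRow:
    """One NWP value with explicit (RunTime, ValidTime, Lead) provenance."""

    source: str
    run_time: datetime
    valid_time: datetime
    lead_h: int
    temp_c: float

    def __post_init__(self) -> None:
        if self.source not in REQUIRED | OPTIONAL:
            raise ValueError(f"unknown source: {self.source}")
        # Lead must equal ValidTime - RunTime exactly, otherwise the row is rejected.
        if self.valid_time - self.run_time != timedelta(hours=self.lead_h):
            raise ValueError("lead does not match valid_time - run_time")


def is_trainable(rows: Iterable[ProvenanceRow]) -> bool:
    """True when every required source is present among the rows for one valid time."""
    return REQUIRED <= {row.source for row in rows}


run = datetime(2026, 6, 14, 0)
valid = datetime(2026, 6, 15, 0)
rows = [
    ProvenanceRow("gfs", run, valid, 24, 12.1),
    ProvenanceRow("aifs", run, valid, 24, 11.8),
    ProvenanceRow("ukv", datetime(2026, 6, 14, 15), valid, 9, 12.4),
]
```

Here `is_trainable(rows)` holds because both GFS and AIFS are present; dropping either one would exclude the valid time, while the optional sources can come and go.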

Lead   Blend  Best single            Δ vs best
+12h   0.464  temp_ifs (1.051)       -55.9%
+24h   0.479  temp_ifs (1.090)       -56.0%
+48h   0.560  temp_ifs (1.148)       -51.2%
+72h   0.770  temp_moglobal (1.268)  -39.3%
+96h   0.924  temp_ifs (1.188)       -22.2%
+120h  1.122  temp_ifs (1.350)       -16.9%
Verify history (12 runs)

Mon + Thu rolling MAE (°C). Per-lead cells turn red when the rolling metric breaches the lead-specific drift threshold; check the verify report for the per-cell breakdown. The Version column names the trained model; a fresh champion takes roughly 5-9 days to show up.

Run (UTC)  Version  N  +12h  +24h  +48h  +72h  +96h  +120h
v2026-05-09_084413_phase2d, v2026-06-07_131721_phase2d  186  0.394  0.383  0.425  0.524  0.600  0.584
v2026-05-09_084413_phase2d, v2026-06-07_131721_phase2d  194  0.368  0.334  0.333  0.542  0.601  0.646
v2026-05-09_084413_phase2d, v2026-05-31_201225_phase2d, v2026-06-07_131721_phase2d  235  0.219  0.330  0.331  0.531  0.621  0.795
v2026-05-09_084413_phase2d, v2026-05-31_201225_phase2d  212  0.346  0.322  0.306  0.550  0.588  0.836
v2026-05-09_084413_phase2d, v2026-05-31_201225_phase2d  197  0.459  0.383  0.297  1.033  1.108  1.379
v2026-05-09_084413_phase2d, v2026-05-26_104427_phase2d  128  1.292  1.058  0.891  1.666  1.802  1.999
v2026-05-09_084413_phase2d, v2026-05-24_130417_phase2d, v2026-05-26_104427_phase2d  83  1.990  1.898  2.192  1.966  1.783  1.743
v2026-05-09_084413_phase2d, v2026-05-17_173814_phase2d  24  0.366  0.582  0.453  0.431  0.288  0.299
v2026-05-08_060055_phase2d, v2026-05-09_083929_phase2d, v2026-05-09_084413_phase2d, v2026-05-17_173814_phase2d  110  0.650  0.632  0.482  0.493  0.655  0.671
v2026-05-08_055817_phase2d, v2026-05-08_060055_phase2d  91  0.334  0.464  0.432  0.689  0.866  0.675
v2026-05-08_055817_phase2d, v2026-05-08_060055_phase2d  45  0.227  0.205  0.460  0.564
v2026-05-05_182234_phase2d  12  0.376  0.697
By actual NWP forecast lead (6h buckets)

Same data, grouped by actual lead (ValidTime minus the freshest contributing NWP cycle, in 6h buckets) instead of by trained-lead label. This reveals the MAE structure within a trained bucket, which became visible once predict spread to hourly outputs (2026-05-04 onward). Buckets start at the trained lead; earlier figures were measured from the cron-fire time, which made offset-day models look like sub-lead forecasts.

Run (UTC)  -48--43h  -42--37h  -36--31h  -30--25h  -24--19h  -18--13h  -12--7h  -6--1h  0-5h  6-11h  12-17h  18-23h  24-29h  30-35h  36-41h  42-47h  48-53h  54-59h  60-65h  66-71h  72-77h  78-83h  84-89h  90-95h  96-101h  102-107h  108-113h
1.130  1.052  1.869  2.304  1.782  1.606  2.042  3.314  1.999  1.656  1.984  2.366  2.252  2.379  2.647  2.616  1.960  1.522  1.992  2.752  1.915  1.375  1.839  3.836  1.708  1.161  1.862
0.186  0.126  0.234  0.195  0.259  0.068  0.435  0.546  0.677  0.531  0.370  0.414  0.534  0.328  0.418  0.490  0.382  0.269  0.261  0.284  0.220  0.368
0.810  0.606  0.790  0.790  0.434  0.553  0.445  0.142  0.260  0.765  0.640  0.757  0.567  0.766  0.646  0.660
0.810  0.353  0.438  0.479  0.336  0.502  0.400  0.632  0.812  0.598  1.237  0.989  0.661  0.787  0.632  0.697
0.217  0.122  0.301  0.453  0.517  0.398  0.827  0.508  0.514
0.555  0.600  0.607