Probabilistic wind speed forecasting in Hungary

Sándor Baran and Dóra Nemoda
Faculty of Informatics, University of Debrecen
Kassai út 26, H-4028 Debrecen, Hungary

András Horányi
Hungarian Meteorological Service
P.O. Box 38, H-1525 Budapest, Hungary

arXiv:1202.4442v3 [stat.AP] 17 Mar 2012

Abstract

Prediction of various weather quantities is mostly based on deterministic numerical weather forecasting models. Multiple runs of these models with different initial conditions result in ensembles of forecasts, which are applied for estimating the distribution of future weather quantities. However, the ensembles are usually under-dispersive and uncalibrated, so post-processing is required. In the present work Bayesian Model Averaging (BMA) is applied for calibrating ensembles of wind speed forecasts produced by the operational Limited Area Model Ensemble Prediction System of the Hungarian Meteorological Service (HMS). We describe two possible BMA models for the wind speed data of the HMS and show that BMA post-processing significantly improves the calibration and precision of the forecasts.

Key words: Bayesian Model Averaging, gamma distribution, continuous ranked probability score.

1 Introduction

The aim of weather forecasting is to give a good prediction of the future states of the atmosphere on the basis of present observations and mathematical models describing the dynamics (physical behaviour) of the atmosphere. These models consist of sets of non-linear partial differential equations which have only numerical solutions. The problem with these numerical weather prediction models is that the solutions depend strongly on the initial conditions, which are never fully accurate. A possible way to address this problem is to run the model with different initial conditions and produce an ensemble of forecasts.
With the help of ensembles one can estimate the distribution of future weather variables, which leads to probabilistic weather forecasting (Gneiting and Raftery, 2005). The ensemble prediction method was proposed by Leith (1974), and since its first operational implementation (Buizza et al., 1993; Toth and Kalnay, 1997) it has become a widely used technique all over the world. However, even though e.g. the ensemble mean gives a better estimate of a meteorological quantity than most or all of the individual ensemble members, the ensemble is usually under-dispersive and therefore uncalibrated. This phenomenon has been observed at several operational ensemble prediction systems; for an overview see e.g. Buizza et al. (2005). The Bayesian Model Averaging (BMA) method for post-processing ensembles in order to calibrate them was introduced by Raftery et al. (2005). The basic idea of BMA is that to each ensemble member forecast corresponds a conditional probability density function (PDF) that can be interpreted as the conditional PDF of the future weather quantity given that the considered forecast is the best one. The BMA predictive PDF of the future weather quantity is then the weighted sum of the individual PDFs corresponding to the ensemble members, where the weights are based on the relative performance of the ensemble members during a given training period. In Raftery et al. (2005) the BMA method was successfully applied to obtain 48 hour forecasts of surface temperature and sea level pressure in the North American Pacific Northwest based on the 5 members of the University of Washington Mesoscale Ensemble (Grimit and Mass, 2002). These weather quantities can be modeled by normal distributions, so the predictive PDF is a Gaussian mixture. Later, Sloughter et al. (2007) developed a discrete-continuous BMA model for precipitation forecasting, where the discrete part corresponds to the event of no precipitation, while the cube root of the precipitation amount (if positive) is modeled by a gamma distribution.
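As a sketch of this idea (forecast values, weights and the common standard deviation below are invented for illustration, mirroring the Gaussian setting of Raftery et al. (2005) for temperature), the BMA predictive density is simply a weighted mixture of component densities, each centred on one ensemble member forecast:

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std. dev. sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def bma_pdf(x, forecasts, weights, sigma):
    """BMA predictive density: a weighted mixture of component PDFs,
    one per ensemble member, with weights learned on a training period."""
    return sum(w * normal_pdf(x, f, sigma) for w, f in zip(weights, forecasts))

# Hypothetical 5-member temperature ensemble (Kelvin); values are invented.
forecasts = [270.1, 271.3, 269.8, 272.0, 270.6]
weights = [0.3, 0.25, 0.2, 0.15, 0.1]   # must sum to 1
print(bma_pdf(271.0, forecasts, weights, sigma=1.2))
```

In a real application the weights and the spread parameter would be estimated from training data; here they only illustrate the mixture structure.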
In Sloughter et al. (2010) the BMA method was used for wind speed forecasting, with component PDFs following gamma distributions. Finally, using the von Mises distribution to model angular data, Bao et al. (2010) introduced a BMA scheme to predict surface wind direction. In the present work we apply the BMA method for calibrating ensemble forecasts of wind speed produced by the operational Limited Area Model Ensemble Prediction System (LAMEPS) of the Hungarian Meteorological Service (HMS), called ALADIN-HUNEPS (Hágel, 2010; Horányi et al., 2011). ALADIN-HUNEPS covers a large part of continental Europe with a horizontal resolution of 12 km, and it is obtained by dynamical downscaling (by the ALADIN limited area model) of the global ARPEGE based PEARP system of Météo France (Horányi et al., 2006; Descamps et al., 2009). The ensemble consists of 11 members: 10 initialized from perturbed initial conditions and one control member from the unperturbed analysis. This construction implies that the ensemble contains groups of exchangeable forecasts (ensemble members that cannot be distinguished), so for post-processing one has to use the modification of BMA suggested by Fraley et al. (2010).

Figure 1: Verification rank histogram of the 11-member ALADIN-HUNEPS ensemble. Period: October 1, 2010 – March 25, 2011.
2 Data

As mentioned in the Introduction, BMA post-processing of ensemble predictions was applied to wind speed data obtained from the HMS. The data file contains 11-member ensembles (10 forecasts started from perturbed initial conditions and one control) of 42 hour forecasts of 10 meter wind speed (given in m/s) for 10 major cities in Hungary (Miskolc, Szombathely, Győr, Budapest, Debrecen, Nyíregyháza, Nagykanizsa, Pécs, Kecskemét, Szeged) produced by the ALADIN-HUNEPS system of the HMS, together with the corresponding validating observations, for the period between October 1, 2010 and March 25, 2011. The forecasts are initialized at 18 UTC; the startup speed of the anemometers measuring the validating observations is 0.1 m/s. The data set is fairly complete: there are only two days (18.10.2010 and 15.02.2011) when three ensemble members are missing for all sites and one day (20.11.2010) when no forecasts are available.

Figure 1 shows the verification rank histogram of the raw ensemble, that is, the histogram of the ranks of the validating observations with respect to the corresponding ensemble forecasts. This histogram is far from the desired uniform distribution: in most cases the ensemble members either underestimate or overestimate the validating observations (the ensemble range contains the observed wind speed in only 61.21% of the cases). Hence, the ensemble is under-dispersive and thus uncalibrated.

3 The model and diagnostics

To obtain a probabilistic forecast of wind speed, the modification for ensembles with exchangeable members (Fraley et al., 2010) of the BMA gamma model of Sloughter et al. (2010) was used. The first idea is to have two exchangeable groups: one contains the control, denoted by f_c, the other the 10 ensemble members corresponding to the different perturbed initial conditions, denoted by f_{ℓ,1}, ..., f_{ℓ,10}, respectively.
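A verification rank histogram simply records, for each forecast case, where the observation falls when pooled with the ensemble members; a minimal sketch with invented data (ties ignored for simplicity):

```python
from collections import Counter

def observation_rank(ensemble, obs):
    """Rank of the observation among the K ensemble members: 1 if it lies
    below every member, K + 1 if it lies above all of them (ties ignored)."""
    return 1 + sum(1 for f in ensemble if f < obs)

# Hypothetical (11-member ensemble, observation) verification pairs.
cases = [
    ([2.1, 2.4, 2.6, 2.8, 3.0, 3.1, 3.3, 3.5, 3.7, 3.9, 4.2], 4.6),
    ([1.0, 1.2, 1.5, 1.6, 1.8, 2.0, 2.2, 2.3, 2.5, 2.7, 3.0], 1.9),
]
hist = Counter(observation_rank(ens, obs) for ens, obs in cases)
print(dict(hist))  # a flat histogram over ranks 1..12 indicates calibration
```

A U-shaped histogram like Figure 1, with mass piled on the extreme ranks, is the classic signature of an under-dispersive ensemble.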
In this way we assume that the probability density function (PDF) of the forecasted wind speed x equals

  p(x | f_c, f_{ℓ,1}, ..., f_{ℓ,10}; b_0, b_1, c_0, c_1)
      = ω g(x; f_c, b_0, b_1, c_0, c_1) + ((1 − ω)/10) Σ_{j=1}^{10} g(x; f_{ℓ,j}, b_0, b_1, c_0, c_1),     (3.1)

where ω ∈ [0,1] and g is the conditional PDF corresponding to the ensemble members. As we are working with wind speed data, g(x; f, b_0, b_1, c_0, c_1) is a gamma PDF with mean b_0 + b_1 f and standard deviation c_0 + c_1 f. Here we restrict both the mean and the standard deviation parameters to be the same for all ensemble members, which reduces the number of parameters and simplifies the calculations. The mean parameters b_0, b_1 are estimated by linear regression, while the weight ω and the standard deviation parameters c_0, c_1 are estimated by the maximum likelihood method, using training data consisting of ensemble members and verifying observations from the preceding n days (training period). To handle the problem that wind speed values under 0.1 m/s are recorded as zero, the maximum likelihood (ML) method for gamma distributions suggested by Wilks (1990) is applied, and the maximum of the likelihood function is found with the help of the EM algorithm (McLachlan and Krishnan, 1997). For more details see Sloughter et al. (2010) and Fraley et al. (2010). Once the estimated parameters for a given day are available, one can use either the mean or the median of the predictive PDF (3.1) as a point forecast.

A closer look at the ensemble members reveals some differences in the generation of the ten exchangeable members. To obtain them, only five perturbations are calculated; these are then added to (odd numbered members) and subtracted from (even numbered members) the unperturbed initial conditions (Horányi et al., 2011). Figure 2 shows the plume diagram of the ensemble forecast of 10 meter wind speed for Debrecen initialized at 18 UTC, 22.10.2010.
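A minimal numerical sketch of model (3.1) follows; the parameter values are invented for illustration (in the paper b_0, b_1 come from linear regression and ω, c_0, c_1 from the ML fit). It uses the mean/standard-deviation parameterisation of the gamma density, with shape k = mean²/sd² and scale θ = sd²/mean:

```python
import math

def gamma_pdf(x, mean, sd):
    """Gamma density parameterised by its mean and standard deviation
    (shape k = mean^2/sd^2, scale theta = sd^2/mean)."""
    k, theta = (mean / sd) ** 2, sd ** 2 / mean
    return x ** (k - 1) * math.exp(-x / theta) / (math.gamma(k) * theta ** k)

def bma_gamma_pdf(x, f_c, f_members, w, b0, b1, c0, c1):
    """Two-group BMA predictive density of model (3.1): weight w on the
    control, the remainder split equally over the exchangeable members."""
    pdf = w * gamma_pdf(x, b0 + b1 * f_c, c0 + c1 * f_c)
    pdf += (1 - w) / len(f_members) * sum(
        gamma_pdf(x, b0 + b1 * f, c0 + c1 * f) for f in f_members)
    return pdf

# Invented illustrative forecasts (m/s) and parameters.
members = [2.8, 3.1, 3.3, 3.6, 3.0, 3.4, 2.9, 3.2, 3.5, 3.7]
print(bma_gamma_pdf(3.2, 3.3, members, 0.3, 0.2, 0.95, 0.4, 0.15))
```

The mixture integrates to one by construction, since each component is a proper gamma density and the weights sum to one.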
(solid line: control; dotted lines: odd numbered members; dashed lines: even numbered members). This diagram clearly illustrates that the ensemble member groups {f_{ℓ,1}, f_{ℓ,3}, f_{ℓ,5}, f_{ℓ,7}, f_{ℓ,9}} and {f_{ℓ,2}, f_{ℓ,4}, f_{ℓ,6}, f_{ℓ,8}, f_{ℓ,10}} really differ in behaviour. Therefore, one can also consider a model with three exchangeable groups: control, odd numbered exchangeable members and even numbered exchangeable members. This idea leads to the following PDF of the forecasted wind speed x:

  q(x | f_c, f_{ℓ,1}, ..., f_{ℓ,10}; b_0, b_1, c_0, c_1)
      = ω_c g(x; f_c, b_0, b_1, c_0, c_1) + Σ_{j=1}^{5} ( ω_o g(x; f_{ℓ,2j−1}, b_0, b_1, c_0, c_1) + ω_e g(x; f_{ℓ,2j}, b_0, b_1, c_0, c_1) ),     (3.2)

where the weights ω_c, ω_o, ω_e ∈ [0,1] satisfy ω_c + 5ω_o + 5ω_e = 1, while the PDF g and the parameters b_0, b_1, c_0, c_1 are the same as for model (3.1). Obviously, both the weights and the parameters can be estimated in the same way as before.

Figure 2: Plume diagram of the ensemble forecast of 10 meter wind speed for Debrecen initialized at 18 UTC, 22.10.2010.

As an illustration we consider the data and forecasts for Debrecen for two different dates, 30.12.2010 and 17.03.2011, for models (3.1) and (3.2). Figures 3a and 3b show the PDFs of the two groups in model (3.1), the overall PDFs, the median forecasts, the verifying observations, the first and last deciles and the ensemble members. The same functions and quantities can be seen in Figures 3c and 3d, where besides the overall PDF we have three component PDFs and three groups of ensemble members. On 30.12.2010 the spread of the ensemble members is quite fair and the ensemble range contains the validating observation (3.2 m/s). In this case the ensemble mean (3.5697 m/s) overestimates, while the BMA median forecasts corresponding to the two- and three-group models (3.2876 m/s and 3.2194 m/s,
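The three-group mixture (3.2) can be evaluated in the same way as the two-group one, with the weight constraint ω_c + 5ω_o + 5ω_e = 1 checked explicitly. A sketch with invented parameter values (not from the paper's fit):

```python
import math

def gamma_pdf(x, mean, sd):
    """Gamma density parameterised by its mean and standard deviation."""
    k, theta = (mean / sd) ** 2, sd ** 2 / mean
    return x ** (k - 1) * math.exp(-x / theta) / (math.gamma(k) * theta ** k)

def bma_three_group_pdf(x, f_c, f_odd, f_even, w_c, w_o, w_e, b0, b1, c0, c1):
    """Three-group BMA predictive density of model (3.2): separate weights
    for the control, the odd numbered and the even numbered members."""
    assert abs(w_c + 5 * w_o + 5 * w_e - 1.0) < 1e-9  # weight constraint
    comp = lambda f: gamma_pdf(x, b0 + b1 * f, c0 + c1 * f)
    return (w_c * comp(f_c)
            + w_o * sum(comp(f) for f in f_odd)
            + w_e * sum(comp(f) for f in f_even))

# Invented illustrative values: five odd and five even members (m/s).
f_odd = [2.8, 3.3, 3.0, 2.9, 3.5]
f_even = [3.1, 3.6, 3.4, 3.2, 3.7]
print(bma_three_group_pdf(3.2, 3.3, f_odd, f_even,
                          0.2, 0.09, 0.07, 0.2, 0.95, 0.4, 0.15))
```

Distinct ω_o and ω_e let the fit reward whichever perturbation sign has performed better over the training period.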
Figure 3: Ensemble BMA PDFs (overall: thick black line; control: red line; sum of exchangeable members in (a) and (b): light blue line; in (c) and (d): green (odd members) and blue (even members) lines), ensemble members (circles with the same colours as the corresponding PDFs), ensemble BMA median forecasts (vertical black line), verifying observations (vertical orange line) and the first and last deciles (vertical dashed lines) for wind speed in Debrecen for models (3.1): (a) 30.12.2010, (b) 17.03.2011; and (3.2): (c) 30.12.2010, (d) 17.03.2011.

respectively) are pretty close to the true wind speed. A different situation is illustrated in Figures 3b and 3d, where the spread of the ensemble is even higher, but all ensemble members underestimate the validating observation (6.1 m/s). Obviously, the same holds for the ensemble mean (3.2323 m/s), and even with the bias correction the BMA median forecasts corresponding to models (3.1) and (3.2) give poor results (3.3409 m/s and 3.0849 m/s, respectively).

To check the performance of the probabilistic forecasts based on models (3.1) and (3.2) and of the corresponding point forecasts, as references we use the ensemble mean and the ensemble
median. We compare the mean absolute errors (MAE) and the root mean square errors (RMSE) of these point forecasts, as well as the mean continuous ranked probability scores (CRPS) (Wilks, 2006; Gneiting and Raftery, 2007) and the coverages and average widths of the 66.7% and 90% prediction intervals of the BMA predictive probability distributions and of the raw ensemble.

Figure 4: Average widths and coverages of the 66.7% and 90% BMA prediction intervals corresponding to the two-group model (3.1) for various training period lengths.

We remark that for MAE and RMSE the optimal point forecasts are the median and the mean, respectively (Gneiting, 2011; Pinson and Hagedorn, 2011). Further, given a cumulative distribution function (CDF) F(y) and a real number x, the CRPS is defined as

  crps(F, x) := ∫_{−∞}^{∞} ( F(y) − 1_{{y ≥ x}} )^2 dy.

The mean CRPS of a probability forecast is the average of the CRPS values of the predictive CDFs and the corresponding validating observations taken over all locations and time points considered. For the raw ensemble the empirical CDF of the ensemble replaces the predictive CDF. The coverage of a (1 − α)·100% prediction interval, α ∈ (0, 1), is the proportion of validating observations located between the lower and upper α/2 quantiles of the predictive distribution. For a calibrated predictive PDF this value should be around (1 − α)·100%.

Figure 5: CRPS of the BMA predictive distribution, MAE values of the BMA median and RMSE values of the BMA mean forecasts corresponding to the two-group model (3.1) for various training period lengths.
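When the predictive CDF is the empirical CDF of an ensemble, the CRPS integral reduces to a well-known closed form (see e.g. Gneiting and Raftery, 2007). A minimal sketch, with an invented ensemble:

```python
def crps_empirical(ensemble, obs):
    """CRPS of the ensemble's empirical CDF against one observation, via
    the identity crps(F, x) = E|X - x| - 0.5 * E|X - X'| for X, X' ~ F."""
    n = len(ensemble)
    term1 = sum(abs(f - obs) for f in ensemble) / n
    term2 = sum(abs(a - b) for a in ensemble for b in ensemble) / (2 * n * n)
    return term1 - term2

# Hypothetical 11-member wind speed ensemble (m/s) and observation.
ensemble = [2.1, 2.4, 2.6, 2.8, 3.0, 3.1, 3.3, 3.5, 3.7, 3.9, 4.2]
print(crps_empirical(ensemble, 3.2))
```

For a single-member "ensemble" the score collapses to the absolute error, which is why the mean CRPS is directly comparable with the MAE of a point forecast.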
4 Results

The data analysis presented below was performed with the help of the ensembleBMA package of R (Fraley et al., 2009, 2011). As a first step the length of the appropriate training period was determined; then the performances of the BMA post-processed ensemble forecasts corresponding to models (3.1) and (3.2) were analyzed.

4.1 Training period

Following e.g. Raftery et al. (2005), to determine the length of the training period to be used we compare the MAE values of the BMA median forecasts, the RMSE values of the BMA mean forecasts, the CRPS values of the BMA predictive distributions, and the coverages and average widths of the 90% and 66.7% BMA prediction intervals for training periods of length 10, 11, ..., 60 calendar days. In order to ensure the comparability of the results we consider verification results from 02.12.2010 to 25.03.2011 (114 days).

Consider first the two-group model (3.1). In Figure 4 the average widths and coverages of the 66.7% and 90% BMA prediction intervals are plotted against the length of the training period. The average widths of the prediction intervals show an increasing trend, so shorter training periods yield sharper forecasts. The coverages of the 66.7% and 90% prediction intervals are likewise not monotone in the training period length.

Figure 6: Average widths and coverages of the 66.7% and 90% BMA prediction intervals corresponding to the three-group model (3.2) for various training period lengths.
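The two interval diagnostics compared in these figures are straightforward to compute; the sketch below uses hypothetical interval endpoints and observations (not the paper's data):

```python
def coverage_and_width(intervals, observations):
    """Coverage: share of observations falling inside their prediction
    interval; width: the average length of the intervals."""
    hits = sum(lo <= y <= hi for (lo, hi), y in zip(intervals, observations))
    widths = [hi - lo for lo, hi in intervals]
    return hits / len(observations), sum(widths) / len(widths)

# Hypothetical 66.7% prediction intervals and validating observations (m/s).
intervals = [(2.0, 4.5), (1.0, 3.0), (3.0, 6.0), (2.5, 5.0)]
observations = [3.2, 3.5, 4.0, 2.4]
cov, width = coverage_and_width(intervals, observations)
print(cov, width)  # a calibrated 66.7% interval would give cov near 0.667
```

Sharpness (small average width) is only valuable subject to calibration (coverage near the nominal level), which is why both quantities are tracked jointly.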
For short training periods the coverage of the 66.7% prediction interval oscillates around the correct 66.7%, but for training periods not shorter than 17 days it stays above this level. The coverage of the 90% prediction interval stabilizes above the correct 90% for training periods longer than 24 days. Hence, to have calibrated forecasts, one should choose a training period not shorter than 25 days.

Figure 5 shows the CRPS values of the BMA predictive distribution, the MAE values of the BMA median forecasts and the RMSE values of the BMA mean forecasts as functions of the training period length. CRPS and RMSE both take their minima at 28 days, the corresponding values being 0.7388 and 1.3675, respectively. MAE takes its minimum of 1.0472 at 30 days, while the second smallest value (1.0476) is obtained with a training period of length 28 days. This means that for model (3.1) a 28 day training period seems reasonable, and training periods longer than 30 days need not be taken into consideration.

Similar conclusions can be drawn from Figures 6 and 7 for the three-group model (3.2). In this case the 66.7% and 90% prediction intervals are slightly narrower than the corresponding intervals of model (3.1); their coverages stabilize above the correct 66.7% and 90% for training periods longer than 17 and 24 days, respectively. CRPS and MAE, plotted in Figure 7, both reach their minima of 0.7372 and 1.0452, respectively, at 30 days, while the values 0.7376 and 1.0456 corresponding to a training period of length 28 days are both the fourth smallest.

Figure 7: CRPS of the BMA predictive distribution, MAE values of the BMA median and RMSE values of the BMA mean forecasts corresponding to the three-group model (3.2) for various training period lengths.
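The selection rule applied here, namely picking the training length whose mean verification score is smallest, can be sketched as follows (the score values in the dictionary are invented for illustration, not read off the figures):

```python
def best_training_length(mean_scores):
    """Training-period length (days) minimising the mean score (e.g. CRPS)."""
    return min(mean_scores, key=mean_scores.get)

# Hypothetical mean CRPS per training length in days (illustrative values).
mean_crps = {20: 0.7431, 24: 0.7402, 27: 0.7381, 28: 0.7372, 30: 0.7375,
             40: 0.7398, 60: 0.7425}
print(best_training_length(mean_crps))  # -> 28
```

In practice, as above, several scores (CRPS, MAE, RMSE) and the interval diagnostics are weighed together rather than mechanically minimising a single criterion.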
RMSE takes its minimum of 1.3632 at 27 days and increases afterwards. The fourth smallest value (1.3644) again corresponds to 28 days, while the RMSE corresponding to 30 days is significantly larger (1.3664). Moreover, the 66.7% and 90% prediction intervals corresponding to 28 days are sharper than the corresponding prediction intervals calculated using a training period of length 30 days (2.5813 and 4.4340 vs. 2.5831 and 4.4378). Hence, we suggest the use of a training period of length 28 days for both BMA models.

4.2 Predictions using BMA post-processing

According to the results of the previous subsection, to test the performance of BMA post-processing on the 11-member ALADIN-HUNEPS ensemble we use a training period of 28 calendar days. In this way ensemble members, validating observations and BMA models are
