Weather models – Big Data for good forecasts

Data sets of weather models from different meteorological institutes are an important basis for forecasting the performance of wind power and PV systems. These weather models, also called NWP (numerical weather prediction) models, solve numerical equations to calculate the future state of the weather. This makes it possible, in some cases, to predict various meteorological parameters for the coming days and weeks for locations all over the world.



In doing so, the producers of these weather forecasts draw on a variety of data to determine a starting point for the weather model. This data pool consists partly of measurements from satellites, radar stations and ground-based facilities (weather stations); data from shipping, air traffic and weather balloons, among other sources, is also used.


In the next step, known as data assimilation, this data is used to calculate the initial state from which the weather model computes its forecast.
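The idea behind data assimilation can be illustrated with a deliberately tiny sketch: a single model value (the "background") is blended with a single observation, each weighted by how uncertain it is. This is a scalar Kalman-style update chosen for illustration only; operational assimilation systems work on millions of values at once and are far more elaborate. The function name and the numbers are assumptions, not taken from any weather service.

```python
def assimilate(background, background_var, observation, observation_var):
    """Blend a prior model value with one observation (scalar Kalman-style update).

    Illustrative only: all names and numbers here are hypothetical.
    """
    # The gain is close to 1 when the background is much less certain
    # than the observation, and close to 0 in the opposite case.
    gain = background_var / (background_var + observation_var)
    analysis = background + gain * (observation - background)
    # The blended value is more certain than the background alone.
    analysis_var = (1.0 - gain) * background_var
    return analysis, analysis_var


# Hypothetical example: the model expects 12.0 degrees (variance 4.0),
# a weather station measures 14.0 degrees (variance 1.0).
analysis, analysis_var = assimilate(12.0, 4.0, 14.0, 1.0)
print(round(analysis, 2), round(analysis_var, 2))
```

Because the observation is assumed to be more reliable than the background here, the resulting starting value lies closer to the measurement than to the earlier model value.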


The figure below shows the temperature for a time step based on the forecasts of one of our weather models, the GFS (Global Forecast System).


Multiple weather models


enercast obtains a variety of different weather models to cover different application scenarios. Global weather models are one important category, as they enable us to track worldwide developments. We also provide weather models that calculate weather forecasts with a higher degree of accuracy for a specific geographical area. There are also weather models for highly specialised applications such as maritime shipping.


An excerpt from the portfolio of the DWD (Deutscher Wetterdienst, German weather service) illustrates this principle:


The ICON global model calculates the development of weather over a week ahead and is updated several times a day. With COSMO-EU, the DWD has a model for the European region, which calculates the development over the next 3 days with a higher resolution. COSMO-DE provides weather forecasts with an even higher resolution for German-speaking countries. This model is recalculated more frequently over the course of a day than the other models. For economic reasons, its forecast range is limited to approximately one day.
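The trade-off described above (finer resolution, shorter forecast range) can be sketched as a simple lookup: for a given lead time, pick the finest model that still reaches that far. The sketch below is an assumption-laden illustration, not enercast's or the DWD's logic. The forecast ranges are approximations of the figures in the text, the resolution is expressed only as a rank (no real grid spacings are used), and the site is assumed to lie inside every model's region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    name: str
    region: str
    horizon_hours: int    # approximate forecast range, taken from the text
    resolution_rank: int  # 1 = coarsest; ranks only, not real grid spacings


# Hypothetical portfolio table built from the description above.
PORTFOLIO = [
    ModelInfo("ICON", "global", 7 * 24, 1),
    ModelInfo("COSMO-EU", "Europe", 3 * 24, 2),
    ModelInfo("COSMO-DE", "German-speaking countries", 24, 3),
]


def pick_model(lead_time_hours, portfolio=PORTFOLIO):
    """Return the finest-resolution model whose range covers the lead time."""
    candidates = [m for m in portfolio if m.horizon_hours >= lead_time_hours]
    if not candidates:
        raise ValueError(f"no model covers a lead time of {lead_time_hours} h")
    return max(candidates, key=lambda m: m.resolution_rank)


for hours in (12, 48, 120):
    print(hours, "h ->", pick_model(hours).name)
```

A forecast for the next few hours would use the regional high-resolution model, while a forecast several days ahead falls back to the global model.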


Large data sets


The weather models are typically calculated on large networks of computers known as mainframes or supercomputers. These are warehouse-sized computer systems with very high processing power.


To give an idea of the size of the data sets, one of our weather forecast suppliers, the ECMWF (European Centre for Medium-Range Weather Forecasts), has supplied the following facts and figures:


The supercomputers have tens of thousands of processing cores and can process the huge quantities of data gathered by the weather models. The archived data is increasing by 130 TB (terabytes) per day, and the total data available stands at around 125 PB (petabytes).
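To put these two figures in relation to each other, a short back-of-the-envelope calculation helps. It assumes decimal units (1 PB = 1,000 TB) and a constant growth rate, neither of which is stated in the source.

```python
TB_PER_PB = 1000          # assumption: decimal units
daily_growth_tb = 130     # figure quoted above
archive_pb = 125          # figure quoted above

# Growth per year if the daily rate stayed constant (an assumption).
yearly_growth_pb = daily_growth_tb * 365 / TB_PER_PB

# How many days of growth at today's rate equal the whole archive.
days_equivalent = archive_pb * TB_PER_PB / daily_growth_tb

print(f"growth per year: {yearly_growth_pb:.2f} PB")
print(f"archive equals about {days_equivalent:.0f} days at the current rate")
```

Under these assumptions, the archive grows by roughly 47 PB per year, and the entire archive corresponds to well under three years of growth at the current daily rate.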


The weather parameters that are calculated by a weather model range from standard values, such as temperature, air pressure and wind speed, through to parameters such as exposure to radiation, air pollution or pollen count.


Efficient data delivery is key


enercast obtains various types of meteorological data to meet the diverse requirements of our customers. This data includes specific weather parameters of different weather models, but other data sources, such as satellite data or input time series transmitted in real time, are also incorporated in our algorithms.


We also need to access several years of long-term meteorological records for our business processes. Not only do we need to store these, but they also have to be available for quick access.


For example, these data sets must be organised in a certain way before analyses can be performed on them. To do this, enercast uses standard data storage formats from the area of big data.
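Why the organisation of the data matters can be shown with a small in-memory sketch. Values are grouped by weather model and parameter and kept sorted by time, so that a time range can be found by binary search rather than by scanning everything. This is not the storage format enercast uses (the source does not name one); the class, its methods and the sample values are hypothetical.

```python
import math
from bisect import bisect_left, bisect_right, insort
from collections import defaultdict
from datetime import datetime


class ForecastStore:
    """Toy store: one time-sorted series per (model, parameter) pair."""

    def __init__(self):
        # (model, parameter) -> sorted list of (valid_time, value)
        self._series = defaultdict(list)

    def add(self, model, parameter, valid_time, value):
        # insort keeps the list ordered by valid_time on every insert.
        insort(self._series[(model, parameter)], (valid_time, value))

    def between(self, model, parameter, start, end):
        """Return all (valid_time, value) pairs with start <= time <= end."""
        series = self._series.get((model, parameter), [])
        lo = bisect_left(series, (start,))            # first entry at or after start
        hi = bisect_right(series, (end, math.inf))    # just past last entry at end
        return series[lo:hi]


store = ForecastStore()
store.add("GFS", "temperature", datetime(2020, 1, 1, 12), 3.5)
store.add("GFS", "temperature", datetime(2020, 1, 1, 0), 1.0)
store.add("GFS", "temperature", datetime(2020, 1, 2, 0), 2.0)
selected = store.between("GFS", "temperature",
                         datetime(2020, 1, 1, 0), datetime(2020, 1, 1, 12))
print(selected)
```

The same principle, grouping by a few keys and ordering by time, is what makes multi-year records quick to query in larger storage systems.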


Progress through intelligent technologies


enercast currently uses advanced and complex procedures to create a power forecast using meteorological data.


The ANN (artificial neural network) is one of the procedures used by enercast to create a power forecast for plants. This procedure is modelled on the biological brain. The neural networks learn the connection between the performance data and the weather forecasts for a particular plant. This process is also called “training”. After the learning process, these trained neural networks can be used for future predictions because they “remember” what they learned. The ANN acquires a lot of information about the plant during the training. This includes, for example, the distinct characteristics of each plant and influences resulting from the environment of the site.


The more data that is available to the ANN during the training, the more information and different scenarios it can learn.
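What "training" means can be sketched with a very small neural network written in plain Python. It learns the relationship between a normalised wind speed and a normalised power output from a handful of sample pairs. Everything here is an assumption made for illustration: the network size, the learning rate and the synthetic sample data have nothing to do with the networks enercast actually uses.

```python
import math
import random


def train_tiny_ann(samples, hidden=4, epochs=2000, lr=0.1, seed=0):
    """Train a one-hidden-layer network by stochastic gradient descent.

    Returns (predict, initial_loss, final_loss). Illustrative sketch only.
    """
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden weights
    b1 = [0.0] * hidden                               # hidden biases
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0                                          # output bias

    def predict(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        return sum(w2[j] * h[j] for j in range(hidden)) + b2

    def loss():
        return sum((predict(x) - y) ** 2 for x, y in samples) / len(samples)

    initial_loss = loss()
    for _ in range(epochs):
        for x, y in samples:
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            err = sum(w2[j] * h[j] for j in range(hidden)) + b2 - y
            for j in range(hidden):
                # Backpropagate the error through the tanh activation.
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_h * x
                b1[j] -= lr * grad_h
            b2 -= lr * err
    return predict, initial_loss, loss()


# Synthetic (wind speed, power) pairs, both scaled to the range 0..1.
SAMPLES = [(0.0, 0.0), (0.2, 0.02), (0.4, 0.2), (0.6, 0.6), (0.8, 0.95), (1.0, 1.0)]

predict, initial_loss, final_loss = train_tiny_ann(SAMPLES)
print(f"loss before training: {initial_loss:.4f}, after: {final_loss:.4f}")
print(f"predicted power at wind speed 0.5: {predict(0.5):.2f}")
```

After training, the returned `predict` function can be called with wind speeds that were not part of the sample data, which corresponds to using the trained network for future predictions.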


Physical models are another method used by enercast to create power predictions. These models simulate the physical behaviour of the various plant types and calculate the power forecasts from the weather model data.
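As an illustration of the physical approach, the following sketch converts a forecast wind speed into the power of a wind turbine using the textbook relation P = 0.5 · ρ · A · Cp · v³, limited by cut-in speed, cut-out speed and rated power. It is a simplified example, not enercast's model; the turbine parameters are hypothetical defaults chosen for readability.

```python
import math


def wind_power_kw(wind_speed, rotor_diameter_m=100.0, rated_kw=3000.0,
                  cut_in=3.0, cut_out=25.0, power_coefficient=0.4,
                  air_density=1.225):
    """Estimate turbine output in kW from wind speed in m/s.

    All default parameters are hypothetical and describe no real turbine.
    """
    # Below cut-in the rotor does not turn; at or above cut-out it shuts down.
    if wind_speed < cut_in or wind_speed >= cut_out:
        return 0.0
    swept_area = math.pi * (rotor_diameter_m / 2.0) ** 2
    power_w = 0.5 * air_density * swept_area * power_coefficient * wind_speed ** 3
    # The generator cannot deliver more than its rated power.
    return min(power_w / 1000.0, rated_kw)


for speed in (2.0, 8.0, 15.0, 30.0):
    print(f"{speed:5.1f} m/s -> {wind_power_kw(speed):7.1f} kW")
```

Unlike a trained network, a model of this kind needs no historical performance data, but it depends on knowing the technical parameters of the plant.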


Ensembling promises reliable power forecasts


Each weather model has strengths and weaknesses, which means weather forecasts vary depending on the weather model used. This behaviour is partly reflected in the power forecasts produced. The so-called “ensemble method” is used to counteract this effect and improve the predictive quality. In this procedure, each weather model is assessed individually for each plant, so that the strengths of individual weather models are emphasised and their weaknesses minimised.
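One simple way to turn such a per-plant assessment into a combined forecast is to weight each model by the inverse of its historical error at that plant. The sketch below shows this one possible scheme; it is not necessarily the method enercast uses, and the model names, error values and forecasts are made up.

```python
def inverse_error_weights(errors_by_model):
    """Turn per-model historical errors for one plant into normalised weights."""
    if any(error <= 0 for error in errors_by_model.values()):
        raise ValueError("errors must be positive")
    inverse = {model: 1.0 / error for model, error in errors_by_model.items()}
    total = sum(inverse.values())
    return {model: value / total for model, value in inverse.items()}


def combine(forecasts_by_model, weights):
    """Weighted average of the power forecasts based on each weather model."""
    return sum(weights[model] * forecasts_by_model[model]
               for model in forecasts_by_model)


# Hypothetical assessment for one plant: model_a has been twice as
# accurate as model_b in the past, so it receives twice the weight.
weights = inverse_error_weights({"model_a": 0.1, "model_b": 0.2})
ensemble_kw = combine({"model_a": 900.0, "model_b": 1200.0}, weights)
print(weights, round(ensemble_kw, 1))
```

Because the weights are derived separately for each plant, a model that performs poorly at one site can still carry a high weight at another.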