Conference Paper
Pretreatment of Environmental Data for Forecasting Purposes
Full-text URI
Document type
Text/Conference Paper
Additional information
Date
2008
Authors
Journal title
Journal ISSN
Volume title
Publisher
Shaker Verlag
Abstract
To assess present actions on the environment, it is necessary to estimate their future impact. Niels Bohr observed that "prediction is very difficult, especially about the future". Fortunately, "the future is made of the same stuff as the present" (Simone Weil). This is what makes forecasting fundamentally possible. The present is described with data; to draw the right conclusions about the future, these data need to be significant, correct and complete.

This work is part of a project on the active control of groundwater levels. In this project, expected groundwater levels are predicted for varying infiltration masses using Artificial Neural Networks (ANN). In this way, an adequate infiltration quantity can be identified in order to reach the desired groundwater level. Before the environmental data are suitable for the actual forecasting purpose, they need to undergo a wide range of pretreatments. These efforts are described in this paper.

In a first step, substitution methods are presented to impute missing data. These methods can basically be divided into two branches. One category comprises correlation and kriging methods that use related measuring data sets, i.e. data sets of a nearby measuring station for, e.g., groundwater level, temperature or rainfall. The other category, which uses only the data set under consideration, consists of purely statistical methods, namely spline interpolation, time-series forecasts and multiple imputation.

In a second step, the completed data sets need to be freed from gross errors. For this purpose, different test criteria such as bound checking, comparisons of spacings and various statistical methods are implemented. Furthermore, the original dynamic and time-variant data sets are compared with computed data sets generated by time-series analysis models. Outliers are indicated where computed values diverge strongly from the original values.
In doubtful situations the current curve can additionally be compared with the curve of a correlated data set, if available.

In a third step, in terms of complexity reduction, the number of relevant data serving as input parameters for the ANN needs to be reduced without losing the information necessary to make predictions. This is important because, in the present case, the number of necessary input parameters is too high compared to the number of training sets available to train the ANN. Different statistical approaches are discussed, such as moving averages, time-weighted transformations and a method that combines sets of moving averages to reduce the number of ANN input parameters while preserving the information content.
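The spline-based substitution named in the first step can be illustrated with a minimal sketch: gaps in a single measured series are filled by a cubic spline fitted through the available observations. The function name, the use of SciPy's `CubicSpline`, and the sample data are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch of missing-data imputation by spline interpolation,
# one of the purely statistical substitution methods mentioned above.
import numpy as np
from scipy.interpolate import CubicSpline

def impute_spline(t, y):
    """Fill NaN gaps in y (measured at times t) with a cubic spline
    fitted through the available observations."""
    y = np.asarray(y, dtype=float)
    known = ~np.isnan(y)                      # mask of valid measurements
    spline = CubicSpline(t[known], y[known])  # fit only on observed points
    filled = y.copy()
    filled[~known] = spline(t[~known])        # evaluate spline at the gaps
    return filled

# hypothetical groundwater levels [m] with three missing daily readings
t = np.arange(10.0)
y = np.array([5.0, 5.1, np.nan, 5.3, 5.4, np.nan, np.nan, 5.6, 5.5, 5.4])
print(impute_spline(t, y))
```

The correlation- and kriging-based branch would instead regress the gaps on a neighbouring station's series; the purely statistical sketch above needs no second data set.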
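The gross-error checks of the second step can likewise be sketched: plausibility bounds flag physically impossible values, and a computed reference series (here a simple centred moving average standing in for the time-series analysis models of the paper) flags values that diverge strongly from it. Window length, threshold factor and data are illustrative assumptions.

```python
# Illustrative sketch of step two: bound checking plus comparison of the
# original series with a computed reference series.
import numpy as np

def flag_outliers(y, lower, upper, window=5, k=3.0):
    """Return a boolean mask of suspected gross errors in y."""
    y = np.asarray(y, dtype=float)
    out_of_bounds = (y < lower) | (y > upper)       # plausibility bounds
    kernel = np.ones(window) / window
    ref = np.convolve(y, kernel, mode="same")       # computed reference
    # (note: zero-padding biases the reference near the series edges)
    resid = y - ref
    strong_divergence = np.abs(resid) > k * np.std(resid)
    return out_of_bounds | strong_divergence

# hypothetical groundwater levels [m] with one gross error at index 3
y = np.array([5.0, 5.1, 5.2, 9.9, 5.2, 5.1, 5.0, 5.1, 5.2, 5.1])
print(flag_outliers(y, lower=0.0, upper=8.0))
```

In practice the threshold `k` and the reference model would be tuned per measurand; doubtful flags would then be cross-checked against a correlated station's curve as described above.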
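The moving-average idea of the third step can be sketched as condensing a long input history into a few trailing means over windows of growing length, so the ANN receives a handful of inputs instead of hundreds of raw values. The window lengths and the synthetic rainfall series are illustrative assumptions, not the combination method of the paper.

```python
# Illustrative sketch of step three: complexity reduction of ANN inputs
# via sets of moving averages over a raw daily series.
import numpy as np

def condense_inputs(series, windows=(1, 7, 30, 90)):
    """Replace the raw history by one trailing mean per window length."""
    series = np.asarray(series, dtype=float)
    return np.array([series[-w:].mean() for w in windows])

# one year of synthetic daily rainfall as a stand-in input series
rain = np.random.default_rng(0).gamma(2.0, 2.0, size=365)
features = condense_inputs(rain)
print(features)  # 4 ANN inputs instead of 365 raw values
```

Longer windows summarise the slow groundwater response, short windows the recent forcing; a time-weighted transformation would additionally discount older values within each window.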