Comparison of the Results of Multilayer Perceptron Neural Networks and Multiple Linear Regressions for Prediction of Ozone Concentration in Tabriz City



Because airborne pollutants in urban areas harm human health, forecasting air quality parameters is one of the most important topics in air quality research. Many studies have sought to determine the factors that control air pollutant concentrations in order to develop tools for forecasting them. One approach to predicting future concentrations is to use a detailed atmospheric diffusion model. Such models aim to resolve the underlying physical and chemical equations controlling pollutant concentrations and therefore require detailed emissions data and meteorological fields. The second approach is to devise statistical models that attempt to determine the underlying relationship between a set of input data (predictors) and targets (predictands). Regression modeling is an example of such a statistical approach and has been applied to air quality modeling. Artificial neural networks (ANNs) can model non-linear systems and have been used with some success to model air pollution concentrations. Tabriz is the most industrialized and populous city in northwestern Iran and the second most polluted city in the country. Industrial centers lie to the west and southwest of the city, and in winter the winds blowing from those directions carry pollution into inner Tabriz.

Materials and methods
Based on data from the Department of Environment of East Azarbaijan province, 60 percent of the air pollution concentration is attributed to the industrial centers located to the west and southwest of the city. ANN models are computer programs designed to emulate human information-processing capabilities such as knowledge processing, speech, prediction, classification, pattern recognition, and control. The ability of ANN systems to learn spontaneously from examples, to “reason” over inexact and fuzzy data, and to provide adequate and rapid responses to new information not previously stored in memory has led to growing acceptance of this technology in various engineering fields, where it has been applied with considerable success. The major building block of any ANN architecture is the processing element, or neuron. Neurons are located in one of three types of layers: the input layer, the hidden layer, and the output layer. First, the input neurons receive data from the outside environment. Then the hidden neurons receive signals from all of the neurons in the preceding layer. Finally, the output neurons send information back to the external environment. In this paper, an ANN of the multilayer perceptron (MLP) type and multiple linear regression (MLR) were applied to the short-term prediction of ozone in the Tabriz metropolis. An MLP is capable of modeling highly non-linear relationships and can be trained to generalize accurately when presented with new, unseen data. It learns to model a relationship during a supervised training procedure, in which it is repeatedly presented with a series of input and associated output data. Training requires a set of training data consisting of a series of input vectors and associated output vectors. During training, the MLP is repeatedly presented with the training data, and the weights in the network are adjusted until the desired input-output mapping is achieved.
Training an MLP is a supervised procedure. During training, the output of the MLP for a given input vector may not equal the desired output. An error signal is defined as the difference between the desired and actual outputs. Training uses the magnitude of this error signal to determine how much the weights in the network should be adjusted so that the overall error of the MLP is reduced.
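
The training procedure described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the tanh hidden activation, the learning rate, the five hidden neurons, the plain gradient-descent update, and the synthetic sine-shaped data are all assumptions chosen only to show how an error signal (desired output minus actual output) drives the weight adjustments.

```python
import math
import random

def init_mlp(n_in, n_hidden, seed=0):
    """Small random weights for an n_in -> n_hidden -> 1 network (assumed sizes)."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }

def forward(net, x):
    """Input layer -> hidden layer (tanh) -> single linear output neuron."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    output = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
    return hidden, output

def train_step(net, x, target, lr):
    """Adjust every weight in proportion to the error signal for one example."""
    hidden, output = forward(net, x)
    error = target - output  # desired output minus actual output
    for j, h in enumerate(hidden):
        # Error propagated back to hidden neuron j; derivative of tanh is 1 - h^2.
        delta_h = error * net["w2"][j] * (1.0 - h * h)
        net["w2"][j] += lr * error * h
        for i, xi in enumerate(x):
            net["w1"][j][i] += lr * delta_h * xi
        net["b1"][j] += lr * delta_h
    net["b2"] += lr * error
    return error

def mean_squared_error(net, data):
    return sum((t - forward(net, x)[1]) ** 2 for x, t in data) / len(data)

def train(net, data, epochs=200, lr=0.05):
    """Present the training data repeatedly, updating weights after each example."""
    for _ in range(epochs):
        for x, target in data:
            train_step(net, x, target, lr)
    return net

# Synthetic non-linear toy data (not ozone measurements).
data = [([x / 10.0], math.sin(x / 10.0)) for x in range(-20, 21)]
net = init_mlp(n_in=1, n_hidden=5)
mse_before = mean_squared_error(net, data)
train(net, data)
mse_after = mean_squared_error(net, data)
print(f"MSE before training: {mse_before:.4f}, after: {mse_after:.4f}")
```

Comparing the error before and after training shows the overall error falling as the weights are adjusted; in practice a library implementation would normally be preferred to hand-written updates.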

Results and discussion
The objective of this work was to develop a model that could make accurate short-term (hourly) predictions. Because the relationship between O3 and meteorology is complex and extremely non-linear, ANNs were used to model and predict hourly O3 concentrations from readily observable local meteorological data. The architecture of such a network is established as follows: the numbers of neurons in the input and output layers are determined by the dimensions of the input and output vectors, respectively, while the number of hidden layers and/or the number of neurons in each hidden layer depends on the kind of system being modeled and should be optimized. The design of the network architecture is based on Kolmogorov's approximation theory. The results show that the ANN is the more suitable model for predicting ozone concentration: the R2 values of the ANN and MLR models are 94% and 51%, respectively.
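
For readers unfamiliar with the MLR baseline and the R2 statistic quoted above, the following is a minimal sketch of an ordinary least-squares fit and the usual coefficient of determination. The abstract does not state how the MLR model was fitted or how R2 was computed, so the normal-equations solver, the function names, and the two made-up predictor columns are assumptions for illustration only.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def fit_mlr(rows, targets):
    """Least-squares coefficients [intercept, b1, ..., bk] via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict_mlr(coeffs, row):
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))

def r_squared(targets, predictions):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_t = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    ss_tot = sum((t - mean_t) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot

# Made-up predictors and an exactly linear target, purely to exercise the code.
rows = [(float(i), float((i * i) % 7)) for i in range(12)]
targets = [2.0 + 3.0 * x1 - 1.0 * x2 for x1, x2 in rows]
coeffs = fit_mlr(rows, targets)
r2 = r_squared(targets, [predict_mlr(coeffs, r) for r in rows])
print("coefficients:", [round(c, 6) for c in coeffs], "R2:", round(r2, 6))
```

On a target that is an exact linear function of the predictors the fit is perfect. A strongly non-linear dependence on the predictors is a case where a linear model of this kind would be expected to score a lower R2 than a non-linear one.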

Fluctuations in the hourly O3 concentrations in Tabriz during October 2003 were studied. The ANN was found to be a useful tool for the short-term prediction of O3 concentrations. The optimum structure of the ANN was determined by finding the minimum TRMS for the test set; the structure with 35 neurons in the hidden layer performed best. It has also been demonstrated that MLP neural networks offer several advantages over traditional MLR models. This work has shown that MLP neural networks can accurately model the relationship between local meteorological data and O3 concentrations in an urban environment.
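
The selection of the hidden-layer size by minimum test-set error can be sketched as follows. This is a hedged illustration, not the procedure used in the study: the abstract does not define TRMS, so the test-set root-mean-square error is used here as a stand-in, and the compact one-input MLP, the candidate sizes, the training settings, and the synthetic data are all assumptions.

```python
import math
import random

def train_mlp(data, n_hidden, epochs=150, lr=0.05, seed=0):
    """Train a 1-input, 1-output tanh MLP by gradient descent; return a predictor."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, target in data:
            hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
            error = target - (sum(w * h for w, h in zip(w2, hidden)) + b2)
            for j, h in enumerate(hidden):
                delta_h = error * w2[j] * (1.0 - h * h)
                w2[j] += lr * error * h
                w1[j] += lr * delta_h * x
                b1[j] += lr * delta_h
            b2 += lr * error

    def predict(x):
        hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
        return sum(w * h for w, h in zip(w2, hidden)) + b2

    return predict

def rms_error(predict, data):
    """Root-mean-square error of a predictor on a data set."""
    return math.sqrt(sum((t - predict(x)) ** 2 for x, t in data) / len(data))

def select_hidden_size(candidates, train_set, test_set):
    """Train one network per candidate size; keep the size with the lowest test RMS."""
    scores = {n: rms_error(train_mlp(train_set, n), test_set) for n in candidates}
    best = min(scores, key=scores.get)
    return best, scores

# Synthetic toy data split into alternating training and test samples.
samples = [(x / 10.0, math.sin(x / 10.0)) for x in range(-20, 21)]
train_set = samples[0::2]
test_set = samples[1::2]
best_size, scores = select_hidden_size([1, 3, 5, 8], train_set, test_set)
print("test RMS by hidden size:", {n: round(s, 4) for n, s in scores.items()})
print("selected hidden size:", best_size)
```

The candidate list here is kept short so the sketch runs quickly; a search over a wider range of sizes would follow the same pattern.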