University of Tehran<br />Physical Geography Research<br />ISSN 2008-630X<br />Volume 53, Issue 3, 2021-10-23<br />Satellite aerosol optical depth prediction using data mining of climate parameters<br />Pages 319-333<br />Article 82560<br />DOI: 10.22059/jphgr.2021.318600.1007591<br />FA<br />Masoud Soleimani, PhD student, Department of Remote Sensing and GIS, Faculty of Geography, University of Tehran, Tehran, Iran<br />Meysam Argany, Assistant Professor, Department of Remote Sensing and GIS, Faculty of Geography, University of Tehran, Tehran, Iran, ORCID 0000-0001-6577-4443<br />Ramin Papi, PhD student, Department of Remote Sensing and GIS, Faculty of Geography, University of Tehran, Tehran, Iran<br />Fatemeh Amiri, MSc student, Department of Remote Sensing and GIS, Faculty of Humanities, Tarbiat Modares University, Tehran, Iran<br />Journal Article<br />2021-02-06<br /><strong>Introduction </strong><br />Tropospheric aerosol particles play an important role in the Earth's radiative energy balance, both directly by scattering and absorbing solar radiation and indirectly by modulating the microphysical and radiative properties of clouds. Aerosol optical depth (AOD) derived from satellite remote sensing data is a quantitative estimate of the amount of aerosol in the atmosphere and can be used as an indicator of aerosol particle concentration. Previous studies indicate that remote sensing aerosol products are highly important for modeling the spatial-temporal patterns of dust storms and, in particular, for identifying dust sources. One advantage of using satellite AOD to identify dust events is that it provides satisfactory results in arid areas with relatively little cloud cover. However, the presence of clouds severely limits both ground-based and satellite AOD measurements. Thus, AOD datasets sometimes have gaps due to factors such as cloudiness. 
Since aerosols cannot be reliably monitored and measured under cloudy conditions, proxy datasets that fill these gaps are valuable. In this regard, several studies based on satellite data analysis have emphasized the association between climatic parameters and dust events (specifically AOD) in different regions. Given this relationship, climatic parameters can be used as a proxy dataset to estimate AOD values for areas without data or with cloud cover. Moreover, AOD values can be predicted from the predicted values of climatic parameters. To achieve reliable AOD predictions, a generalizable approach is needed that can model the complex relationships between large datasets. For this purpose, an efficient data mining algorithm called M5P was adopted to analyze and extract the relationships between climatic parameters and AOD and to obtain predictive models. The M5P algorithm combines tree and regression models and offers high prediction accuracy and easily interpretable results.<br /> <br /><strong>Materials and methods</strong><br />In this study, the M5P data mining algorithm, which is based on a tree structure and multivariate linear regression analysis, was used to derive AOD predictive models from climatic parameters. Accordingly, a spatial database of remote sensing time series was generated for four climatic parameters (the independent variables), namely surface air temperature (SAT), precipitation (P), surface relative humidity (SRH), and wind speed (WS), together with AOD (the dependent variable). WEKA[1] was used to implement the M5P model. After the relationships between the independent and dependent variables were analyzed through the tree model structure and multivariate linear regression, AOD predictive rules were extracted. 
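The way a model tree of this kind turns "if-then" rules and leaf regressions into an AOD estimate can be illustrated with a minimal Python sketch. Everything numeric here is an assumption made for illustration: the split thresholds, the coefficients, the units, and the names `LeafModel`, `RULES`, and `predict_aod` are hypothetical placeholders, not the models fitted in this study and not WEKA's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# One observation of the four climatic parameters.
# Assumed units (not stated in the abstract): SAT in deg C, P in mm, SRH in %, WS in m/s.
Sample = Dict[str, float]


@dataclass
class LeafModel:
    """One leaf of a model tree: an if-then condition plus a linear model."""
    name: str
    condition: Callable[[Sample], bool]
    intercept: float
    coefficients: Dict[str, float]

    def predict(self, x: Sample) -> float:
        # AOD = intercept + sum(coefficient * climatic parameter)
        return self.intercept + sum(c * x[k] for k, c in self.coefficients.items())


# PLACEHOLDER thresholds and coefficients, invented for this sketch only.
RULES: List[LeafModel] = [
    LeafModel("LM1", lambda x: x["WS"] <= 3.0 and x["SRH"] > 40.0,
              0.20, {"SAT": 0.004, "P": -0.010, "SRH": -0.001, "WS": 0.020}),
    LeafModel("LM2", lambda x: x["WS"] <= 3.0 and x["SRH"] <= 40.0,
              0.25, {"SAT": 0.005, "P": -0.010, "SRH": -0.002, "WS": 0.020}),
    LeafModel("LM3", lambda x: x["WS"] > 3.0 and x["P"] > 1.0,
              0.15, {"SAT": 0.003, "P": -0.020, "SRH": -0.001, "WS": 0.025}),
    LeafModel("LM4", lambda x: x["WS"] > 3.0 and x["P"] <= 1.0,
              0.30, {"SAT": 0.005, "P": -0.020, "SRH": -0.002, "WS": 0.030}),
]


def predict_aod(x: Sample) -> Tuple[str, float]:
    """Select the leaf whose condition matches x and apply its linear model."""
    for rule in RULES:
        if rule.condition(x):
            return rule.name, rule.predict(x)
    raise ValueError("no rule matched; check the input for missing or NaN values")


if __name__ == "__main__":
    hot_dry_windy = {"SAT": 35.0, "P": 0.0, "SRH": 20.0, "WS": 5.0}
    print(predict_aod(hot_dry_windy))
```

The four conditions partition the input space, so each valid observation falls into exactly one leaf; that is what keeps the result easy to interpret, since every prediction traces back to one rule and one linear equation.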
Statistical indicators, namely the Pearson correlation coefficient, mean absolute error (MAE), and root mean square error (RMSE), were used to validate the linear predictive models. <br /><strong>Results and discussion</strong><br />After the time series of climatic parameters and AOD were pre-processed as the training dataset, the independent and dependent input variables of M5P were defined. The M5P algorithm was implemented in WEKA in three steps: homogenization of the independent input datasets by building decision trees from a series of "if-then" rules, multivariate linear regression analysis within the homogeneous classes, and validation of the model results. In total, four linear models (LM), or predictive rules, were extracted for estimating AOD from the values of the climatic parameters. The AOD value can then be estimated by selecting the linear model that matches the thresholds defined by the M5P algorithm and substituting the values of the climatic parameters into it. The resulting linear models can thus predict AOD values under different climatic conditions. The results of the M5P algorithm were validated through correlation analysis between the input variables and through evaluation of prediction errors using the MAE and RMSE statistics, which showed acceptable performance and accuracy of the linear models in AOD prediction. Because aerosol particles (especially dust) are dynamic and can be transported by wind very far from their emission source, the AOD measured by a satellite sensor for a pixel likely does not originate entirely from that same location on Earth. Therefore, part of the prediction error of the models may be due to the transport of aerosol particles. This may explain possible discrepancies, particularly in light of the strong correlation between AOD and climatic parameters. 
After all, a dust storm arising at a source may bear no relation to the values of the climatic parameters at the destination.<br /><strong>Conclusion</strong><br />Aerosol optical depth (AOD), as an indicator of the state of atmospheric aerosol, is of great importance for dust storm studies. Access to AOD data is restricted in some parts of the world and in some seasons by limitations such as cloud cover. At the same time, awareness of future spatial-temporal patterns of dust storms is important for adopting crisis management measures. <br />This study evaluated the capability of the M5P data mining algorithm to predict AOD from climatic parameters. Four linear predictive models were extracted based on inductive learning and a set of "if-then" rules. The models were extracted and validated using a remote sensing time series dataset for Ahvaz, Iran. With the linear predictive models obtained in this study, AOD can be estimated acceptably in areas where access to AOD data is restricted. Furthermore, future spatial-temporal patterns of AOD can be estimated from predicted values of climatic parameters.<br />Dust storms generally occur as a function of a wide range of environmental conditions, including atmospheric properties and surface parameters such as vegetation, soil moisture, and soil texture. Against this background, considering only atmospheric conditions and their impact on the spatial-temporal patterns of AOD may fail to produce the desired results. Therefore, future AOD modeling studies are recommended to use ground surface parameters in addition to climatic parameters, which mostly indicate atmospheric conditions. 
This can increase the accuracy of linear models for predicting AOD.<br /> <br />[1] Waikato Environment for Knowledge Analysis<br />https://jphgr.ut.ac.ir/article_82560_8472539da61aeb4de0756d5cd5ffdaa3.pdf
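The three validation statistics named in the abstract (Pearson correlation coefficient, MAE, and RMSE) follow their standard definitions and can be sketched in plain Python as below. The function names and the sample values are assumptions chosen for illustration; they are not the study's data or results.

```python
import math
from typing import Sequence


def mae(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error: average magnitude of the prediction errors."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error: penalizes large errors more strongly than MAE."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up AOD values, used only to exercise the functions.
    observed_aod = [0.2, 0.4, 0.6, 0.8]
    predicted_aod = [0.3, 0.4, 0.5, 0.9]
    print("MAE :", mae(observed_aod, predicted_aod))
    print("RMSE:", rmse(observed_aod, predicted_aod))
    print("r   :", pearson_r(observed_aod, predicted_aod))
```

MAE and RMSE are in the same units as AOD, and RMSE is never smaller than MAE; a large gap between the two points to a few large errors, which would be consistent with occasional transported-dust events of the kind discussed in the results.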