Performance assessment of Kernel-Based methods in estimation of suspended sediment loads (Case study: Maragheh SofiChay River)

Document Type : Full length article

Authors

1 Young Researchers and Elite Club, Maragheh Branch, Islamic Azad University, Maragheh, Iran

2 Faculty Member, Department of Water Engineering, Agriculture Faculty, University of Tabriz, Iran

Abstract

Introduction
Because sediment transport matters for the efficient use of water resources and for dam design, estimating the sediment load in rivers has long been of essential interest to engineers. This has led to a variety of methods, including a range of empirical equations for sediment transport. Errors are common in most traditional empirical methods because of the complexity of the process and the large number of factors that influence it. A method that can accurately estimate the sediment load is therefore essential. In this study, the suspended sediment load of the Sofi Chay River was estimated with modern data mining methods, namely Gaussian process regression and support vector machines, which use kernel functions and are well suited to nonlinear problems. The results were compared with empirical methods such as the sediment rating curve and the seasonal method.
 
Materials and Methods
The study area
The Sofi Chay catchment has an area of 311 km² and is located in the southern part of East Azerbaijan province, north of the city of Maragheh. The Sofi Chay River lies between 37°15′2″ and 37°45′3″ north latitude and between 45°56′31″ and 46°25′5″ east longitude.
Gaussian process regression
Gaussian processes are a fruitful way of defining prior distributions for flexible regression and classification models in which the regression or class probability functions are not limited to simple parametric forms. One attraction of Gaussian processes is the variety of covariance functions to choose from, which lead to functions with different degrees of smoothness or different kinds of additive structure. When such a function defines the mean response in a regression model with Gaussian errors, inference can be carried out with matrix calculations, which are feasible for data sets with more than a thousand samples. Gaussian processes are important in statistical modeling because of the properties they inherit from the normal distribution. Gaussian processes and related methods have been used in various contexts for many years. Despite this past usage, and despite the fundamental simplicity of the idea, Gaussian process models appear to have been little appreciated by most Bayesians. Neal (1997) speculates that this could be partly due to confusion between the properties one expects of the true function being modeled and those of the best predictor for this unknown function.
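The matrix calculations mentioned above can be illustrated with a minimal sketch. It assumes scalar inputs, a zero prior mean, a radial basis function kernel of the form exp(-gamma * (a - b)^2), and a noise value added to the diagonal of the covariance matrix. The defaults gamma = 0.5 and noise = 0.01 mirror the values reported in the Results; the function names and the toy data are hypothetical, and this is not the authors' implementation.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """Radial basis function kernel for scalar inputs (assumed form)."""
    return math.exp(-gamma * (a - b) ** 2)


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(matrix)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                low[i][j] = (matrix[i][j] - s) / low[j][j]
    return low


def solve_cholesky(low, rhs):
    """Solve (low @ low.T) x = rhs by forward then backward substitution."""
    n = len(rhs)
    z = [0.0] * n
    for i in range(n):
        z[i] = (rhs[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_predict(x_train, y_train, x_new, gamma=0.5, noise=0.01):
    """Posterior mean and variance of the latent function at x_new."""
    n = len(x_train)
    # Covariance of the training targets: kernel matrix plus noise on the diagonal.
    cov = [[rbf_kernel(x_train[i], x_train[j], gamma) + (noise if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    low = cholesky(cov)
    alpha = solve_cholesky(low, y_train)
    k_star = [rbf_kernel(x_new, xi, gamma) for xi in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_cholesky(low, k_star)
    variance = rbf_kernel(x_new, x_new, gamma) - sum(k * w for k, w in zip(k_star, v))
    return mean, variance


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [0.0, 0.84, 0.91, 0.14]
    print(gp_predict(xs, ys, 1.5))
```

The Cholesky factorization costs on the order of n³ operations for n training samples, which is why the feasible size of the data set is a practical concern for this method.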
 
Support Vector Regression (SVR)
SVR is a form of support vector machine (SVM), a learning system that uses a hypothesis space of linear functions in a high-dimensional feature space. These systems are trained with a learning algorithm based on optimization theory. The method was introduced by Vapnik in 1995. When SVMs are applied to regression estimation, that is, to estimating real-valued functions, the approach is called support vector regression (SVR). In this case, the aim of the learning process is to find a function f(x) that approximates the value y(x) with minimum risk, based only on the available independent and identically distributed data. In complex nonlinear problems, the original input space (the predictor variables) is often nonlinearly related to the predicted variable (here, the suspended sediment load).
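For reference, the search for f(x) is commonly written as the ε-insensitive optimization problem below. This is the standard textbook form, given here as an assumption rather than as the exact formulation used in the study. The symbols φ (feature map), C (regularization constant), ε (width of the insensitive tube, unrelated to the Gaussian noise parameter of the Gaussian process model), ξ_i and ξ_i^* (slack variables) and γ (kernel width) do not appear in the text above.

```latex
\begin{align}
f(x) &= \langle w, \varphi(x) \rangle + b \\
\min_{w,\, b,\, \xi,\, \xi^*} \quad
  & \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^{*} \right) \\
\text{subject to} \quad
  & y_i - f(x_i) \le \varepsilon + \xi_i, \\
  & f(x_i) - y_i \le \varepsilon + \xi_i^{*}, \\
  & \xi_i,\ \xi_i^{*} \ge 0, \qquad i = 1, \dots, n \\
k(x, x') &= \langle \varphi(x), \varphi(x') \rangle
  = \exp\!\left( -\gamma \lVert x - x' \rVert^{2} \right)
  \quad \text{(radial basis function kernel)}
\end{align}
```

The kernel replaces the explicit feature map, so the nonlinear relation between predictors and target is handled by a linear function in the feature space.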
Results and Discussions
In this study, after the required data for the Sofi Chay River were collected, they were examined with homogeneity tests such as the standard normal homogeneity, Buishand range, Pettitt and Von Neumann ratio tests; after the data were refined, the sediment rating curve was derived. The sediment discharge of the Sofi Chay River was then estimated with Gaussian process regression, support vector regression, the sediment rating curve and the seasonal method. To obtain the best results from the data mining techniques, various scenarios were defined, covering different types of kernel function and different ranges of kernel hyperparameters. Gaussian process regression with a radial basis function kernel (Gaussian noise ε = 0.01 and γ = 0.5) gave a correlation coefficient (R) of 0.977, a Nash-Sutcliffe coefficient (NS) of 0.794, a mean absolute error (MAE) of 77.4278 tons/day and a root mean square error (RMSE) of 698.7455 tons/day, which is the highest accuracy and lowest error among the methods investigated in this study. Both data mining methods were also considerably more efficient and accurate than the traditional methods in this area.
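The comparison above rests on a fitted rating curve and four statistical indicators, which can be sketched as follows. The sketch assumes the commonly used power-law rating curve Qs = a * Q^b fitted by least squares in log space, and the usual definitions of R, NS, MAE and RMSE; the abstract does not state the exact forms used, so these are assumptions. The function names and the example values are hypothetical.

```python
import math


def fit_rating_curve(discharge, sediment):
    """Fit Qs = a * Q**b by least squares on log-transformed data (assumed form)."""
    log_q = [math.log(q) for q in discharge]
    log_s = [math.log(s) for s in sediment]
    n = len(log_q)
    mean_q = sum(log_q) / n
    mean_s = sum(log_s) / n
    b = (sum((x - mean_q) * (y - mean_s) for x, y in zip(log_q, log_s))
         / sum((x - mean_q) ** 2 for x in log_q))
    a = math.exp(mean_s - b * mean_q)
    return a, b


def evaluate(observed, predicted):
    """Return R, NS, MAE and RMSE for paired observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    sq_err = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return {
        "R": cov / math.sqrt(ss_o * ss_p),   # correlation coefficient
        "NS": 1.0 - sq_err / ss_o,           # Nash-Sutcliffe coefficient
        "MAE": sum(abs(o - p) for o, p in zip(observed, predicted)) / n,
        "RMSE": math.sqrt(sq_err / n),
    }


if __name__ == "__main__":
    # Hypothetical discharge (m3/s) and sediment load (tons/day) values.
    q = [1.0, 2.0, 4.0, 8.0]
    qs = [3.5, 11.0, 50.0, 185.0]
    a, b = fit_rating_curve(q, qs)
    estimates = [a * x ** b for x in q]
    print(evaluate(qs, estimates))
```

MAE and RMSE carry the units of the sediment load, while R and NS are dimensionless, so a method can show a high R and still have a large RMSE when a few peak loads are badly estimated.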
 
Conclusion
In this study, the suspended sediment load was estimated with traditional methods, namely the sediment rating curve and the seasonal method, and compared with modern kernel-based data mining methods, namely Gaussian process regression and support vector regression. The results indicated that the seasonal method performs better than the sediment rating curve in this case. Overall, both data mining methods examined in this study outperform the traditional methods. Of the two, Gaussian process regression with a radial basis function kernel showed the higher predictive ability. In general, Gaussian process regression is recommended for similar cases.

Alp, M. and Cigizoglu, H.K. (2007). Suspended sediment load simulation by two artificial neural network methods using hydrometeorological data, Environmental Modelling and Software, 22: 2-13.
Dehgani, A.A.; Zangane, M.A.; Mosaedi, A. and Kuhestani, N. (2010). Estimation of suspended sediment in the sediment rating curve and artificial neural network, Journal of Agricultural Sciences and Natural Resources, 16: 36-51 (In Persian).
Dehgani, N. and Vafakhah, M. (2014). Comparison of methods for prediction of daily sediment using sediment rating curves and neural network (Case Study: Gezagli Station, Golestan Province), Journal of Soil and Water Conservation, 20(2): 221-230 (In Persian).
Duan, W.L.; He, B.; Takara, K.; Luo, P.P.; Nover, D. and Hu, M.C. (2015). Modeling suspended sediment sources and transport in the Ishikari River basin, Japan, using SPARROW, Hydrology and Earth System Sciences, 19: 1293-1306.
Ebden, M. (2008). Gaussian processes for regression: a quick introduction, Available from: http://www.robots.ox.ac.uk/~mebden/reports/GPtutorial.pdf [Accessed 14 August 2015].
Eder, A.; Strauss, P.; Krueger, T. and Quinton, J.N. (2010). Comparative calculation of suspended sediment loads with respect to hysteresis effects (in the Petzenkirchen catchment, Austria), Journal of Hydrology, 389: 168-176.
Emami, S.A. (2000). Sediment transportation, Jahade daneshgahi Press, Amirkabir industrial university, Tehran, First edition, 716p. (In Persian).
Eskandari, A.; Nouri, R.; Meraji, H. and Kiagaderi, A. (2012). The development of an appropriate model based on artificial neural network and support vector machine for predicting biochemical oxygen during 5 days, Journal of Ecology, 61(1): 71-82 (In Persian).
Falamaki, A.; Eskandari, M.; Baghlani, A. and Ahmadi, S.A. (2013). Modeling total sediment load in rivers using artificial neural networks, Journal of Water and Soil Conservation, 2(3): 13-25 (In Persian).
Heng, S. and Suetsugi, T. (2013). Using artificial neural network to estimate sediment load in ungauged catchments of the Tonle Sap River Basin, Cambodia, Journal of Water Resource and Protection, 5: 111-123.
Kakaei Lafdani, E.; Moghaddam Nia, A. and Ahmadi, A. (2013). Daily suspended sediment load prediction using artificial neural networks and support vector machines, Journal of Hydrology, 478: 50-62.
Kao, Sh.; Lee, T. and Milliman, J.D. (2005). Calculating highly fluctuated suspended sediment fluxes from mountainous rivers in Taiwan, TAO, 16(3): 653-675.
Khazaie Poul, A. and Talebi, A. (2013). Investigation of Possibility of Suspended Sediment Prediction Using The Combination of Sediment Rating Curve and Artificial Neural Network (Case Study: Ghatorchai River, Yazdakan Bridge), Quarterly Journal of Environmental Erosion Researches, 2(9): 73-82 (In Persian).
Kia, E.; Emadi, A.R. and Fazlola, R. (2013). Investigation and Evaluation of Artificial Neural Networks in Babolroud River Suspended Load Estimation, Journal of Civil Engineering and Urbanism, 3(4): 183-190.
Kumar Goyal, M. (2014). Modeling of Sediment Yield Prediction Using M5 Model Tree Algorithm and Wavelet Regression, Journal of Water Resources Management, 28: 1991-2003.
Mosaedi, A.; Zangane, M.A.; Meftah, M.; Dehgani, A.A. and Khoshravesh, M. (2010). Evaluation of hydrological methods to estimate the suspended load (case study: Atrak River of Golestan Province), 10th Seminar irrigation and evaporation, 8-10 February, Kerman (In Persian).
Najmaei, M. (1990). Engineering Hydrology, Second edition, Iran University of Science and Technology (In Persian).
Neal, R.M. (1997). Monte Carlo implementation of Gaussian process models for Bayesian regression and classification, Technical Report No. 9702, Department of Statistics and Department of Computer Science, University of Toronto, Toronto.
Onderka, M.; Krein, A. and Wrede, S. (2012). Dynamics of storm-driven suspended sediments in headwater catchment described by multivariable modeling, Journal of Soils and Sediments, 12: 620-635.
Pal, M. and Deswal, S. (2010). Modelling pile capacity using Gaussian process regression, Computers and Geotechnics, 37: 942-947.
Rajabi, M.; Feizollahpour, M. and Roustaie, S. (2015). Using NDE model for estimation of suspended sediment load in comparison with ANFIS and RBF case study: Givi Chay, Geography and Development Iranian Journal, 39(2): 1-16 (In Persian).
Rajaee, T.; Mirbagheri, S.A.; Nourani, V. and Alikhani, A. (2010). Prediction of daily suspended sediment load using wavelet and neuro fuzzy combined model, International Journal of Environmental Science and Technology, 7(1): 93-110.
Rezazadeh Joudi, A. and Sattari, M. (2016). Estimation of Scour Depth of Piers in Hydraulic Structures using Gaussian Process Regression, Applied Research in Irrigation and Drainage Structures Engineering, 16(65): 19-36 (In Persian).
Sadeghi, S.H.R.; Mizuyama, T.; Miyata, S.; Gomi, T.; Kosugi, K.; Fukushima, T.; Mizugaki, S. and Onda, Y. (2008). Development, evaluation and interpretation of sediment rating curves for a Japanese small mountainous reforested watershed, Geoderma, 144: 198-211.
Shahrabi, J. and Hejazi, T.H. (2011). Data mining, Tehran, Industrial University of Amirkabir, Jahad daneshgahi Press (In Persian).
Tabatabaei, M.; Solaimani, K.; Habibnejad Roshan, M. and Kavian, A. (2014). Estimation of Daily Suspended Sediment Concentration using Artificial Neural Networks and Data Clustering by Self-Organizing Map (Case Study: Sierra Hydrometry Station- Karaj Dam Watershed), Journal of Watershed Management Research, 5(10): 98-116 (In Persian).
Vali, A.; Moayeri, M.; Ramsht, M.H. and Movahedinia, N. (2010). Analysis and Comparison of artificial neural networks and regression models in suspended sediment Prediction case study: Eskandari Catchment Area located in Zayanderood Basin, Journal of Physical Geography Research Quarterly, 71(1): 21-30 (In Persian).
Vapnik, V.N. (1995). The nature of statistical learning theory, New York: Springer-Verlag.