
Development of Regression Models for Predicting Water Quality Index Based on Dissolved Oxygen for River Pollution Assessment


mac2025

Wan Mohamad Haziq Wan Roselan

Faculty of Electrical Engineering, Universiti Teknologi MARA Cawangan Pulau Pinang, Malaysia

Muhamad Irfan Ahmad Suhkri

Faculty of Electrical Engineering, Universiti Teknologi MARA Cawangan Pulau Pinang, Malaysia

Mohamad Faizal Abd Rahman

Faculty of Electrical Engineering, Universiti Teknologi MARA Cawangan Pulau Pinang, Malaysia

Moheddin Usodan Sumagayan

Mindanao State University - Iligan Institute of Technology, Andres Bonifacio Ave, Iligan City, 9200 Lanao del Norte, Philippines

Mohd Suhaimi Sulaiman

Faculty of Electrical Engineering, Universiti Teknologi MARA Cawangan Pulau Pinang, Malaysia

 

 

Abstract

Water is an essential resource in Malaysia, sustaining human life, agriculture, and industry. However, rapid industrialization, urbanization, and development have significantly degraded river water quality, posing serious environmental and public health risks. Traditional water quality monitoring relies on manual sampling and laboratory analysis, which is often time-consuming, labor-intensive, and inefficient. This study addresses these challenges by developing regression-based predictive models that estimate the Water Quality Index (WQI) from Dissolved Oxygen (DO) measurements. The research uses a dataset of 219 river water samples, collected between June and November 2023 and obtained from Kaggle. Statistical validation techniques, including normality tests and error bar plots, were applied to assess data distribution and accuracy. Multiple regression techniques were implemented in MATLAB and Python to determine the most effective model. Among the tested approaches, MATLAB’s Linear Regression model performed best, achieving an R² value of 0.95397 and a Root Mean Square Error (RMSE) of 7.2728. These results highlight the potential of regression models to provide a fast, reliable, and cost-effective method for water quality assessment. With these predictive techniques, environmental authorities and policymakers can intervene in a timely manner and better manage and protect freshwater ecosystems in Malaysia.
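The two metrics reported in the abstract can be illustrated with a minimal Python sketch. It fits a one-variable linear regression of WQI on DO and computes R² and RMSE. The data here are synthetic placeholders (an assumed slope, intercept, and noise level), not the study's Kaggle dataset, and the function names `fit_linear`, `r_squared`, and `rmse` are hypothetical helpers, not the authors' code, so the printed values will not match the reported ones.

```python
import numpy as np


def fit_linear(x, y):
    """Ordinary least-squares fit of y ≈ slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return float(slope), float(intercept)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rmse(y_true, y_pred):
    """Root mean square error, in the same units as y."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic stand-in data (assumption): 219 samples, matching only the
# sample count mentioned in the abstract, with an arbitrary linear
# relationship plus Gaussian noise.
rng = np.random.default_rng(0)
do_mg_l = rng.uniform(0.0, 10.0, size=219)
wqi = 10.0 * do_mg_l + 5.0 + rng.normal(0.0, 5.0, size=219)

slope, intercept = fit_linear(do_mg_l, wqi)
pred = slope * do_mg_l + intercept
print(f"WQI ≈ {slope:.3f} * DO + {intercept:.3f}")
print(f"R^2 = {r_squared(wqi, pred):.4f}, RMSE = {rmse(wqi, pred):.4f}")
```

An R² close to 1 means DO explains most of the variance in WQI, while RMSE expresses the typical prediction error in WQI units. The two are best read together, since a high R² alone does not say how large individual errors are.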


Keywords: Water Quality Index (WQI), Dissolved Oxygen (DO), Regression Modelling, Machine Learning, Water Quality Assessment, River Pollution

DOI: 10.24191/esteem.v21iMarch.5023.g3037
