METHOD FOR MONITORING THE PHYSICAL AND CHEMICAL PROPERTIES OF LIQUIDS BASED ON MACHINE LEARNING
Keywords:
Machine learning, XGBoost, multi-output regression, spectroscopy, ATR method, drinking water, physicochemical parameters
Abstract
This article presents an approach for real-time monitoring of the physical and chemical parameters of drinking water using machine learning (ML) algorithms and attenuated total reflection (ATR) spectroscopy. Traditional chemical analysis methods are highly accurate, but they are time- and resource-intensive and do not lend themselves to automated monitoring. The proposed methodology uses spectral absorption values as inputs to multi-output regression (MultiOutputRegressor) built on the XGBoost model, predicting six parameters simultaneously: dissolved organic carbon (DOC), NH₄, PO₄, SO₄, NO₃, and NO₂. The results show that XGBoost achieved the highest accuracy and stability of the models considered, making it a reliable approach for assessing drinking water quality.


