Robustness investigation of cross-validation based quality measures for model assessment

Thomas Most
Lars Gräning
Sebastian Wolff

Abstract

In this paper, the accuracy and robustness of quality measures for the assessment of machine learning models are investigated. The prediction quality of a machine learning model is evaluated in a model-independent manner using a cross-validation approach, in which the approximation error is estimated for unseen data. The presented measures quantify the amount of explained variation in the model prediction. Their reliability is assessed by means of several numerical examples for which an additional data set is available to verify the estimated prediction error. Furthermore, confidence bounds of the presented quality measures are estimated, and local quality measures are derived from the prediction residuals obtained by the cross-validation approach.
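As a rough illustration of the kind of measure discussed here (a minimal sketch, not the authors' implementation), the following Python snippet estimates an explained-variation quality measure from cross-validation prediction residuals, 1 - SSE_CV / SST. The function name cross_val_quality and the scikit-learn based setup are illustrative assumptions.

# Minimal sketch (assumption, not the paper's implementation):
# cross-validation based explained-variation quality measure.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression

def cross_val_quality(model, X, y, n_splits=5, seed=0):
    """Return (quality, residuals): quality = 1 - SSE_CV / SST,
    where residuals are the cross-validation prediction residuals."""
    y = np.asarray(y, dtype=float)
    y_pred = np.empty_like(y)
    splitter = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, test_idx in splitter.split(X):
        fitted = model.fit(X[train_idx], y[train_idx])   # refit on each training fold
        y_pred[test_idx] = fitted.predict(X[test_idx])   # predict held-out points
    residuals = y - y_pred                               # CV prediction residuals
    sse = np.sum(residuals**2)                           # unexplained variation
    sst = np.sum((y - y.mean())**2)                      # total variation
    return 1.0 - sse / sst, residuals

# Usage on synthetic data:
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(100, 2))
y = X[:, 0]**2 + 0.5 * X[:, 1] + 0.05 * rng.standard_normal(100)
quality, res = cross_val_quality(LinearRegression(), X, y)
print(f"explained variation (CV): {quality:.3f}")

The returned residuals are the cross-validation prediction residuals from which, as outlined in the abstract, local quality measures or bootstrap-style confidence bounds could in principle be derived.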

Article Details

How to Cite

Robustness investigation of cross-validation based quality measures for model assessment. (2024). Engineering Modelling, Analysis and Simulation, 2(1). https://doi.org/10.59972/f5yl4dl2

Section

Articles
