Investigation of performance metrics in regression analysis and machine learning-based prediction models

Plevris, Vagelis; Solorzano, German; Bakas, Nikolaos P.; Ben Seghier, Mohamed El Amine

Conference object

Published version

Open

fullpaper_ann-prediction-metrics_eccomas2022.pdf (989.1Kb)

Permanent link

https://hdl.handle.net/11250/3051984

Publication date

2022-11-24

Abstract

Performance metrics (also called evaluation metrics or error metrics) are crucial components of regression analysis and machine learning-based prediction models. A performance metric can be defined as a logical and mathematical construct designed to measure how close the predicted outcome is to the actual result. A variety of performance metrics have been described and proposed in the literature. Knowledge about the metrics’ properties needs to be systematized to simplify their design and use. In this work, we examine 14 regression-related metrics for continuous variables, including the most widely used ones, such as the (root) mean squared error, the mean absolute error, the Pearson correlation coefficient, and the coefficient of determination. We provide their mathematical formulations and discuss their use, characteristics, advantages, disadvantages, and limitations through theoretical analysis and a detailed numerical example. The 10 unitless metrics are further investigated through a numerical analysis with Monte Carlo simulation based on (i) random guessing and (ii) the addition of random noise with various noise ratios to the predicted values. Some of the metrics show poor or inconsistent performance, while others perform well as measures of the “goodness of fit”. We highlight the importance of using the right metrics to obtain good predictions in machine learning and regression models in general.
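
As an illustration of the four metrics named in the abstract, the following is a minimal sketch using their standard textbook definitions; it is not taken from the paper. The function names are hypothetical, and the noise experiment is only one possible reading of "noise ratio": here it is assumed to be the standard deviation of zero-mean Gaussian noise expressed as a fraction of the standard deviation of the actual values, added to an otherwise perfect prediction. The paper may define it differently.

```python
import math
import random


def rmse(actual, predicted):
    """Root mean squared error (same units as the data)."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error (same units as the data)."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def pearson_r(actual, predicted):
    """Pearson correlation coefficient (unitless, in [-1, 1])."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (unitless)."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mean_r2_under_noise(actual, noise_ratio, trials=1000, seed=0):
    """Average R^2 over Monte Carlo trials of a noisy 'perfect' prediction.

    Assumption (not from the paper): the noise is Gaussian with standard
    deviation equal to noise_ratio times the population standard deviation
    of the actual values.
    """
    rng = random.Random(seed)
    n = len(actual)
    mean_a = sum(actual) / n
    std_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual) / n)
    total = 0.0
    for _ in range(trials):
        predicted = [a + rng.gauss(0.0, noise_ratio * std_a) for a in actual]
        total += r_squared(actual, predicted)
    return total / trials


if __name__ == "__main__":
    actual = [1.0, 2.0, 3.0, 4.0, 5.0]
    shifted = [a + 1.0 for a in actual]  # constant offset of one unit
    print("RMSE:", rmse(actual, shifted))
    print("MAE:", mae(actual, shifted))
    print("Pearson r:", pearson_r(actual, shifted))
    print("R^2:", r_squared(actual, shifted))
    for ratio in (0.1, 0.5):
        print("noise ratio", ratio, "mean R^2:", mean_r2_under_noise(actual, ratio))
```

In the shifted example, every prediction is off by exactly one unit, yet the Pearson correlation coefficient is still 1 because it ignores constant offsets, while the coefficient of determination drops to 0.5. This is one small instance of the kind of inconsistency between metrics that the abstract refers to.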

Publisher

European Community on Computational Methods in Applied Sciences

Series

ECCOMAS Congress;8th European Congress on Computational Methods in Applied Sciences and Engineering

Unless otherwise stated, this item is licensed under Attribution-NonCommercial-ShareAlike 4.0 International