| dc.description.abstract |
Selecting learning machines such as classifiers is an important task when they are used for clinical diagnosis. K-fold cross-validation (KFCV) is a practical technique that allows simple inference of such learning machines. However, the classical recipe generates many models and provides no concrete means of determining the best one, which includes selecting both the best fold and the best features. In this paper, a modified recipe is presented that generates more consistent machines with similar average performance, but with less extra-sample loss variance and less feature bias. The novelty is the pooling of cross-validation results over the K folds. The best feature can then be selected based on the pooled validation results, and the best fold is selected only at the end. As a use case, the recipe is applied to the atrial flutter localization problem. The classic and modified recipes produced machines with comparable performance (median normalized loss 0.44 vs. 0.42, classic vs. pooled), but the machines from pooled KFCV varied less (interquartile range 0.17 vs. 0.09, classic vs. pooled). |
en_US |