
Table 2 Predictive capacity of models as determined by cross-validation

From: Data mining of plasma peptide chromatograms for biomarkers of air contaminant exposures

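| Classification | Dataset | Validation method | 0.50% peak shift window, GA | 0.50% peak shift window, SVM | 4.0% peak shift window, GA | 4.0% peak shift window, SVM |
|---|---|---|---|---|---|---|
| Air vs. EHC-93 | dewarped | One out | 78.41 | 70.08 | 79.17 | 66.67 |
| Air vs. EHC-93 | dewarped | Random | 65.83 | 60.83 | 71.51 | 73.33 |
| Air vs. EHC-93 | dewarped | K-folds | 73.86 | 61.36 | 70.83 | 70.83 |
| Air vs. EHC-93 | non-dewarped | One out | a | a | 57.73 | 52.23 |
| Air vs. EHC-93 | non-dewarped | Random | a | a | 61.46 | 55.21 |
| Air vs. EHC-93 | non-dewarped | K-folds | a | a | 47.73 | 48.64 |
| 0 h vs. 24 h | dewarped | One out | 72.83 | 81.82 | 66.67 | 87.50 |
| 0 h vs. 24 h | dewarped | Random | 72.92 | 76.04 | 58.33 | 87.50 |
| 0 h vs. 24 h | dewarped | K-folds | 72.73 | 72.73 | 81.44 | 83.58 |
| 0 h vs. 24 h | non-dewarped | One out | a | a | 54.55 | 72.73 |
| 0 h vs. 24 h | non-dewarped | Random | a | a | 68.75 | 66.67 |
| 0 h vs. 24 h | non-dewarped | K-folds | a | a | 63.64 | 72.73 |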

Reliability (future predictive capacity) of discriminatory models generated from dewarped and non-dewarped datasets, as determined by cross-validation using the One out, Random, or K-folds method. Values are the average percent predictive capacities of a number of models generated from different combinations of chromatograms used as training (model generation) and test (model validation) data. GA and SVM were used for model generation. Cross-validation was not possible when a group defined by exposure or time of recovery contained too few recalibratable chromatograms; these cases are indicated by 'a'.
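
The averaging described in the note can be illustrated with a minimal Python sketch. It is not the procedure used to produce the table: the nearest-centroid classifier is a toy stand-in for the GA and SVM models, the data are synthetic and trivially separable, and every function name, the number of folds, the number of random rounds, and the test fraction are assumptions chosen only for illustration. It shows how one percent value is obtained by averaging test accuracy over many training/test combinations under each of the three splitting schemes.

```python
import random
from statistics import mean


def nearest_centroid_predict(train, sample):
    """Toy stand-in classifier (assumption: NOT the GA or SVM models of the table)."""
    centroids = {}
    for label in {lab for _, lab in train}:
        rows = [x for x, lab in train if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(sample, centroid))

    # sorted() makes tie-breaking deterministic
    return min(sorted(centroids), key=lambda lab: squared_distance(centroids[lab]))


def accuracy(train, test):
    """Percent of test samples classified correctly by a model built on train."""
    hits = sum(nearest_centroid_predict(train, x) == lab for x, lab in test)
    return 100.0 * hits / len(test)


def leave_one_out(data):
    """One split per sample: that sample is the test set, the rest is training."""
    return [(data[:i] + data[i + 1:], [data[i]]) for i in range(len(data))]


def k_folds(data, k):
    """k splits: each fold serves once as the test set."""
    folds = [data[i::k] for i in range(k)]
    return [
        ([s for j, fold in enumerate(folds) if j != i for s in fold], folds[i])
        for i in range(k)
    ]


def random_splits(data, n_rounds, test_fraction, rng):
    """Repeated random hold-out: a fresh random test subset in every round."""
    n_test = max(1, round(len(data) * test_fraction))
    splits = []
    for _ in range(n_rounds):
        shuffled = data[:]
        rng.shuffle(shuffled)
        splits.append((shuffled[n_test:], shuffled[:n_test]))
    return splits


def predictive_capacity(splits):
    """Average percent accuracy over all training/test combinations."""
    return mean(accuracy(train, test) for train, test in splits)


# Synthetic two-group data (feature vector, label); invented for illustration only.
DATA = [
    ([1.0, 2.0], "A"), ([1.2, 1.8], "A"), ([0.9, 2.1], "A"), ([1.1, 2.2], "A"),
    ([4.0, 5.0], "B"), ([4.2, 4.9], "B"), ([3.9, 5.1], "B"), ([4.1, 5.2], "B"),
]

if __name__ == "__main__":
    rng = random.Random(0)
    print("One out:", predictive_capacity(leave_one_out(DATA)))
    print("Random: ", predictive_capacity(random_splits(DATA, 10, 0.25, rng)))
    print("K-folds:", predictive_capacity(k_folds(DATA, 4)))
```

Each scheme needs enough samples per group to leave both groups represented in the training data, which is consistent with the note that cross-validation was not possible for groups with too few recalibratable chromatograms.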