
Table 4 Selection ratios for traditional and new features in the CFS method

From: Identification of protein functions using a machine-learning approach based on sequence-derived properties

| Protein class | Number of selected features | Merit value | Traditional features (n = 451) | New features (n = 33) |
| --- | --- | --- | --- | --- |
| Transport | 38 | 0.302 | 8.43% | 0% |
| Transcription | 51 | 0.387 | 11.31% | 9.09% |
| Translation | 76 | 0.499 | 16.85% | 27.27% |
| Gluconate utilisation | 59 | 0.59 | 13.08% | 15.15% |
| Amino acid biosynthesis | 52 | 0.309 | 11.53% | 15.15% |
| Fatty acid metabolism | 90 | 0.303 | 19.96% | 24.24% |
| Acetylcholine receptor inhibitor | 52 | 0.974 | 11.53% | 9.09% |
| G-protein coupled receptor | 39 | 0.487 | 8.65% | 9.09% |
| Guanine nucleotide-releasing factor | 69 | 0.36 | 15.30% | 27.27% |
| Fibre protein | 31 | 0.481 | 6.87% | 3.03% |
| Transmembrane | 35 | 0.443 | 7.76% | 9.09% |

  1. The merit value is the highest merit obtained for the optimal feature subset of each class. The selected features are highly correlated with the class and weakly correlated with one another.
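
The footnote's description of the merit value can be illustrated with a minimal sketch. It assumes the merit follows the standard correlation-based feature selection (CFS) heuristic, merit = k·r_cf / sqrt(k + k(k − 1)·r_ff), where k is the subset size, r_cf the mean feature-class correlation and r_ff the mean feature-feature correlation. The function name `cfs_merit` and its parameter names are hypothetical, and the inputs in the usage lines are made-up values, not data from the table.

```python
from math import sqrt


def cfs_merit(k, mean_feature_class_corr, mean_feature_feature_corr):
    """Merit of a k-feature subset under the standard CFS heuristic.

    Assumption: this is the textbook CFS formula; the table does not
    state the exact formula used to produce its merit values.
    """
    if k < 1:
        raise ValueError("subset must contain at least one feature")
    numerator = k * mean_feature_class_corr
    denominator = sqrt(k + k * (k - 1) * mean_feature_feature_corr)
    return numerator / denominator


# Illustrative inputs only (not taken from the paper):
uncorrelated = cfs_merit(4, 0.5, 0.0)  # features independent of each other
redundant = cfs_merit(4, 0.5, 1.0)     # features fully redundant
```

With the same feature-class correlation, the subset of mutually uncorrelated features scores higher than the redundant one, which is why a high merit indicates features that track the class while carrying little overlapping information.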