Table 2 Features used for protein function classification

From: Identification of protein functions using a machine-learning approach based on sequence-derived properties

 

| No. | Feature | Description | Dimension |
|---|---|---|---|
| 1 | Number of amino acids | Number of residues in each protein | 1 |
| 2 | Molecular weight | Molecular weight of the protein | 1 |
| 3 | Theoretical pI | The pH at which the net charge of the protein is zero (isoelectric point) | 1 |
| 4 | Amino acid composition | Percentage of each amino acid in the protein | 20 |
| 5 | Positively charged residue_2 | Percentage of positively charged residues in the protein (lysine and arginine) | 1 |
| 6 | Positively charged residue_3 | Percentage of positively charged residues in the protein (histidine, lysine, and arginine) | 1 |
| 7 | Number of atoms | Total number of atoms | 1 |
| 8 | Carbon | Total number of carbon atoms in the protein sequence | 1 |
| 9 | Hydrogen | Total number of hydrogen atoms in the protein sequence | 1 |
| 10 | Nitrogen | Total number of nitrogen atoms in the protein sequence | 1 |
| 11 | Oxygen | Total number of oxygen atoms in the protein sequence | 1 |
| 12 | Sulphur | Total number of sulphur atoms in the protein sequence | 1 |
| 13 | Extinction coefficient_All | Amount of light a protein absorbs at a certain wavelength (assuming ALL Cys residues appear as half cystines) | 1 |
| 14 | Extinction coefficient_No | Amount of light a protein absorbs at a certain wavelength (assuming NO Cys residues appear as half cystines) | 1 |
| 15 | Instability index | Estimate of the stability of the protein in a test tube | 1 |
| 16 | Aliphatic index | The relative volume of the protein occupied by aliphatic side chains | 1 |
| 17 | GRAVY | Grand average of hydropathicity | 1 |
| 18 | PPR | Percentage of continuous changes from positively charged residues to positively charged residues | 1 |
| 19 | NNR | Percentage of continuous changes from negatively charged residues to negatively charged residues | 1 |
| 20 | PNPR | Percentage of continuous changes from positively charged residues to negatively charged residues or from negatively charged residues to positively charged residues | 1 |
| 21 | NNRDist (x, y) | Percentage of NNR from x to y (local information) | 10 |
| 22 | PPRDist (x, y) | Percentage of PPR from x to y (local information) | 10 |
| 23 | PNPRDist (x, y) | Percentage of PNPR from x to y (local information) | 10 |
| 24 | Charged | Physicochemical property | 1 |
| 25 | Negatively charged residues | Percentage of negatively charged residues in the protein | 1 |
| 26 | Polar | Physicochemical property | 1 |
| 27 | Aliphatic | Physicochemical property | 1 |
| 28 | Aromatic | Physicochemical property | 1 |
| 29 | Small | Physicochemical property | 1 |
| 30 | Tiny | Physicochemical property | 1 |
| 31 | Bulky | Physicochemical property | 1 |
| 32 | Hydrophobic | Physicochemical property | 1 |
| 33 | Hydrophobic and aromatic | Physicochemical properties | 1 |
| 34 | Neutral and weakly hydrophobic | Physicochemical properties | 1 |
| 35 | Hydrophilic and acidic | Physicochemical properties | 1 |
| 36 | Hydrophilic and basic | Physicochemical properties | 1 |
| 37 | Acidic | Physicochemical property | 1 |
| 38 | Polar and uncharged | Physicochemical properties | 1 |
| 39 | Amino acid pair ratio | Percentage compositions for each of the 400 possible amino acid dipeptides | 400 |
|  | Total |  | 484 |
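
As a rough illustration of how a few of the sequence-derived features in the table could be computed, the following Python sketch covers the amino acid composition (feature 4), the PPR, NNR, and PNPR percentages (features 18 to 20), and the amino acid pair ratio (feature 39). It is not the authors' implementation. The function names are hypothetical, "continuous changes" is assumed to mean adjacent residue pairs, positively charged residues are assumed to be lysine and arginine (as in feature 5), and negatively charged residues are assumed to be aspartate and glutamate.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
POSITIVE = set("KR")  # assumption: lysine and arginine, as in feature 5
NEGATIVE = set("DE")  # assumption: aspartate and glutamate


def amino_acid_composition(seq):
    """Percentage of each standard amino acid in seq (20 values)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return {aa: 100.0 * seq.count(aa) / len(seq) for aa in AMINO_ACIDS}


def dipeptide_composition(seq):
    """Percentage of each of the 400 possible dipeptides among adjacent pairs."""
    if len(seq) < 2:
        raise ValueError("sequence needs at least two residues")
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    total = len(seq) - 1  # number of adjacent residue pairs
    for i in range(total):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs with non-standard residues are skipped
            counts[pair] += 1
    return {pair: 100.0 * n / total for pair, n in counts.items()}


def charge_transitions(seq):
    """PPR, NNR, and PNPR as percentages of all adjacent residue pairs."""
    if len(seq) < 2:
        raise ValueError("sequence needs at least two residues")
    total = len(seq) - 1
    ppr = nnr = pnpr = 0
    for a, b in zip(seq, seq[1:]):
        if a in POSITIVE and b in POSITIVE:
            ppr += 1
        elif a in NEGATIVE and b in NEGATIVE:
            nnr += 1
        elif (a in POSITIVE and b in NEGATIVE) or (a in NEGATIVE and b in POSITIVE):
            pnpr += 1
    return {
        "PPR": 100.0 * ppr / total,
        "NNR": 100.0 * nnr / total,
        "PNPR": 100.0 * pnpr / total,
    }


if __name__ == "__main__":
    example = "KKDEKD"  # toy sequence, not taken from the paper
    print(charge_transitions(example))
```

Concatenating the three outputs gives 20 + 400 + 3 = 423 of the 484 dimensions. The remaining features, such as theoretical pI, the extinction coefficients, and the instability index, need residue-level reference values that the table does not list.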