
Table 3. Features selected by CFS for each protein class

From: Identification of protein functions using a machine-learning approach based on sequence-derived properties

Protein class

Selected features

Transport

R, G, H, I, M, positively charged residue_3, carbon, CC, CD, CE, CH, CK, CN, CQ, CW, CY, FM, GW, HC, HR, IC, IG, LF, LG, LM, MF, MM, MQ, PC, QC, SC, TC, WD, YH, polar, hydrophobic, hydrophobic and aromatic, hydrophilic and basic

Transcription

D, C, Q, F, V, positively charged residue_3, sulphur, extinction coefficient_all, instability index, aliphatic index, GRAVY, NNR, PNPR, PPRDist (41,50), CC, CF, CV, CW, CY, DD, DE, EE, EF, EH, EL, FC, FF, FW, GC, HD, HH, IF, LT, MN, QQ, TL, TW, VV, WI, WV, WW, WY, charged, polar, aliphatic, aromatic, hydrophobic and aromatic, hydrophilic and acidic, hydrophilic and basic, acidic, polar and uncharged

Translation

NumOfAAs, D, L, hydrogen, GRAVY, PPR, NNR, NNRD (11,20), PPRD (31,40), PNPRD (41,50), PPRD (51,60), PNPRD (81,90), NNRD (91,100), PNPRD (91,100), AA, AG, AH, AM, AQ, CC, CE, CN, CP, DE, DH, EE, EG, EQ, FD, FK, FQ, FW, GC, GV, GW, GY, HI, IC, IP, IY, KE, KK, KR, KS, KW, LG, LK, LT, LV, LW, MM, MW, NH, PE, PK, PT, PY, QF, QN, RN, SD, TG, TK, VA, VG, VL, WC, WE, WG, WK, YD, YS, YV, charged, aliphatic, hydrophilic and acidic

Gluconate utilisation

Positively charged residue_3, instability index, aliphatic index, PNPRDist (11,20), PPRDist (21,30), PPRDist (31,40), PPRDist (81,90), PPRDist (91,100), AG, AH, AV, AW, AY, CC, CI, DG, DI, DR, EG, EW, FH, FL, FP, GC, GE, GF, GI, GK, GM, GP, GR, HN, IG, KL, KM, KW, LI, LM, MG, MM, MQ, MV, PC, PK, PN, PP, SR, SY, TD, VF, VM, WN, WR, WT, YS, YV, aromatic, hydrophilic and acidic, polar and uncharged

Amino acid biosynthesis

NumOfAAs, theoretical pI, D, C, G, S, sulphur, instability index, aliphatic index, GRAVY, PPR, NNR, NNRD (11,20), PNPRD (21,30), NNRD (91,100), CN, DC, DM, EC, EW, FP, FW, FY, GA, HP, IN, LC, MW, NF, NW, PC, PP, PS, QM, RC, RD, SC, WF, WG, WM, WN, WW, YR, YY, charged, aliphatic, tiny, bulky, hydrophobic, hydrophobic and aromatic, hydrophilic and acidic, acidic

Fatty acid metabolism

NumOfAAs, R, D, C, Q, E, G, I, F, S, negatively charged residue, positively charged residue_3, instability index, aliphatic index, GRAVY, NNR, PNPR, PPRD (00,10), PNPRD (71,80), PPRD (81,90), PNPRD (81,90), PPRD (91,100), NNRD (91,100), AH, AR, CG, CI, DC, DN, DR, EC, EY, FA, FP, GA, GG, GL, GW, HH, HI, HM, HN, HP, HT, IR, IW, KA, KH, LF, LL, MC, MG, MH, MM, MR, NA, NP, PA, PC, PP, PR, PY, QM, QN, QP, RK, RR, RS, SM, SY, TD, TR, TS, TW, VQ, VW, WG, WP, WQ, WS, WW, YG, YI, YW, YY, charged, aliphatic, hydrophobic, hydrophilic and acidic, acidic

Acetylcholine receptor inhibitor

Molecular weight, C, M, PNPRDist (00,10), NNRDist (11,20), NNRDist (71,80), AN, AT, CA, CC, CF, CN, CP, CS, DA, DF, DP, DS, EA, EI, ES, ET, FL, GC, HI, HQ, IC, II, IR, IT, KC, KE, KF, KL, KT, LD, LE, LN, LP, LQ, MK, NC, NV, RI, TC, VK, VN, VS, WC, YD, YT, tiny

G-protein coupled receptor

Theoretical pI, D, C, Q, E, G, K, F, S, T, negatively charged residue, positively charged residue_3, sulphur, PNPR, PNPRDist (11,20), NNRDist (71,80), CC, CF, CH, CW, CY, FC, FI, FL, GQ, IC, IW, IY, LC, MW, SC, WG, WV, WY, aromatic, tiny, bulky, hydrophobic and aromatic, acidic

Guanine nucleotide-releasing factor

A, Q, H, I, V, positively charged residue_2, positively charged residue_3, oxygen, instability index, aliphatic index, GRAVY, PPR, NNR, NNRDist (00,10), PPRDist (11,20), PNPRDist (21,30), NNRDist (31,40), PNPRDist (51,60), PNPRDist (61,70), NNRDist (91,100), CQ, DC, DH, EC, ED, EE, EP, EW, FN, HC, HD, HE, HH, HK, HM, HW, IV, KW, LE, LG, LK, MF, MI, PN, QC, QD, QE, QW, RL, SE, SP, TW, VG, VI, VV, WC, WD, WE, WF, WK, WS, WY, YW, hydrophobic, hydrophilic and acidic, hydrophilic and basic, acidic, polar and uncharged, polar

Fibre protein

G, M, T, positively charged residue_2, NNRDist (81,90), DN, ER, FN, GD, GG, GN, GQ, GT, IN, IP, LC, LL, LT, NA, NG, NT, PF, SQ, TA, TG, TN, TW, WK, WN, charged, polar and uncharged

Transmembrane

Theoretical pI, D, C, L, S, W, negatively charged residue, extinction coefficient_all, instability index, GRAVY, NNR, PNPR, PPRDist (71,80), AD, CC, CW, DA, EA, FC, FL, FW, LK, LL, LW, MW, PC, PP, SC, SL, TW, VD, WW, tiny, bulky, acidic
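
The table does not define its feature codes. As a rough illustration only, the sketch below assumes that single-letter entries (e.g. D, C) denote the fraction of that residue in the sequence and that two-letter entries (e.g. CC, FW) denote the fraction of that adjacent residue pair. Both readings are assumptions, not definitions taken from the table. The function names `residue_composition` and `dipeptide_composition` are hypothetical and are not the authors' implementation.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_composition(seq):
    """Fraction of each standard residue in seq (assumed meaning of 'D', 'C', ...)."""
    seq = seq.upper()
    counts = Counter(seq)
    n = len(seq)
    return {aa: (counts[aa] / n if n else 0.0) for aa in AMINO_ACIDS}


def dipeptide_composition(seq):
    """Fraction of each adjacent residue pair in seq (assumed meaning of 'CC', 'FW', ...)."""
    seq = seq.upper()
    # Overlapping pairs: positions (0,1), (1,2), ...
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = Counter(pairs)
    total = len(pairs)
    return {
        a + b: (counts[a + b] / total if total else 0.0)
        for a in AMINO_ACIDS
        for b in AMINO_ACIDS
    }


if __name__ == "__main__":
    toy = "CCAC"  # toy sequence, not real data
    print(residue_composition(toy)["C"])
    print(dipeptide_composition(toy)["CC"])
```

Under these assumptions, a per-class feature vector could be built by looking up only the keys listed in the corresponding row of the table. The remaining entries (for example GRAVY, instability index, or the PPRDist ranges) would need their own definitions from the article.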