From: Protein sequence classification using feature hashing
Bag of fixed or variable length k-grams    non-plant
                                           Accuracy %    # features
1-grams                                    71.21         20
2-grams                                    70.85         400
3-grams                                    79.80         7999
4-grams                                    79.03         146598
(1-2)-grams                                70.56         420
(1-3)-grams                                79.69         8419
(1-4)-grams                                82.83         155017
(1-5)-grams                                80.09         950849
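
The feature counts in the table follow from how a bag of k-grams is built: a fixed-length bag holds every distinct substring of length k seen in the data, and a variable-length (a-b)-gram bag is the union of the fixed-length bags for k = a..b. This is why the counts add up (20 + 400 = 420; 420 + 7999 = 8419; 8419 + 146598 = 155017) and why the 3-gram count sits just under the 20^3 = 8000 possible strings over a 20-letter amino acid alphabet. The following is a minimal Python sketch of that counting, not the authors' implementation; the function names and the toy sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter


def kgram_counts(sequence, k_min, k_max):
    """Count every contiguous k-gram of length k_min..k_max in one sequence."""
    counts = Counter()
    for k in range(k_min, k_max + 1):
        # A sequence of length n contains n - k + 1 overlapping k-grams.
        for i in range(len(sequence) - k + 1):
            counts[sequence[i:i + k]] += 1
    return counts


def vocabulary_size(sequences, k_min, k_max):
    """Number of distinct k-grams observed across a collection of sequences."""
    vocab = set()
    for seq in sequences:
        vocab.update(kgram_counts(seq, k_min, k_max))
    return len(vocab)


def max_vocabulary_size(alphabet_size, k_min, k_max):
    """Upper bound on the vocabulary: all possible strings of each length."""
    return sum(alphabet_size ** k for k in range(k_min, k_max + 1))


# Toy sequences (hypothetical, not taken from the data set in the table).
toy_sequences = ["MKVLA", "MKVAA"]
observed = vocabulary_size(toy_sequences, 1, 2)
upper_bound = max_vocabulary_size(20, 1, 2)
```

On real data the observed vocabulary approaches the upper bound for small k and falls well below it as k grows, since most long k-grams never occur.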