Comparative shotgun proteomic analysis of Clostridium acetobutylicum from butanol fermentation using glucose and xylose

Background
Butanol is a second generation biofuel produced by Clostridium acetobutylicum through the acetone-butanol-ethanol (ABE) fermentation process. Shotgun proteomics provides a direct approach to study the whole proteome of an organism in depth. This paper focuses on shotgun proteomic profiling of C. acetobutylicum from ABE fermentation using glucose and xylose to understand the functional mechanisms of C. acetobutylicum proteins involved in butanol production.
Results
We identified 894 different proteins in C. acetobutylicum from the ABE fermentation process by the two-dimensional liquid chromatography tandem mass spectrometry (2D-LC-MS/MS) method. This includes 717 proteins from the glucose substrate and 826 proteins from the xylose substrate. A total of 649 proteins were found to be common to both substrates, and 22 significantly differentially expressed proteins were identified between the glucose and xylose substrates.
Conclusion
Our results demonstrate that flagellar proteins are highly up-regulated with glucose compared to xylose substrate during ABE fermentation. Chemotactic activity was also found to be lost with the xylose substrate due to the absence of CheW and CheV proteins. This is the first report on the shotgun proteomic analysis of C. acetobutylicum ATCC 824 in ABE fermentation between glucose and xylose substrates from a single time data point, and the number of proteins identified here is greater than in any other study performed on this organism up to this report.


Introduction
Clostridium acetobutylicum is a gram-positive, spore-forming, obligately anaerobic bacterium and is one of the few microorganisms capable of converting a wide variety of sugars into three main products: acetone, butanol and ethanol (ABE) [1]. The ABE fermentation process was the primary source of butanol for over 40 years until the mid-1950s and is one of the oldest large-scale industrial fermentations [2]. Since the mid-1950s, ABE fermentation has been unable to compete with the chemical synthesis of ABE solvents from petroleum [3]. However, increased concern over depletion of fossil fuels has led to renewed research interest in producing solvents via microbial fermentation processes.
Lignocellulosic biomass is an abundant renewable resource that can be used for the production of alternative fuels [4]. It is advantageous to use lignocellulosic biomass such as rice straw, wheat straw, corn stover and agricultural residues for biofuel production as they have limited impact on food supplies [5]. Glucose is the most abundant sugar found in lignocellulosic biomass, with xylose being the second most abundant [6]. C. acetobutylicum is able to ferment several pentose and hexose sugars [7], but the rate of uptake of the hexoses exceeds that of the pentoses [8]. Moreover, good solvent yields are obtained from glucose, whereas significantly lower yields are obtained when ABE fermentation is carried out with xylose [9]. The concurrent use of both sugars is a desirable characteristic of ABE fermentation from an economic point of view [10], which leaves substantial scope for investigation in C. acetobutylicum.
Classical product improvement strategies have been carried out by genetic manipulation and metabolic engineering of C. acetobutylicum for increased solvent production during ABE fermentation [11][12][13]. The physical and genetic map of C. acetobutylicum ATCC 824 has been constructed [14] and its genome has been sequenced, revealing a 3.94-Mb chromosome and a 192-kb megaplasmid that contains the majority of genes responsible for solvent production [15]. Primary annotation of C. acetobutylicum ATCC 824 categorized 3848 protein coding genes that include 2886 genes with assigned roles, 346 genes without any assigned roles, 575 conserved hypothetical genes and 41 hypothetical genes according to the comprehensive microbial resource [http://cmr.jcvi.org/cgi-bin/CMR/CmrHomePage.cgi].
The potential for improving the ABE fermentation process lies in the ability to gain a more complete understanding of C. acetobutylicum. Proteomics is a powerful tool to study the cellular mechanisms at the protein level and to understand the potential functions predicted by genome and transcriptome approaches. In turn, the proteomic knowledge can be used as targets for genetic and metabolic engineering [16,17]. Advances in the development of mass spectrometry have led to the possibility of studying the proteome of an organism. The objective of this work was to study the complete proteome of C. acetobutylicum from a single data point during ABE fermentation using glucose and xylose substrates by mass spectrometry (MS) based shotgun proteomics approach which relies on the identification of all proteins in a lysed cell mixture without the need for gel based separation techniques. Furthermore, this work provides a high throughput technique to study the C. acetobutylicum proteome, in addition to a valuable dataset of C. acetobutylicum proteins, thus providing a better understanding of the functional mechanisms of butanol production from glucose and xylose substrate in the ABE fermentation process.

Materials and methods
Strain and fermentation development
C. acetobutylicum ATCC 824 was obtained from American Type Culture Collection (ATCC, Cedarlane Labs, Burlington, Ontario, Canada) and was cultured using reinforced clostridial medium (RCM) in an anaerobic chamber (Coy Laboratory Products Inc., Grass Lake, Michigan, US) at 37°C for 20-24 h. Shake flask fermentation of C. acetobutylicum was performed in a 250 ml anaerobic flask containing 100 ml of media consisting of (g/L) yeast extract (5), ammonium acetate (2) (30) [18]. Shake flask fermentation was also performed using xylose at 30 g/L with the same media composition except glucose. Before inoculation, the medium was autoclaved at 121°C for 15 min (cysteine HCl·H2O was filter sterilized through a 0.45 μm filter and added to the medium) and cooled to 35°C in the anaerobic chamber. The cell suspension was incubated at 37°C with shaking at 120 rpm and growth was monitored by OD at 600 nm. Samples of 10 ml were harvested at the late exponential phase from each fermentation experiment for further proteomic analysis. All chemicals used in this study were supplied by Fisher (Fisher Scientific, Canada) and Sigma (Sigma-Aldrich, Canada), unless otherwise specified.

Cell lysis & protein extraction
The microbial cell pellets (~100 mg wet mass) from fermentation broth were processed through single tube whole cell lysis and protein digestion. Briefly, the cell pellet was resuspended in 6 M guanidine/10 mM dithiothreitol (DTT) with 50 mM Tris/10 mM CaCl2 at pH 7.6, vortexed every 10 min for the first hour and incubated at 37°C for 12 h to lyse cells and release proteins. The guanidine was diluted six-fold with 50 mM Tris buffer/10 mM CaCl2, and 5-10 μg of sequencing grade trypsin (Promega, Madison, WI, USA) was added and incubated at 37°C for 12 h to digest proteins to peptides. A second aliquot of the same amount of sequencing grade trypsin was added and incubated at 37°C for another 6 h to ensure complete digestion. 1 M DTT was added to a final concentration of 20 mM and incubated for another hour with gentle rocking at 37°C. The complex peptide solution was centrifuged at 10,000 g for 10 min to remove cellular debris and the supernatant was collected, cleaned using Sep-Pak plus (Waters Limited, Mississauga, Ontario, Canada) and concentrated. For each LC-MS/MS analysis below, ~1/4 of the total sample was used, based on the protocol followed by Verberkmoes et al. [19].

Mass Spectrometry
Samples were analyzed in technical duplicates through a 2D nano-LC MS/MS system with a split-phase column (~3-5 cm SCX and 3-5 cm C18) (Polymicro Technologies, Phoenix, AZ) [20] on an LTQ (ThermoFisher Scientific, San Jose, CA, USA) with 22 h runs [21,22]. The LTQ settings were as follows: all data-dependent MS/MS in LTQ (top five), two microscans for both full and MS/MS scans, centroid data for all scans and two microscans averaged for each spectrum, dynamic exclusion set at 1.

Proteome informatics
All MS/MS spectra were searched with the SEQUEST algorithm [23] against a C. acetobutylicum Uniprot proteome database [24] and filtered with DTASelect/Contrast [20] at the peptide level (Xcorrs of at least 1.8 [+1], 2.5 [+2], 3.5 [+3]). Only proteins identified with two fully tryptic peptides from a 22 h run were considered for further biological study. An in-house script was used to extract protein identifications, peptides, spectra, and sequence coverage from DTASelect filtered output files, and these were used to calculate protein abundance.
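The peptide-level filter described above can be illustrated with a minimal sketch. This is not DTASelect itself, nor the in-house script; the function names, the tuple layout of a peptide-spectrum match, and the reading of "two fully tryptic peptides" as at least two distinct peptide sequences are all assumptions made for illustration.

```python
# Hypothetical sketch of a charge-dependent Xcorr filter (not DTASelect itself).
# Thresholds are the ones quoted in the text: 1.8 [+1], 2.5 [+2], 3.5 [+3].
XCORR_MIN = {1: 1.8, 2: 2.5, 3: 3.5}


def passes_xcorr(charge, xcorr):
    """Return True if a match meets the Xcorr cut-off for its charge state.

    Assumption: charge states other than +1, +2, +3 are rejected here.
    """
    threshold = XCORR_MIN.get(charge)
    return threshold is not None and xcorr >= threshold


def filter_proteins(psms, min_peptides=2):
    """Keep proteins supported by at least `min_peptides` distinct peptides.

    psms: iterable of (protein_id, peptide, charge, xcorr, fully_tryptic).
    Assumption: "two fully tryptic peptides" means two distinct sequences.
    """
    peptides = {}
    for protein, peptide, charge, xcorr, fully_tryptic in psms:
        if fully_tryptic and passes_xcorr(charge, xcorr):
            peptides.setdefault(protein, set()).add(peptide)
    return {p: seqs for p, seqs in peptides.items() if len(seqs) >= min_peptides}


# Made-up example matches, for illustration only.
example_psms = [
    ("P1", "AAK", 2, 2.6, True),
    ("P1", "BBR", 1, 1.9, True),
    ("P2", "CCK", 3, 3.4, True),   # fails the +3 cut-off
    ("P2", "DDK", 2, 3.0, True),
]
kept = filter_proteins(example_psms)
```

In this made-up example only P1 is retained, because P2 is left with a single passing peptide.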

False positive rate
The overall false positive rate (FPR) was estimated by doubling the number of peptides found from the reverse database and dividing the result by the total number of identified peptides from both databases using the formula

$$\%\,\mathrm{fal} = 2\left[\frac{n_{\mathrm{rev}}}{n_{\mathrm{rev}} + n_{\mathrm{real}}}\right] \times 100$$

where $\%\,\mathrm{fal}$ is the estimated false positive rate, $n_{\mathrm{rev}}$ is the number of peptides identified from the reverse database and $n_{\mathrm{real}}$ is the number of peptides identified from the real database [25].
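As a minimal sketch, the false positive rate formula can be written as a small function. The function name and the example peptide counts are hypothetical; only the formula comes from the text.

```python
def false_positive_rate(n_rev, n_real):
    """Estimated false positive rate in percent: 2 * n_rev / (n_rev + n_real) * 100.

    n_rev:  peptides identified from the reverse database
    n_real: peptides identified from the real database
    """
    total = n_rev + n_real
    if total == 0:
        raise ValueError("no peptides identified")
    return 200.0 * n_rev / total


# Made-up counts for illustration: 10 reverse hits among 1000 peptides.
example_fpr = false_positive_rate(10, 990)
```

With these made-up counts the estimate is 2%, of the same order as the run-level values reported in the Results.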

Relative protein abundance
The relative abundances of thousands of proteins identified during MS analysis were estimated by calculating the normalized spectral abundance factors (NSAF). The NSAF for a protein is the number of spectral counts (SpC, the total number of MS/MS spectra) identifying a protein, divided by the protein's length (L), divided by the sum of SpC/L for all proteins in the experiment [26,27].
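The NSAF definition above can be sketched in a few lines. The function and variable names are hypothetical and the spectral counts and lengths are invented for illustration; the calculation follows the definition in the text (SpC/L for each protein, divided by the sum of SpC/L over all proteins).

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factors for one experiment.

    spectral_counts: dict of protein -> spectral count (SpC)
    lengths:         dict of protein -> protein length (L)
    """
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra assigned to any protein")
    return {p: value / total for p, value in saf.items()}


# Invented two-protein example: B has more spectra but is four times longer.
example_nsaf = nsaf({"A": 10, "B": 20}, {"A": 100, "B": 400})
```

By construction the NSAF values of one experiment sum to 1, which is what makes them comparable across runs with different total spectral counts.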

Results
Shotgun proteomics of C. acetobutylicum from ABE fermentation
ABE fermentation of C. acetobutylicum ATCC 824 using glucose and xylose substrates was examined. Growth profiles on the two substrates were recorded by measuring the optical density (OD) of biomass at 600 nm and plotted against time (Additional file 1). Glucose was found to be the preferred carbon source for C. acetobutylicum, with the total biomass concentration reaching a peak OD 600 of 1.76 in 30 h, compared with a peak OD of 1.61 in 42 h on the xylose substrate. This demonstrated that C. acetobutylicum ATCC 824 could not utilize xylose as efficiently as glucose in the ABE fermentation process. In general, ABE fermentation undergoes acidogenesis in the early exponential phase, and a major metabolic shift then takes place which switches the culture to solventogenesis at the end of the exponential growth phase [28]. Proteome analysis of C. acetobutylicum was carried out on samples collected at the late exponential phase of ABE fermentation with glucose and with xylose.
Our results present the first large scale investigation of the C. acetobutylicum proteome from a single time data point during the ABE fermentation process using either glucose or xylose substrate by a shotgun proteomics approach. The shotgun approach enabled us to detect proteins by matching peptide mass data to available genome sequence databases. All proteins in the non-redundant Uniprot proteome database [http://www.uniprot.org], retrieved using the keyword "C. acetobutylicum", that could match the same set of peptides were included in the protein list. A total of 894 non-redundant proteins were identified from the database search, with 750-950 proteins identified per sample and replicate (Table 1). A total of 717 and 826 proteins were identified from ABE fermentation using glucose and xylose substrates, respectively, and 649 proteins were found to be common to both substrates (Figure 1). The false positive rate was calculated as 4.38% and 2.84% for the first and second MS runs, respectively, for ABE fermentation from the glucose substrate, and 3.84% and 1.26% for the first and second MS runs, respectively, for ABE fermentation from the xylose substrate.
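The protein counts reported above are mutually consistent, as a short worked check shows. The function name is hypothetical; the three input numbers are taken from the text, and the substrate-specific counts are derived from them rather than quoted.

```python
def venn_counts(total_a, total_b, shared):
    """Return (only_a, only_b, union) for two overlapping sets."""
    only_a = total_a - shared
    only_b = total_b - shared
    union = only_a + only_b + shared
    return only_a, only_b, union


# 717 proteins with glucose, 826 with xylose, 649 common to both.
glucose_only, xylose_only, total_identified = venn_counts(717, 826, 649)
```

This gives 68 proteins seen only with glucose and 177 seen only with xylose, and the union reproduces the 894 non-redundant proteins reported.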

Label-free estimation of relative protein abundance
The entire lists of proteins were sorted by NSAF averaged across both samples from the glucose and xylose substrates and the technical runs (Additional file 1). Comparing the NSAF data from each sample and technical run showed the data to be highly reproducible, with R2 values of 0.91 (Figure 2) and 0.85 (Figure 3) for ABE fermentation samples using glucose and xylose, respectively. The NSAF values for ABE samples using glucose and xylose substrates were averaged among their individual technical runs and compared to determine the unique and shared proteins (Figure 4). Based on the NSAF values, five of the most abundant proteins were found to be present in C. acetobutylicum from ABE fermentation with both glucose and xylose. These include a heat shock protein, 60 kDa chaperonin, glyceraldehyde-3-phosphate dehydrogenase, phosphocarrier protein and acetyl-CoA acetyltransferase. However, the remaining top five proteins for the glucose substrate were aldehyde-alcohol dehydrogenase, chaperone protein dnaK, 50S ribosomal protein L7/L12, fructose bisphosphate aldolase, and electron transfer flavoprotein, while the remaining top five proteins for the xylose substrate were a cold shock protein, rare lipoprotein A, 10 kDa chaperonin, and two rubrerythrin proteins.
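The text does not state how the R2 values between technical runs were obtained. One common choice, assumed here purely for illustration, is the squared Pearson correlation between the NSAF values of the same proteins in the two runs. The function name and the example values are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences.

    Assumption: this is one plausible way to compare replicate NSAF values;
    the original analysis may have used a different definition.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("one of the sequences has zero variance")
    return (sxy * sxy) / (sxx * syy)


# Invented NSAF values for four proteins in two technical runs.
run1 = [0.010, 0.020, 0.030, 0.040]
run2 = [0.020, 0.040, 0.060, 0.080]  # exactly proportional to run1
example_r2 = r_squared(run1, run2)
```

Values close to 1 indicate that the two runs rank and scale protein abundances almost identically.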

Functional categorization of identified C. acetobutylicum proteins
A total of 657 proteins out of the 894 proteins identified in the analysis were assigned to 82 pathways which can     (26), membrane transport (28), energy metabolism (21), metabolism of cofactors and vitamins (35), replication and repair (24), cell motility (17).
Our results demonstrate that the majority of the proteins involved in various C. acetobutylicum metabolic pathways were present with both glucose and xylose substrates. All the enzymes involved in the acid and solvent formation pathways were identified from both glucose and xylose substrates except alpha-acetolactate decarboxylase. These include pyruvate-formate lyase, acetolactate synthase, acetyl-CoA acetyltransferase, 3-hydroxybutyryl-CoA dehydrogenase, 3-hydroxybutyryl-CoA dehydratase, butyryl-CoA dehydrogenase, bifunctional acetaldehyde-CoA/alcohol dehydrogenase, NADH-dependent butanol dehydrogenase A & B, alcohol dehydrogenase, phosphate butyryltransferase, butyrate kinase, butyrate-acetoacetate CoA-transferase, acetoacetate decarboxylase, phosphotransacetylase and acetate kinase. However, proteins involved in specific processes such as carbohydrate metabolism and cell motility were found to be highly variable between ABE fermentation with glucose and with xylose.
The major mechanism for carbohydrate uptake in C. acetobutylicum is the phosphotransferase (PTS) system [29], along with non-PTS transport systems which include ATP-driven transporters and other non-PTS permeases [30]. The components of the PTS system, a multiprotein complex that includes phosphocarrier protein (Hpr), phosphoenolpyruvate protein kinase (PTS system enzyme I) and the PTS enzyme II ABC component, were identified in both glucose and xylose substrates. In addition, 14 proteins identified from both substrates were found to be involved in the ABC transport system. C. acetobutylicum degrades various carbohydrates by converting them into an intermediate of one of the central carbohydrate-degrading pathways: glycolysis and the pentose phosphate pathway [31,32]. Almost all the enzymes that are involved in the glycolysis and pentose phosphate pathways were identified in both glucose and xylose substrates. These include glucokinase, glucose-6-phosphate isomerase, phosphofructokinase, fructose-bisphosphate aldolase, triosephosphate isomerase, glyceraldehyde-3-phosphate dehydrogenase, phosphoglycerate kinase, enolase and pyruvate kinase. However, certain enzymes such as aldose-1-epimerase, fructose bisphosphate aldolase, xylulose kinase and arabinose isomerase were identified only in xylose and not in glucose, as they were not induced when glucose was used as substrate. Furthermore, enzymes involved in galactose, starch and sucrose metabolism identified in xylose were not found in ABE fermentation with glucose. This could be attributed to the higher number of proteins identified in ABE fermentation with xylose than with glucose.
Chemotaxis proteins and flagellar assembly proteins are responsible for cellular motility and are mediated by motility related gene clusters [33,34]. Chemotaxis proteins identified from both substrates include methyl accepting chemotaxis protein (MCP), chemotaxis histidine kinase CheA, methylesterase CheB/methylase CheR, chemotaxis protein CheC, chemotaxis signal receiving protein and chemotaxis response regulator CheY. However, chemotaxis protein CheV and chemotaxis signal transduction protein CheW were identified only in glucose and not in xylose substrate. The chemotaxis system senses changes in pH, temperature, nutrient and toxin concentration, etc., and the methyl accepting chemotaxis protein relays the detected environmental signals to the histidine kinase CheA through the CheW coupling protein, causing autophosphorylation (CheA-P). The phosphoryl group from CheA-P is transferred to the response regulator CheY, which interacts with the flagellar motor protein FliY. In addition, CheA-P can transfer phosphate to the response regulator CheB, which removes methyl groups from specific glutamate residues of MCP. The chemotaxis protein CheC is involved in the coordination of the MCP methylation process. The FliY flagellar motor protein causes a change in the rotational direction of the flagellum, resulting in swimming of the bacterium [35]. Regarding flagellar proteins, hook associated protein (flagellin family) hag, flagellar hook associated protein FliD and flagellar motor switch protein FliY were identified from both substrates, whereas four proteins, namely flagellin, flagellin family protein, flagellar hook protein FlgE and flagellar hook associated protein FlgK, were identified only in glucose and were missing in the xylose substrate.

Differentially expressed proteins
Proteins that were identified from C. acetobutylicum during ABE fermentation from glucose and xylose substrates were examined for their differential expression using PatternLab software [36]. A TFold pairwise analysis of proteins identified from ABE fermentation using the two substrates was performed to pinpoint the differentially expressed proteins based on the spectral counting method (Figure 5). A total of 22 proteins (blue dots) were found to be differentially expressed with an absolute fold change > 2.5, which is the established fold change cut-off, and p-values < 0.05, considered as statistically significant. Out of these 22 significantly differentially expressed proteins, the expression levels of 18 proteins were found to be higher with glucose and those of 4 proteins were higher with xylose (Figure 6). Proteins such as cyclopropane fatty acid synthase, 50S ribosomal protein L17, signal peptidase I, queuosine biosynthesis protein and tRNA uridine 5-carboxymethylaminomethyl modification protein did not meet the fold-change cut-off but were indicated as statistically different (green dots). Moreover, 12 proteins (orange dots) met the fold-change cut-off but cannot be claimed to be statistically different, and about 407 proteins (red dots) did not satisfy either the fold-change or the statistical cut-off.
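The four colour classes described above follow from two yes/no tests, which can be sketched as below. This is a simplified illustration using only the two fixed cut-offs quoted in the text; PatternLab's actual TFold procedure may set or apply its cut-offs differently, and the function name and example values are hypothetical.

```python
def classify(fold_change, p_value, fold_cutoff=2.5, alpha=0.05):
    """Assign a protein to one of the four classes plotted in Figure 5.

    Assumption: fixed cut-offs only (absolute fold change > 2.5, p < 0.05);
    this is not a reimplementation of PatternLab's TFold module.
    """
    meets_fold = abs(fold_change) > fold_cutoff
    significant = p_value < alpha
    if meets_fold and significant:
        return "blue"    # differentially expressed
    if significant:
        return "green"   # statistically different, fold change too small
    if meets_fold:
        return "orange"  # large fold change, not statistically different
    return "red"         # neither criterion met


# Invented (fold change, p-value) pairs for illustration.
example_classes = [classify(fc, p) for fc, p in
                   [(3.0, 0.01), (-1.5, 0.01), (-4.0, 0.20), (1.1, 0.50)]]
```

The sign of the fold change distinguishes higher expression with one substrate from higher expression with the other; only its magnitude enters the cut-off.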
Aldehyde-alcohol dehydrogenase and 3-hydroxybutyryl-CoA dehydrogenase, two enzymes directly involved in the butanoate pathway, were found to be more highly up-regulated in ABE fermentation with glucose than with xylose. The bifunctional aldehyde-alcohol dehydrogenase (AAD) protein catalyzes the two-step conversion of butyryl-CoA to butanol or of acetyl-CoA to ethanol [13]. The 3-hydroxybutyryl-CoA dehydrogenase (HBD) enzyme in the central fermentation pathway is vital for the production of both acid and solvent. HBD catalyzes the reduction of acetoacetyl-CoA by NAD(P)H, which is an initial and important step for the ultimate production of butyrate and butanol [37]. Similarly, the pyruvate formate lyase (PFL) enzyme involved in butanoate metabolism, which converts pyruvic acid to formic acid [38], was also found to be up-regulated with glucose relative to xylose.
Figure 5 TFold pairwise analysis of proteins identified from ABE fermentation using glucose and xylose substrates. Each protein is represented as a dot and is mapped according to its log2 (fold change) on the ordinate axis and its -log2 (t-test p-value) on the abscissa axis. Refer to the text for the differentially expressed protein details.
Proteins related to cellular motility were found to be more highly up-regulated in ABE fermentation with glucose than with xylose. These include the flagellin family possible hook associated protein (FlaC), the flagellar switch protein containing a CheC-like domain (FliY) and a methyl accepting chemotaxis-like protein. The FlaC protein is one of the four proteins involved in flagellin structure that have been identified within a flagellar locus [39]. FliY is a flagellar motor switch protein responsible for the swimming of the bacterium by causing a change in the rotational direction of the flagellum. Methyl accepting chemotaxis proteins (MCPs) are transmembrane receptors that function as chemotaxis sensory transducers, transmitting the signal from the binding protein to the two-component system consisting of the histidine kinase (CheA) and the response regulator (CheY) [35].
Interestingly, rare lipoprotein A (RLPA) was found to be up-regulated in xylose but not in the glucose substrate. Most lipoproteins act as membrane chaperones, preventing unproductive interactions with the cell wall [40]. In addition, cold shock protein, a protein containing a cell adhesion domain and a predicted membrane protein were also found to be up-regulated in xylose but not in glucose. In contrast, phosphoenolpyruvate-protein kinase (PTS system enzyme I), which is a membrane associated protein [41], was found to be up-regulated in glucose but not in xylose. This observation is not unexpected, as the specificity of the C. acetobutylicum phosphotransferase (PTS) system varies for different sugars and glucose is preferred over xylose [42]. A set of proteins involved in purine metabolism, which includes phosphoribosylaminoimidazole-succinocarboxamide synthase, GMP synthase, amidophosphoribosyltransferase and phosphoribosylformylglycinamidine (FGAM) synthase, was found to be up-regulated in the glucose substrate. In addition, a putative uncharacterized protein (Q97LK2), 2,3-bisphosphoglycerate-independent phosphoglycerate mutase involved in biosynthesis of secondary metabolites, 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate N-acetyltransferase involved in microbial metabolism and other proteins listed in Figure 6 were also found to be up-regulated in glucose.
Figure 6 Differentially expressed proteins identified from ABE fermentation between glucose and xylose substrates based on the spectral counting method with the spectral score shown at the end of each bar.
Sivagnanam et al. Proteome Science 2011, 9:66 http://www.proteomesci.com/content/9/1/66

Discussion
A number of studies have been performed on C. acetobutylicum from ABE fermentation in order to achieve a better understanding of the butanol production process, and limited proteome data are also available from attempts to identify the protein composition of C. acetobutylicum. However, all proteomic studies published so far focused on acidogenic and solventogenic proteins involved in the metabolic pathways, and the protein identification techniques were based on one- and/or two-dimensional gel electrophoresis-mass spectrometry methods, exploring a restricted number of proteins [43][44][45][46]. The proteome reference map published on the C. acetobutylicum DSM 1731 strain identified about 564 proteins, which account for 14.7% of the predicted genome, and 416 proteins were used to reconstruct its metabolic network [45]. Another proteomic view published on acidogenic and solventogenic steady-state cells of C. acetobutylicum in chemostat culture identified 383 proteins [47]. In contrast, the whole proteome of C. acetobutylicum during the ABE fermentation process has not been analyzed yet, which constituted the motivation of this study. This study is the first report on the whole proteome analysis of C. acetobutylicum ATCC 824 during ABE fermentation using glucose and xylose substrates, identifying 894 proteins from a single time data point through an MS-based shotgun proteomics approach without the need for gel-based separation or de novo sequencing techniques. These 894 proteins account for 23.2% of the predicted 3848 ORFs in the C. acetobutylicum genome and include 168 uncharacterized proteins out of the 346 genes without any assigned roles, and the number of proteins identified in this study is greater than in any other proteomic study published on C. acetobutylicum so far. This 23.2% coverage of the C. acetobutylicum ATCC 824 proteome based on a gel-free shotgun proteomics approach is higher than the proteome reference map of other organisms such as 21 [52] which used two dimensional gel electrophoresis method. The extensive range of proteins identified in this paper on C. acetobutylicum can potentially be used to study this organism in depth at the proteome level.
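The coverage figures quoted in this paragraph can be reproduced from the protein counts and the 3848 predicted protein coding genes given in the Introduction. The function name below is hypothetical, and applying the same denominator to the DSM 1731 reference map is an assumption made for illustration.

```python
def proteome_coverage(identified, predicted_orfs=3848):
    """Percentage of predicted ORFs with at least one identified protein."""
    return identified / predicted_orfs * 100.0


this_study = round(proteome_coverage(894), 1)      # shotgun, ATCC 824
reference_map = round(proteome_coverage(564), 1)   # gel-based map, DSM 1731
```

Rounded to one decimal place these give 23.2% and 14.7%, matching the values stated in the text.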
The number of proteins identified in this C. acetobutylicum ATCC 824 proteome analysis is greater than in the previous proteome reference map constructed using the C. acetobutylicum DSM 1731 strain [45]. Out of the 177 proteins predicted to be involved in carbohydrate metabolism, we have identified a total of 140 and 151 proteins in the ATCC 824 strain from fermentation with glucose and xylose, respectively, compared to only 98 proteins identified from the DSM 1731 strain. Similarly, the numbers of proteins identified in mechanisms such as nucleotide metabolism, lipid metabolism, energy metabolism, and replication and repair processes were higher in this study. This is the first study to identify most of the enzymes involved in the acid and solvent formation pathway of C. acetobutylicum ATCC 824, including the alcohol dehydrogenase enzyme which was not identified in the proteome map of the DSM 1731 strain. Moreover, this study is also in accordance with the recently reported membrane proteome analysis of C. acetobutylicum DSM 1731 [46]. Proteins such as glyceraldehyde-3-phosphate dehydrogenase and chaperonin proteins were found to be the most abundant of all the proteins identified in both studies.
The most noticeable differences between ABE fermentation with glucose and with xylose in this study were in the proteins involved in flagellar assembly and bacterial chemotaxis, which confer cellular motility. Flagellin (CAC1634), flagellin family protein (CAC2167), flagellar hook protein FlgE (CAC2154), which connects the basal body to the filaments, and flagellar hook associated protein FlgK (CAC2212), which forms the junction between the hook and the filament, were identified only in C. acetobutylicum from the glucose substrate. The swimming conferred by flagellar systems is a strong survival advantage, allowing the cell to move to a more favourable environment in response to changes in pH, temperature, nutrient and toxin concentration that are detected by chemotaxis systems [35]. Chemotaxis signalling systems are highly sensitive to chemical cues and allow bacteria to track favourable chemical gradients in their environment [33]. The two proteins, receptor kinase coupling protein CheW (CAC2217) and CheV (CAC1233), which is a two-domain chemotaxis coupling protein that consists of a CheW-like portion (CheVw) plus a receiver (REC) domain [53], were identified only from glucose and not from xylose. CheV has been previously described as a Bacillus subtilis protein homologous to the CheW protein [54], and studies on B. subtilis mutants lacking both CheW and CheV proteins found that the mutant strains were non-chemotactic (Che-) [55]. This suggests that C. acetobutylicum grown on xylose, in which CheV and CheW were absent, has lost its chemotactic features and is non-chemotactic when compared to cells grown on glucose. Therefore, C. acetobutylicum in ABE fermentation with glucose is able to adjust or adapt to a stimulus triggered by the chemotactic responses that sense chemical cues such as butanol toxicity, which is a major issue for the fermentative production of butanol [3]. These results correlate well with the previous work done by Gutierrez and Maddox explaining that the motility of C. acetobutylicum during fermentation is a chemotactic response [56].
Comparative proteomic analysis revealed that aldehyde-alcohol dehydrogenase (CAP0035), 3-hydroxybutyryl-CoA dehydrogenase (CAC2708), flagellar motor switch protein FliY (CAC2215), which controls the swimming of the bacterium, and flagellin family hook associated protein (CAC2203) were more highly up-regulated in C. acetobutylicum from ABE fermentation with glucose than with xylose. In addition, the NSAF values of the enzymes acetoacetyl-coenzyme A:acetate/butyrate coenzyme A-transferase (CoA-transferase) and butyraldehyde dehydrogenase (BAD) were found to be relatively abundant only on the glucose substrate. These results are consistent with the literature, which demonstrates that a highly motile inoculum results in higher solvent production and that non-motility leads to no solvent due to the loss of the CoA-transferase and BAD enzymes that are directly involved in solvent production [57,58]. Recent transcriptomic studies of C. acetobutylicum growing on mixtures of glucose and xylose also reported that genes for chemotaxis proteins and flagellin biosynthesis are activated by glucose [59]. Therefore, the preference of C. acetobutylicum for glucose over xylose [60] could also be attributed to the high motility of C. acetobutylicum grown on glucose compared to the lower motility with xylose substrate. On the other hand, the expression of rare lipoprotein A (CAP0058) was found to be more highly up-regulated in ABE fermentation with xylose than with glucose. Studies on the transcriptional analysis of butanol stress and tolerance in C. acetobutylicum showed that butanol stress induced the expression of the gene which codes for rare lipoprotein A [61]. This confirms that butanol stress is higher in C. acetobutylicum using xylose substrate and results in overexpression of the RLPA protein when compared to glucose substrate. Furthermore, the cold shock protein (CAC2990) was also found to be highly up-regulated in xylose compared to glucose substrate.
Overall, this study provides an efficient, high throughput and rapid technique to study the C. acetobutylicum proteome and this data serves as a base for future investigations to compare and understand the different substrate utilization and regulation of butanol production in ABE fermentation process at the proteome level. Moreover, we envision this dataset as a useful source for researchers interested in the C. acetobutylicum proteomic studies and differences between glucose and xylose utilized ABE fermentation at proteome level.

Additional material
Additional file 1: C. acetobutylicum proteins identified in this study. Glucose_Run1 -First MS run of C. acetobutylicum from ABE fermentation using glucose. Glucose_Run2 -Second MS run of C. acetobutylicum from ABE fermentation using glucose. Xylose_Run1 -First MS run of C. acetobutylicum from ABE fermentation using xylose. Xylose_Run2 -Second MS run of C. acetobutylicum from ABE fermentation using xylose. NSAF_Glucose -NSAF values for protein identified from Glucose_Run1 and 2. NSAF_Xylose -NSAF values for protein identified from Xylose_Run1 and 2. Growth curve -Growth pattern of ABE fermentation between glucose and xylose substrates.