In this study we describe a label-free shotgun approach to establish a proteomics workflow for identifying the protein changes that occur during citrus fruit development. We analyzed and compared juice sac cells extracted from fruits at three stages of development: the end of Stage I (early Stage II), characterized by extensive cell division; Stage II, in which cell division ceases and the juice sac cells expand as they accumulate large amounts of solutes and water; and Stage III, in which the fruit matures and ripens [55, 56]. It should be noted that it was practically impossible to extract juice sac cell proteins at Stage I (fruit diameter ≈10-15 mm) because the juice sac cells are not well developed at this stage.
Comparative proteomics studies in plants still lag behind those in mammalian cells and are predominantly performed using 2DE gels. Although differential proteomics studies employing label-free quantification have been published in recent years [9, 10, 24], such studies remain scarce in plants [26, 43].
To carry out an efficient proteomics study in citrus, a plant species lacking a fully sequenced genome, we established a workflow that addressed a few of the problems arising from the use of an EST database. We created iCitrus, a database and interface that collects sequences from three different sources, HarvEST:Citrus http://harvest.ucr.edu/, NCBI's Citrus unigenes and NCBI's Citrus proteins http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=2711&lvl=3&lin=f&keep=1&srchmode=1&unlock, into one unified database with reduced redundancy for mass spectra searches. iCitrus was created to provide a compact database for the identification of citrus proteins and more accurate quantitative expression measurements. The iCitrus interface enabled fast identification of lists of accessions, including Arabidopsis homologs, and the use of bioinformatics tools such as MapMan, AraCyc and Cytoscape (Katz et al. in preparation).
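The idea of merging several sequence sources into one database with reduced redundancy can be illustrated with a minimal sketch. It is not the procedure used to build iCitrus, which is not detailed here; it assumes the simplest possible rule, namely that a sequence is redundant when it is identical to, or fully contained in, a longer sequence that has already been kept. The function name, source labels and example sequences are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def merge_nonredundant(
    sources: Dict[str, Iterable[Tuple[str, str]]]
) -> Dict[str, str]:
    """Merge (accession, sequence) records from several sources.

    Assumed rule (illustrative only): a sequence is dropped when it is
    identical to, or a substring of, a longer sequence already kept.
    """
    records = []
    for source_name, entries in sources.items():
        for accession, seq in entries:
            # Prefix the accession with its source so names stay unique.
            records.append((f"{source_name}|{accession}", seq.upper()))

    # Longest sequences first, so shorter contained ones are seen later.
    records.sort(key=lambda rec: (-len(rec[1]), rec[0]))

    kept: Dict[str, str] = {}
    for name, seq in records:
        if any(seq in longer for longer in kept.values()):
            continue  # exact duplicate or fragment of a kept sequence
        kept[name] = seq
    return kept


# Hypothetical toy input with one duplicate and one contained fragment.
example_sources = {
    "harvest": [("u1", "MKTAYIAKQR")],
    "ncbi_unigene": [("g1", "KTAYIA"), ("g2", "MSTNPKPQRK")],
    "ncbi_protein": [("p1", "MKTAYIAKQR")],
}
merged = merge_nonredundant(example_sources)
```

A substring rule of this kind cannot merge sequences that are similar but not identical, nor fragments that do not overlap, so some redundancy would be expected to remain in any database built this way.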
The iCitrus resource is essentially an interface that can be used to access pre-calculated BLAST results. iCitrus itself does not make or summarize GO assignments based on rules that weight GO terms from various hits; that rule-based approach is the (perfectly reasonable) philosophy behind Blast2GO and related tools. We chose to let users, rather than iCitrus, decide whether to trust and adopt particular annotations. We took this approach so that individual users can apply specific knowledge of protein families or taxonomical differences (i.e. Citrus versus Arabidopsis) when interpreting the BLAST results. In addition, there may be cases in which GO annotation is absent from the BLAST results against Arabidopsis or Viridiplantae, but a consensus could emerge from the descriptive text accompanying a hit. We think this combined approach of manual annotation assisted by pre-computed BLAST results is more effective for predicting functional information in a poorly annotated organism like Citrus.
Two widely used, but fundamentally different, label-free quantification methods were used in this study: peak integration (dMS) and spectral counting (SC). As thresholds for differential expression of the identified proteins, we used a two-fold change for dMS and a Bayes factor of 10 for spectral counting. Such a stringent threshold is needed because the protein ratios are calculated as intensity-weighted averages of peptide ratios, and because the number of peptides identifying each protein is highly variable. In most cases, both methods identified similar proteins, with some discrepancies (Figure 4a). These discrepancies derived from the way SIEVE (for dMS) and Scaffold (for SC) handled the peptide information. Scaffold is able to identify peptides in similar proteins and group them together, thus identifying database redundancy; SIEVE, on the other hand, does not group similar proteins. When we compared the numbers of proteins identified by the two methods using the corresponding Arabidopsis homolog of each identified iCitrus accession (Figure 4b), the differences decreased significantly, particularly for dMS (Figure 4). Yet additional redundancy could arise from possible gene families in Citrus. The HarvEST:Citrus database was built from a wide range of Citrus species, including Citrus sinensis, Citrus paradisi, Citrus unshiu, C. reticulata, C. jambhiri, C. aurantium, C. clementina, C. macrophylla and Poncirus trifoliata, and therefore contains sequences that are similar but not identical and so were not screened out of the iCitrus dataset. In addition, some of the sequences in the database that might originate from the same unigene did not overlap and therefore could not be assembled, contributing to the difference in the number of proteins identified (Table 1).
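The two thresholds described above can be written out as a short sketch. It assumes that the dMS protein ratio is an intensity-weighted mean of peptide ratios, that the two-fold criterion applies symmetrically to increases and decreases, and that both cut-offs are inclusive; none of these details is taken from SIEVE, Scaffold or Qspec, and all function names are hypothetical.

```python
from typing import Sequence


def weighted_protein_ratio(
    peptide_ratios: Sequence[float],
    peptide_intensities: Sequence[float],
) -> float:
    """Intensity-weighted mean of peptide ratios (assumed form)."""
    if len(peptide_ratios) != len(peptide_intensities):
        raise ValueError("ratios and intensities must have equal length")
    total = sum(peptide_intensities)
    if total <= 0:
        raise ValueError("total peptide intensity must be positive")
    weighted = sum(r * w for r, w in zip(peptide_ratios, peptide_intensities))
    return weighted / total


def is_changed_dms(protein_ratio: float, fold: float = 2.0) -> bool:
    """True when the ratio is at least `fold` up or at least `fold` down."""
    return protein_ratio >= fold or protein_ratio <= 1.0 / fold


def is_changed_sc(bayes_factor: float, cutoff: float = 10.0) -> bool:
    """True when the Bayes factor reaches the cutoff (assumed inclusive)."""
    return bayes_factor >= cutoff


# Hypothetical protein with two peptides; the more intense peptide
# dominates the protein-level ratio.
ratio = weighted_protein_ratio([2.0, 4.0], [1.0, 3.0])
```

Because the protein ratio is dominated by the most intense peptides, a protein supported by a single weak peptide can yield an unstable ratio, which is one reason a conservative cut-off is useful.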
Currently, non-overlapping sequences cannot be assembled until more ESTs are produced to cover the gaps or until the Citrus genome is fully sequenced. A significant number of proteins (144 in dMS and 118 in SC in the Stage II vs. early Stage II comparison, and 119 in dMS and 255 in SC in the Stage III vs. Stage II comparison) were identified by only one of the methods because of the inherent differences between the dMS and SC workflows. SEQUEST and SIEVE (dMS workflow) use a protein probability cut-off based on the false discovery rate (FDR) according to the Decoy method. X!Tandem, Scaffold and Qspec (SC workflow) use peptide identification probability criteria as specified by the Peptide Prophet algorithm. These different workflows affect the identification of some proteins. The performance of the SC method depends strongly on the depth of the MS/MS sampling, because ratios by SC are most significant for proteins with large numbers of product ion spectra, whereas ratios by dMS are most significant for proteins with large numbers of overlapping peptide ions. This also explains the higher percentage of proteins that were found to be significantly different by dMS but not by SC (Figures 3, 5a). Therefore, dMS provides more accurate measurements of compared samples, while SC is faster and easier to use. Our data show that dMS is more accurate in measuring differences in protein expression. dMS provides rich information from the LC-MS data but requires a massive computational effort for data processing, including background filtering, peak frame detection and alignment [62, 63]. Spectral counting is conceptually simpler and can be as sensitive as dMS in terms of detection range while retaining linearity [25, 30, 64]. Nevertheless, SC is less accurate in detecting differences in protein expression, in particular for less abundant proteins.
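As an illustration of the decoy-based FDR cut-off mentioned above, the following sketch estimates the FDR at a given score threshold from a list of scored matches. It assumes a simple estimator, the number of accepted decoy matches divided by the number of accepted target matches; the exact estimator used by the software in this study is not stated here, and the function name and example scores are hypothetical.

```python
from typing import Iterable, Tuple


def decoy_fdr(matches: Iterable[Tuple[float, bool]], threshold: float) -> float:
    """Estimate FDR among matches scoring at or above `threshold`.

    Each match is (score, is_decoy). Assumed estimator: decoys / targets.
    Returns 0.0 when no target match passes the threshold.
    """
    targets = 0
    decoys = 0
    for score, is_decoy in matches:
        if score < threshold:
            continue
        if is_decoy:
            decoys += 1
        else:
            targets += 1
    return decoys / targets if targets else 0.0


# Hypothetical scored matches: (score, is_decoy)
example_matches = [
    (5.0, False), (4.5, False), (4.0, True),
    (3.5, False), (3.0, False), (2.0, True),
]
fdr_loose = decoy_fdr(example_matches, threshold=3.0)
fdr_strict = decoy_fdr(example_matches, threshold=4.2)
```

Raising the threshold lowers the estimated FDR at the cost of accepting fewer identifications, so two workflows that apply different acceptance criteria can report partly different protein lists from the same spectra.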
Our results clearly show that the integrated use of both quantification methods increases the power to detect changes in shotgun proteomics experiments, and that the two methods should be used in combination to gain insight into the complex protein network and a complete identification of its components.
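The comparison between the two methods can be sketched as a set operation on the lists of accessions each method reports, optionally collapsing accessions to a shared homolog before comparing, as was done with the Arabidopsis homologs. The function name, accession labels and homolog mapping below are hypothetical and serve only to show the bookkeeping.

```python
from typing import Dict, Iterable, Optional, Set


def compare_methods(
    dms_hits: Iterable[str],
    sc_hits: Iterable[str],
    homolog_of: Optional[Dict[str, str]] = None,
) -> Dict[str, Set[str]]:
    """Split hits into shared, dMS-only and SC-only sets.

    If `homolog_of` is given, each accession is first replaced by its
    homolog identifier (accessions without a homolog are kept as-is).
    """
    def collapse(hits: Iterable[str]) -> Set[str]:
        if homolog_of is None:
            return set(hits)
        return {homolog_of.get(acc, acc) for acc in hits}

    dms, sc = collapse(dms_hits), collapse(sc_hits)
    return {"both": dms & sc, "dms_only": dms - sc, "sc_only": sc - dms}


# Hypothetical accessions: c2 and c2b are redundant entries that share
# one homolog, so they only match after collapsing.
dms_example = ["c1", "c2", "c3"]
sc_example = ["c2b", "c4"]
homologs = {"c2": "AT1G01010", "c2b": "AT1G01010"}

raw = compare_methods(dms_example, sc_example)
collapsed = compare_methods(dms_example, sc_example, homologs)
```

In this toy case the overlap is empty at the accession level and contains one entry after collapsing, which mirrors how database redundancy can inflate the apparent disagreement between methods.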
Changes in a large number of small GTPases were identified during citrus fruit development. The expression of a relatively large number of members of the RAB, ARF, RHO and RAN families of small GTPases changed during the different stages. Although we cannot assign specific roles to all of these proteins, the changes clearly indicate different roles for these members during the stages of citrus juice sac cell development. Vesicular trafficking is essential for fruit development [65–67]. During Stage I there is intensive cell division. Cytoskeleton elements (actins, tubulins, etc.), together with small G-proteins and coatomer complexes, are vital to cell division, cell plate formation, cell polarity, etc. The expression of many of these proteins decreased during the transition from early Stage II to Stage II. This correlated well with the attenuation of cell division in the growing fruit and the prevalence of cell expansion. This notion was reinforced by the notable increase in expression of other small GTPases, auxiliary proteins and cytoskeletal components. As with the small G-proteins, changes were seen in the expression of proteins associated with vesicular movement, docking and fusion. In addition to different SNAREs (Qa, Qb, Qc, syntaxins, etc.), there were changes in COPI coatomers, clathrin, dynamin and others, suggesting the occurrence of endocytosis, exocytosis and vesicular trafficking during fruit development. Notably, while the expression of plasma membrane-associated H+-ATPases did not change during the early stages of development, changes in endosomal-associated H+-ATPases (V-type) paralleled the changes seen in the secretory and vesicular trafficking machinery. V-type ATPases and organellar acidification are essential for vesicular trafficking along exocytotic and endocytotic pathways [69, 70].
Although significant changes in sugar contents and sugar homeostasis are expected during fruit development [71, 72], changes in expression of only two putative vacuolar monosaccharide transporters (TMT1 and TMT2) were noted. A plausible explanation is that the expression of other sugar transporters did not change (although they could have been modified by post-translational mechanisms). In support of this notion, Etxeberria et al. [73, 74] demonstrated a mechanism of sugar transport into the juice sac cells and of sucrose into the vacuoles that is mediated by endocytosis and intracellular vesicular trafficking. The protein inventory developed in this work provides a preliminary glance at the function(s) of these proteins during the different stages of fruit development, in particular during cell division (Stage I, early Stage II), cell expansion (Stage II), and assimilate mobilization, sugar accumulation and the processes regulating fruit maturation and ripening.
In conclusion, we developed a workflow for the analysis and identification of proteins during fruit development in citrus, a non-model plant, using comparative label-free shotgun proteomics. We established iCitrus, a comprehensive sequence database, by merging three major sources of sequences and improving the annotation of existing unigenes. iCitrus provided a useful bioinformatics tool for the high-throughput identification of citrus proteins. Two methods for label-free shotgun proteomics were used and compared: peak integration (or differential mass-spec) and spectral counting. We identified approximately 1500 citrus protein accessions expressed in fruits and quantified their expression changes during fruit development. Our results showed that both methods can provide significant information on protein changes, with dMS providing higher accuracy. Our results clearly suggest that dMS and SC complement each other, broadening the identification spectrum and providing complementary data on the change trends during the particular processes being compared.