The Lottia gigantea shell matrix proteome: re-analysis including MaxQuant iBAQ quantitation and phosphoproteome analysis
Proteome Science volume 12, Article number: 28 (2014)
Although the importance of proteins of the biomineral organic matrix and their posttranslational modifications for biomineralization is generally recognized, the number of published matrix proteomes is still small. This is mostly due to the lack of comprehensive sequence databases, usually derived from genomic sequencing projects. However, in-depth mass spectrometry-based proteomic analysis, which critically depends on high-quality sequence databases, is a very fast tool to identify candidates for functional biomineral matrix proteins and their posttranslational modifications. Identification of such candidate proteins is facilitated by at least approximate quantitation of the identified proteins, because the most abundant ones may also be the most interesting candidates for further functional analysis.
Re-quantification of previously identified Lottia shell matrix proteins using the intensity-based absolute quantification (iBAQ) method as implemented in the MaxQuant identification and quantitation software showed that only 57 of the 382 accepted identifications constituted 98% of the total identified matrix proteome. This group of proteins did not contain obvious intracellular proteins, such as cytoskeletal components or ribosomal proteins, invariably identified as minor components of high-throughput biomineral matrix proteomes. Fourteen of these major proteins were phosphorylated to a variable extent. Altogether, we identified with high confidence 52 phospho sites in 20 of the 382 accepted proteins.
We show that iBAQ quantitation may be a useful tool to narrow down the group of functional biomineral matrix protein candidates for further research in cell biology, genetics or materials research. Knowledge of posttranslational modifications in these major proteins could be a valuable addition to previously published proteomes. This is true especially for phosphorylation, because this modification was already shown to modify mineralization processes in some instances.
Phosphorylation is one of the most widespread posttranslational modifications of proteins and also occurs in the organic matrix of biominerals [1, 2]. Protein FAM20C has recently been identified as a kinase involved in phosphorylation of such secreted proteins [3, 4], but other kinases may also be involved [5, 6]. In a few cases experimental evidence indicated an important function for phospho groups in biomineral matrix proteins. The best-examined matrix phosphoprotein in this respect is mammalian osteopontin, first described as a major non-collagenous bone protein. Among the many functions suggested for this protein since its discovery (reviewed, for instance, in [7, 8]) is also phosphorylation-dependent inhibition of mineralization processes . Removal of phospho groups by alkaline phosphatase significantly reduces its inhibitory potential in in vitro crystallization assays  and un-phosphorylated recombinant osteopontin, but not in vitro phosphorylated osteopontin, fails to inhibit mineralization of human smooth muscle cell cultures serving as a model for human vascular calcification . A crucial role of phosphorylated residues in the interaction with mineral is also reported for dentin matrix protein 1 and dentin phosphophoryn [12, 13]. The only invertebrate example so far is orchestin, a major matrix protein from crustacean calcium storage structures. Phosphorylation of orchestin is necessary for calcium binding of the protein .
The recently published genomes of biomineralizing organisms enable high-throughput mass spectrometry-based analysis of biomineral proteomes and phosphoproteomes, thus facilitating the fast identification of phosphoproteins and phosphorylation sites [15, 16]. In the present study we add the phosphoproteome of the Lottia gigantea shell matrix to the recently published Lottia shell proteomes [17, 18]. Furthermore, we have re-quantitated the Lottia shell proteome using the iBAQ (intensity-based absolute quantification) method  as implemented in MaxQuant. This showed that 57 proteins make up 98% of the total identified proteome. We suggest that quantitation allows the identification of major proteins, which are the most likely candidates for functional shell proteins, while retaining information about minor proteins, irrespective of whether these minor proteins play a role in mineralization or not, and irrespective of whether they occur intra- or extra-crystalline.
Materials and methods
Matrix and phosphopeptide preparation
Lottia shell matrix was prepared as previously described  using method B for shell cleaning (2 h sodium hypochlorite incubation with 2 × 5 min ultrasound treatment). Reduction, carbamidomethylation and enzymatic cleavage of matrix proteins were performed using a modification of the FASP (Filter-aided sample preparation) method  as outlined below. Two-mg aliquots of acid-soluble or acid-insoluble shell matrix were suspended in 300 μl of 0.1 M Tris, pH 8, containing 6 M guanidine hydrochloride and 0.01 M dithiothreitol (DTT). This mixture was heated to 56°C for 60 min, cooled to room temperature, and centrifuged at 13000 rpm in an Eppendorf bench-top centrifuge 5415D for 15 min. The supernatant was loaded into an Amicon Ultra 0.5 ml 30 K filter device (Millipore; Tullagreen, Ireland). DTT was removed by centrifugation at 13000 rpm for 15 min and washing with 2 × 1 vol of the same buffer. Carbamidomethylation was done in the device using 0.1 M Tris buffer, pH 8, containing 6 M guanidine hydrochloride and 0.05 mM iodoacetamide and incubation for 45 min in the dark. Carbamidomethylated proteins were washed with 0.05 M ammonium hydrogen carbonate buffer, pH 8, containing 2 M urea, and centrifugation as before. Trypsin (20 μg, Sequencing grade, modified; Promega, Madison, USA) was added in 40 μl of 0.05 M ammonium hydrogen carbonate buffer containing 2 M urea and the devices were incubated at 37°C for 16 h. Peptides were collected by centrifugation and the filters were washed twice with 40 μl of 0.05 M ammonium hydrogen carbonate buffer. The peptide solution was acidified to pH 1–2 with trifluoroacetic acid (TFA) and peptides were vacuum-dried in an Eppendorf concentrator.
Phosphopeptides were enriched by reversible binding to TiO2 beads (Titansphere 10 μm, GL Sciences, Japan) following established protocols  but substituting 2,5-dihydroxybenzoic acid in the loading buffer by 6% trifluoroacetic acid (TFA) . Briefly, beads were washed first in 80% acetonitrile containing 0.1% TFA (washing buffer), then in 80% acetonitrile containing 6% TFA (binding buffer). Peptides were dissolved in binding buffer (200 μl/peptides of 2 mg matrix) and added to approximately 5 mg of loosely pelleted TiO2 beads. The mixture was incubated on a rotating wheel for 45 min. After centrifugation the supernatant was again incubated with fresh TiO2 beads as before. The beads were then washed twice with 200 μl of binding buffer followed by 2 × 200 μl of washing buffer. Finally the loaded beads were filled into C8 Stage Tips and phosphopeptides were eluted with 2 × 100 μl of a solution containing 40% acetonitrile and 15% ammonia. The eluate was vacuum-dried in an Eppendorf concentrator to ~20 μl and acidified with TFA. The peptides were purified on C18 Stage Tips  after dilution to 200 μl with 0.5% acetic acid.
Phosphopeptide-enriched samples were analysed on a Q Exactive high-performance Quadrupole Orbitrap mass spectrometer (Thermo Fisher Scientific, Bremen, Germany)  connected to an Easy-nLC 1000 nanoflow HPLC system (Thermo Fisher Scientific). Peptides were separated on a 50 cm column with an inner diameter of 75 μm filled with 1.8 μm C18 beads (Reprosil-AQ Pur, Dr. Maisch GmbH, Ammerbuch, Germany) prepared as described . Peptides were eluted with acetonitrile in 0.1% formic acid using a gradient of 5-30% acetonitrile in 95 min, 30-60% in 30 min and 60-95% in 8 min at a flow of 250 nl/min and a column temperature of 50°C . Mass spectra were acquired in a data-dependent manner by automatically switching between MS and MS/MS in a top 10 approach. The resolution was 70000 for full spectra and 17500 (both at m/z 200) for HCD-derived fragments. The dynamic exclusion time was 30 s.
To estimate the percentage of each protein in the total identified shell proteome, raw files used in a previous study [17; method B] were re-analysed using the iBAQ (intensity-based absolute quantification) method  as implemented in MaxQuant version 18.104.22.168. Carbamidomethylation was set as fixed modification, variable modifications were acetyl (protein N-term), oxidation (M), pyro-Glu (Q,E) and phospho (STY). Maximal FDR for peptide spectral match, proteins and site was set to 0.01. The maximal peptide PEP was 0.01. Minimal peptide length was 7 amino acids. The minimal score for modified peptides was 50 and the minimal delta score for modified peptides was 17. A minimum of two sequence-unique peptides was required for identification, except for proteins that were identified with two or more unique peptides previously in separately analysed acid-soluble and acid-insoluble fractions . In very few cases new proteins were accepted with one unique peptide if this peptide occurred several times in different fractions and with an abundance of >0.01. The second peptide option was activated to enable identification of co-eluting peptides with very similar mass . Two missed cleavages were allowed. The databases used were Lottia FilteredModels (Lotgi1_GeneModels_FilteredModels1_aa.fasta.gz) and Lottia AllModels (Lotgi1_GeneModels_AllModels_20070424_aa.fasta.gz)  downloaded from http://jgi.doe.gov/, and a LOTGI subset of UniProtKB v2013_7 entries downloaded from http://www.uniprot.org/. These were supplemented with the reversed sequences and common contaminants automatically and used for quality control and FDR setting by MaxQuant. Phosphopeptides were accepted if they occurred at least twice or were confirmed by analysis of phosphopeptide-enriched samples.
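The role of the reversed sequences in setting the FDR can be illustrated with a generic target-decoy calculation. The following is a minimal sketch, not MaxQuant's actual procedure: the function names, the input layout of (score, is_decoy) pairs, and the decoys/targets estimate of the FDR are all assumptions made for illustration.

```python
from typing import List, Tuple


def reversed_decoy(sequence: str) -> str:
    """Build a decoy entry by reversing a protein sequence."""
    return sequence[::-1]


def score_cutoff_at_fdr(psms: List[Tuple[float, bool]], max_fdr: float = 0.01) -> float:
    """Return the lowest target score at which decoys/targets <= max_fdr.

    psms: hypothetical (score, is_decoy) pairs; a higher score is a better match.
    Returns float('inf') if no cutoff satisfies the limit.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    cutoff = float("inf")
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
            continue
        targets += 1
        # Estimated FDR among all matches scoring at least `score`.
        if decoys / targets <= max_fdr:
            cutoff = score
    return cutoff
```

With a limit of 0.01, as used in the search above, matches scoring below the returned cutoff would be rejected in this simplified scheme.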
Peptide mixtures for enrichment of phosphopeptides were prepared from three biological replicates prepared according to method B of . The acid-soluble and the acid-insoluble matrix of each biological replicate were used to prepare five technical replicates, resulting in 30 raw files that were evaluated together using MaxQuant [26, 28] version 22.214.171.124 with the same settings as above with a minimum of one sequence-unique phosphopeptide only, but sequenced at least twice and in different replicates. The decoy mode was set to revert in MaxQuant. Phosphopeptide spectra were validated using the MaxQuant Expert system, which provides additional fragment annotations not included in the routine annotation . Criteria were the assignment of major peaks, occurrence of uninterrupted y- or b-ion series of at least four consecutive amino acids, preferred cleavages N-terminal to proline bonds, the possible presence of a2/b2 ion pairs, the presence of immonium ions, and mass accuracy. In general only phosphopeptide identifications with a localization probability of ≥0.75 were accepted. However, in some cases adjacent residues, such as X(n)-S-S-X(n), could not be resolved with the fragmentation pattern of the respective phosphopeptides, making it impossible to exactly localize the phosphorylation site. As a result, lower localization probability scores were attributed to several residues. Such phosphopeptides were also accepted. Phospho sites were searched for known kinase motifs using Phosida Motif Matcher (http://www.phosida.com/) [30, 31] and PhosphoMotif Finder (http://www.hprd.org/PhosphoMotif_finder) . Most sequence-unique peptides were identified several times and site occupancy of phospho sites was estimated by comparing the number of unmodified to the number of phosphorylated forms of individual peptides.
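The count-based occupancy estimate and the localization filter described above can be sketched as follows. This is one plausible reading of the text, not the authors' exact calculation: the ratio phosphorylated / (phosphorylated + unmodified) and both function names are assumptions.

```python
def site_occupancy(n_phospho: int, n_unmodified: int) -> float:
    """Approximate occupancy from identification counts of one peptide.

    Assumption: occupancy = phosphorylated / (phosphorylated + unmodified).
    """
    total = n_phospho + n_unmodified
    if total == 0:
        raise ValueError("peptide was never identified")
    return n_phospho / total


def passes_localization(probability: float, threshold: float = 0.75) -> bool:
    """Apply the general acceptance rule of localization probability >= 0.75."""
    return probability >= threshold
```

For example, a peptide seen three times phosphorylated and nine times unmodified would give `site_occupancy(3, 9) == 0.25`. Sites in unresolved adjacent residues, which the text accepts despite lower scores, would need a manual exception to `passes_localization`.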
Sequence similarity searches were performed with FASTA (http://www.ebi.ac.uk/Tools/sss/fasta/)  against current releases of the Uniprot Knowledgebase (UniProtKB). Other bioinformatics tools used were Clustal Omega for sequence alignments (http://www.ebi.ac.uk/Tools/msa/clustalo/) , InterPro (http://www.ebi.ac.uk/interpro)  for domain predictions, and SignalP 4.1 (http://www.cbs.dtu.dk/services/SignalP/)  for signal sequence prediction. Amino acid composition and theoretical pI were determined using the ProtParam tool provided by the Expasy server (http://web.expasy.org/protparam/) . Intrinsically disordered protein structure was predicted using IUPred (http://iupred.enzim.hu/)  and methods provided by the PredictProtein 2013 server (https://www.predictprotein.org/) [39, 40]. GO categories for subcellular location were derived from UniProt and Lottia database entries, signal sequence predictions and similarity to known proteins.
Results and discussion
Re-analysis and re-quantitation of Lottia shell proteins with MaxQuant-implemented iBAQ
In search of the reasons for apparent differences in previously published Lottia shell proteomes [17, 18] we noticed that database searches were done using the AllModels database in  while  used the FilteredModels database containing entries supported by EST sequences. Therefore we re-analyzed the raw files produced previously for acid-soluble and acid-insoluble matrix prepared according to method B  (also used to identify phosphoproteins in the present report) using a combination of both databases and a subset of Uniprot containing Lottia + gigantea entries. Furthermore, to determine the approximate abundances of the identified proteins, the iBAQ (intensity-based absolute quantification) method  as implemented in more recent MaxQuant versions was enabled in this search. The previously used  emPAI method  belongs to the spectral count methods based on counting the number of identified unique parent ions per protein. In contrast, iBAQ and similar algorithms are called intensity-based because they calculate the sum of parent ion intensities of identified peptides per protein. In both types of methods, the numbers of theoretically possible peptides per protein for the protease used in sample preparation enter the equation to account for different protein lengths and distribution and frequency of cleavage sites. Comparison of the two different types of methods shows a higher accuracy of the intensity-based methods, including iBAQ (for instance ), indicating that they should be given preference. Furthermore, the emPAI method in its original form  as we used it has become somewhat obsolete because of the recent progress in technology. For instance, modern mass spectrometers and the associated software provide high-confidence identifications of much longer peptides than previously possible. Consequently these long peptides are not included in emPAI calculations , but are included in iBAQ calculation.
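The difference between the two approaches can be made concrete with their commonly cited definitions. The sketch below uses the published forms, emPAI = 10^(observed/observable peptides) − 1 and iBAQ = summed peptide intensities divided by the number of theoretically observable peptides. The function names and input values are hypothetical, and MaxQuant's internal handling (for instance, which peptide lengths count as observable) is not reproduced here.

```python
from typing import Iterable


def empai(n_observed: int, n_observable: int) -> float:
    """Spectral-count style index: 10 ** (observed / observable) - 1."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1


def ibaq(peptide_intensities: Iterable[float], n_theoretical: int) -> float:
    """Intensity-based value: summed peptide intensities / theoretical peptides."""
    if n_theoretical <= 0:
        raise ValueError("n_theoretical must be positive")
    return sum(peptide_intensities) / n_theoretical
```

Note that `empai` depends only on how many peptides were seen, whereas `ibaq` grows with the measured signal, which is why one intense, repeatedly identified peptide changes the second value but not the first.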
Irrespective of the quantitation method accurate quantitation certainly also depends on the quality and completeness of the available sequence databases. Sequences not contained in the database can be neither identified by high-throughput mass spectrometry-based proteomic analysis nor quantitated. The same applies to sequences having no cleavage sites for the protease used in sample preparation. Faulty combination of sequences belonging to different proteins into one database entry or unnoticed faulty allocation of fragments of one protein to different database entries can all bias quantitation results. Finally, the abundance of proteins bearing many posttranslational modifications will be underestimated if the modification is not included in the analysis. In spite of these caveats we believe that routine quantitation of proteins in in-depth proteomic studies may be a useful tool to identify possible functionally important proteins for further study. We express the abundances as percentage of the identified proteome, obtained by normalizing the iBAQ intensities to the sum of all intensities. While the decision what to count as a major protein or a minor protein still remains arbitrary, it may now be more comprehensible to the reader and will possibly facilitate the decision of which proteins to choose for further studies.
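The normalization described above, each iBAQ intensity divided by the sum of all intensities, can be written in a few lines. This is a minimal sketch; the function name and the dictionary layout (protein identifier mapped to iBAQ intensity) are assumptions, and the values in the usage note are invented.

```python
from typing import Dict


def percent_of_proteome(ibaq_by_protein: Dict[str, float]) -> Dict[str, float]:
    """Express each iBAQ intensity as a percentage of the summed intensities."""
    total = sum(ibaq_by_protein.values())
    if total <= 0:
        raise ValueError("summed iBAQ intensity must be positive")
    return {name: 100.0 * value / total for name, value in ibaq_by_protein.items()}
```

Calling `percent_of_proteome({"A": 6.0, "B": 3.0, "C": 1.0})` yields 60%, 30% and 10%. The percentages refer only to the identified proteome, so any protein missing from the database is absent from the denominator as well.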
The results of this new search (Additional file 1: Table S1) now include all proteins published by  and contain 496 proteins/protein groups. Of these, 382 protein/protein group identifications were accepted (Additional file 2: Table S2) according to the rules stated in the Materials and Methods section. Twenty-three proteins were identified in the AllModels database only or in combination with the UniProt entries, including several very abundant ones (Table 1). Many groups contained several AllModels entries testifying to the high redundancy in this database. The corresponding MaxQuant table with protein data is contained in Additional file 1 (Additional file 1: Table S1), which also includes identifications not accepted. These were, for instance, identifications with only one single peptide with low scores or insufficient sequence coverage. The peptide data of the more than 4000 sequence-unique peptides, including peptide sequences and scores, are shown in Additional file 3 (Additional file 3: Table S3).
Quantitation with iBAQ showed that only 18 proteins/protein groups of a percentage of more than 1% of the identified proteome already constituted approximately 82% of the entire identified proteome (Table 1). This group comprised two very abundant (>1%) proteins not contained in the FilteredModels database, the Asp-, Gly-, Lys- and Ser-rich peroxidase-like protein-1 (DGLSP_LOTGI/Lotgi1|162078) and the Gly- and Ser-rich protein-1 (GSP1_LOTGI/Lotgi1|239214) . If a percentage of larger than 0.1% was chosen as a threshold, a total of 57 proteins (Table 1) amounted to approximately 98% of the total identified proteome. These included CCD2 (coiled-coil domain-containing protein 2; Lotgi1|234936), the perlwapin-like protein PWAP_LOTGI/Lotgi1|239121, and the EGF-like domain-containing protein 2 (ELDP2/Lotgi1|167423) , which were contained in the AllModels database but not in the FilteredModels database. Almost all proteins also identified in  were contained in this fraction of the proteome. Exceptions were the EF-hand calcium-binding domain-containing protein 1 and 2 (EFCB1/B3A0Q5, EFCB2/B3A0R9), and Threonine-rich protein LUSP-15/TRP/B3A0R4, which apparently belonged to the minor components of the identified proteome (Additional file 2: Table S2). However, we also identified several entries with a high similarity to EFCB2 based on sequence overlaps with sequence identities of 43-90% (Figure 1). Taken together, this protein family constituted slightly more than 0.1% of the identified proteome.
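Selecting the major proteins at a given abundance threshold, and reporting how much of the identified proteome they cover, is a simple filter-and-sum step. The sketch below assumes abundances are already expressed in percent; the function name and the example values are hypothetical and do not come from Table 1.

```python
from typing import Dict, Tuple


def major_proteins(percent_by_protein: Dict[str, float],
                   threshold: float) -> Tuple[Dict[str, float], float]:
    """Return proteins above `threshold` percent and their combined share."""
    major = {name: pct for name, pct in percent_by_protein.items() if pct > threshold}
    return major, sum(major.values())


# Invented abundances in percent of the identified proteome.
example = {"P1": 50.0, "P2": 30.0, "P3": 0.5, "P4": 0.05}
above_1, share_1 = major_proteins(example, 1.0)     # P1 and P2
above_01, share_01 = major_proteins(example, 0.1)   # P1, P2 and P3
```

Applied to the real data, thresholds of 1% and 0.1% correspond to the groups of 18 and 57 proteins discussed above.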
In agreement with a previous study  the major proteins comprised three peroxidase-like proteins (Table 1) including the most abundant protein Lotgi|162078/DGLSP_LOTGI. Peroxidases are a large and widespread family of enzymes catalysing redox reactions using a variety of electron donors and acceptors, including organic molecules. Peroxidases have been implicated previously in mollusc shell formation . Possibly they are responsible for the sclerotization of the periostracum [44–46], a proteinaceous layer confining the mantle cavity before the start of mineralization. As discussed previously  one may hypothesize that peroxidases function in stabilization of the newly secreted matrix by cross-linking some of its components. Another major protein, the abundance of which was noticed only using the AllModels database because the FilteredModels only contained a small fragment, was Lotgi1|166131. In this protein a long stretch of sequence with predicted disordered structure is followed by a predicted superoxide dismutase domain. Superoxide dismutases are a family of enzymes with widespread subcellular distribution that remove superoxide, a normal aerobic metabolite. One reaction product of superoxide dismutases is H2O2, a substrate of peroxidases.
In general, very little is known about the possible functions of shell matrix proteins, but in some cases similarities to known proteins and predicted domain structures may provide some clues for further studies. Predicted domain structures, GO terms for subcellular location, unusual amino acid composition features (amino acids representing ≥ 10% of the sequence) and theoretical isoelectric point for major identified Lotgi entries are included in Table 1. Extremely acidic matrix proteins (pI below 4.5) have found much interest in biomineralization research because of the possibility of direct interaction with the positively charged biomineral cations and have been hypothesized to act as nucleation sites involved in crystal formation . The group of 57 proteins with an abundance of >0.1% includes eight such uncharacterized, unusually acidic proteins (Table 1) that may deserve to be studied in more detail. Many proteins isolated from biominerals contain sequence regions of intrinsically disordered structure, a feature that is implicated in protein-protein interaction and mineral binding [48, 49]. Table 1 includes several proteins with extended sequence regions of predicted disordered structure, such as the peroxidase-like protein-1 (DGLSP_LOTGI), the methionine-rich protein MRP_LOTGI, the peroxidase-like protein 3 (PLSP3_LOTGI), and the uncharacterized proteins in Lotgi1|163637, 159331, 235610, 234884, 171084, 158316, 236690, and 239574. In two sequences both features, unusual acidity and predicted long-range structural disorder, coincide (Lotgi|159331, 171084). However, like all predicted features, predicted structural disorder needs experimental validation before far-reaching conclusions can be drawn.
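The composition criterion used for Table 1, amino acids representing at least 10% of the sequence, is easy to reproduce for any entry. The following is a minimal sketch with a hypothetical function name and an invented test sequence; it does not compute the theoretical pI, which was taken from ProtParam.

```python
from collections import Counter
from typing import Dict


def enriched_residues(sequence: str, min_fraction: float = 0.10) -> Dict[str, float]:
    """Return residues whose share of the sequence is at least `min_fraction`."""
    if not sequence:
        raise ValueError("empty sequence")
    counts = Counter(sequence.upper())
    length = len(sequence)
    return {aa: n / length for aa, n in counts.items() if n / length >= min_fraction}


# Invented 20-residue sequence: D 40%, G 30%, S 20%, K 5%, A 5%.
composition = enriched_residues("DDDDDDDDGGGGGGSSSSKA")
```

Here `composition` contains only D, G and S, the kind of Asp-, Gly- and Ser-rich signature reflected in names such as DGLSP.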
Sometimes predicted domains strongly indicate involvement of the respective protein in biomineralization events. The putative carbonic anhydrases encoded in Lotgi|238082/CAH1 and Lotgi|239188/CAH2 and discussed previously  may be important for carbonate ion delivery. Also of special interest are proteins containing chitin-binding domains, such as Lotgi1|226726, 228264, and 239574. Many mollusc shells contain chitin-based extra-crystalline scaffolds and chitin-binding proteins may be important for organizing such scaffolds or may mediate interactions between chitin and the calcified matrix . However, for most proven and putative shell matrix proteins the function remains unknown at present.
Most of the identified proteins were only minor, or trace, components that may not have a function in biomineralization. However, it should be emphasised that there may be exceptions. For example, protein FAM20C (0.006% of the Lottia shell proteome; Additional file 2: Table S2), was recently identified as a Golgi apparatus kinase responsible for the phosphorylation of many secreted proteins, including proteins important for biomineralization [3, 4]. This kinase is also secreted to some degree, may be active in the extracellular space , and may enter biominerals in the company of its substrates. Of course this does not imply any function within the matrix but may explain its presence there. Other examples of the possible importance of trace components for biomineral formation are the sea urchin spicule proteins P58-A and P58-B. The extracellular domains of these predicted transmembrane proteins were detected as minor components in sea urchin spicule matrix  and both were subsequently shown by knock-down experiments to play an essential role in sea urchin larval skeletogenesis . Also among the trace components are proteins known to have a predominantly intracellular location, such as cytoskeletal components and cytosolic enzymes (Additional file 2: Table S2). We think that these proteins do not have a function in biomineralization. However, even trace components with a well-defined intracellular role, such as ubiquitin (now also known to occur in the extracellular space, however ) may have a true role in biomineralization, such as in the matrix of the Pinctada fucata shell prismatic layer . Finally it should be considered that the number of up-regulated genes, for instance after shell damage , is usually much larger than the number of major proteins identified in shell matrices. Possibly many of the trace proteins reflect regulatory or catalytic processes involved in the mineralization event at some point.
Because of the low number of different proteins in the shell matrix and because the HCD (higher energy collisional dissociation) fragmentation method used in the previous shell proteome analysis  enables phosphopeptide analysis at high resolution and mass accuracy in the LTQ Orbitrap Velos [56, 57] without the need for neutral loss-dependent MS3 or multistage activation  used previously with CID fragmentation, we included phosphorylation as a variable modification in this re-analysis. The results indicated (Additional file 1: Table S1) that several major and a few minor proteins were phosphorylated to a variable extent. These preliminary results were validated by analysis of phosphopeptide-enriched samples of shell matrix proteins (Additional file 4: Table S4). Thirteen of these were confirmed by analyzing phosphopeptide-enriched fractions. Three more were identified only in phosphopeptide-enriched samples (Additional file 4: Table S4), yielding a total of 20 phosphoproteins. The MaxQuant phosphopeptide output table is shown in Additional file 5: Table S5. Nine major proteins with a percentage of more than 1% of the identified proteome and five with a percentage between 0.1% and 1% (Table 1) were identified as phosphoproteins. Simultaneous determination of phosphorylated and non-phosphorylated versions of the phosphopeptides in the general survey without prior enrichment enabled an approximate estimation of site occupancy (Additional file 4: Table S4), which was very low in most cases. Site occupancy in the group of major proteins was highest in GEPRP/B3A0P5 and the uncharacterized protein of Lotgi1|154020. While GEPRP contained only two closely spaced phosphorylation sites, Lotgi1|154020 contained four sites in three peptides (Additional file 4: Table S4). This high site-occupancy strongly indicates that phosphorylation of these proteins may be functionally important.
Three proteins, DGLSP/B3A0P1, PLSP2/B3A0P3 and CCD1/B3A0Q3 yielded more than three phosphopeptides with variable site-occupancy (Additional file 4: Table S4). Of these, Coiled-coil domain-containing protein 1 (CCD1)/B3A0Q3 was already shown to be extremely acidic previously , a feature that is enhanced by phosphorylation. This may be taken as a further indication of a very important, but as yet not understood, role of this protein in Lottia shell assembly.
Taking into account the number of phosphorylation sites and site occupancy, CCD1/B3A0Q3 may be considered as the major phosphoprotein of the Lottia gigantea shell matrix. We want to point out, however, that densely phosphorylated proteins with highly repetitive sequences, such as dentin phosphophoryn, which contains almost exclusively aspartic acid, asparagine and phosphoserine , require special techniques to be identified and may be missing from our analysis.
A search for sequences including phospho sites for known kinase motifs indicated that approximately one third (16 of 46) of the unique S/T phospho sites comply with the Fam20C recognition site S-x-E or related motifs (S/T-x-E/D/pS/pT) [3, 4]. This percentage is in good agreement with the approximately 24% of human secreted phosphoproteins modified at the serine of the canonical FAM20C motif S-x-E . However, much less is known about phosphorylation in invertebrate secreted proteins and the kinases involved. Therefore it is unknown whether these recognition sites are conserved between vertebrates and invertebrates. Five of the sites identified are in agreement with the typical casein kinase 2 motif S-x-x-E also modified in the mammalian mineralization-inhibiting protein osteopontin, and ten sites comply with the casein kinase 1 motif (D/E)n-x-x-S/T  indicating that secreted or membrane-bound kinases with casein-kinase-like activity are involved. Evidence for such kinases is summarized in [5, 6].
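Checking whether a given phospho site fits one of the motifs named above can be sketched with regular expressions. This is an illustrative stand-in for the web tools used in the study, not their implementation: the motif table, the function name and the example sequence are hypothetical, the CK1 pattern assumes a single acidic residue (n = 1), and the phosphorylation-dependent variants (pS/pT at the +2 position) are not modelled.

```python
import re
from typing import List

# name -> (compiled pattern, offset of the phospho-acceptor within the pattern)
MOTIFS = {
    "FAM20C S-x-E": (re.compile(r"S.E"), 0),
    "CK2 S-x-x-E": (re.compile(r"S..E"), 0),
    "CK1 (D/E)-x-x-S/T": (re.compile(r"[DE]..[ST]"), 3),
}


def motifs_at_site(sequence: str, site: int) -> List[str]:
    """List motifs whose acceptor position coincides with `site` (0-based)."""
    hits = []
    for name, (pattern, offset) in MOTIFS.items():
        start = site - offset
        # Pattern.match(string, pos) anchors the match at index `pos`.
        if start >= 0 and pattern.match(sequence, start):
            hits.append(name)
    return hits
```

For the invented sequence `"AASGEDAAS"`, the serine at index 2 fits S-x-E, while the serine at index 8 fits only the CK1-type pattern because of the aspartate three residues upstream.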
Our approach to proteomes of invertebrate biominerals consists of washing the biominerals with hypochlorite in a less stringent way than proposed recently  to preserve extra-crystalline matrix components, and to identify as many proteins as possible after in-gel digestion of slices of the entire gel  irrespective of staining intensity, or after in-solution digestion using filter-aided sample preparation (FASP) . Included in protein identification is quantitation, which was done using exponentially modified protein abundance index (emPAI)  previously , but was recently superseded  in favor of the more accurate automated iBAQ method  as implemented in more recent versions of MaxQuant. We believe that this approach is well suited to identify candidates for functional matrix proteins, most likely found among the most abundant components, while retaining all of the information about trace components, irrespective of whether these may have a function in biomineralization or not, and irrespective of whether they are intra-crystalline or belong to the extra-crystalline matrix. Proteins predominantly located intracellularly, such as cytoskeletal components, ribosomal proteins, proteasome subunits or cytoplasmic enzymes, belong to the minor components of the Lottia shell proteome (Additional file 2: Table S2) constituting only an insignificant fraction of the total. However, the identification and quantitation of such proteins may also depend in some way on the biomineral examined, the instrumentation used, and the washing procedures applied to the shell and we agree with others [59, 61] that the mere presence of such proteins in the matrix sample does certainly not imply a function.
The group of major proteins also contains several phosphoproteins. Those yielding high-occupancy phospho sites and/or many phosphorylated sequence-unique peptides were already identified without prior phosphopeptide enrichment in a general survey. However, subtleties such as the occurrence of different sites with high localization probability within one peptide sequence (Figure 2) are more likely detected with the higher copy numbers usually provided by phosphopeptide-enriched samples. Nevertheless, inclusion of phosphorylation among the variable modifications in general studies of low complexity proteomes may give an overview of what to expect with phosphopeptide-enriched samples and may provide a rough estimate of phospho site occupancies.
Abbreviations
emPAI: Exponentially modified protein abundance index
FDR: False discovery rate
HCD: Higher-energy collision-induced decomposition
iBAQ: Intensity-based absolute quantification
PEP: Posterior error probability
Veis A, Sfeir C, Wu CB: Phosphorylation of the proteins of the extracellular matrix of mineralized tissues by casein kinase-like activity. Crit Rev Oral Biol Med 1997, 8: 360–379.
George A, Veis A: Phosphorylated proteins and control over apatite nucleation, crystal growth, and inhibition. Chem Rev 2008, 108: 4670–4693.
Tagliabracci VS, Engel JL, Wen J, Wiley SE, Worby CA, Kinch LN, Xiao J, Grishin NV, Dixon JE: Secreted kinase phosphorylates extracellular proteins that regulate biomineralization. Science 2012, 336: 1150–1153.
Ishikawa HO, Xu A, Ogura E, Manning G, Irvine KD: The Raine syndrome protein FAM20C is a Golgi kinase that phosphorylates biomineralization proteins. PLoS One 2012, 7: e42988.
Tagliabracci VS, Pinna LA, Dixon JE: Secreted protein kinases. Trends Biochem Sci 2013, 38: 121–130.
Yalak G, Vogel V: Extracellular phosphorylation and phosphorylated proteins: not just curiosities but physiologically important. Sci Signal 2012, 5: re7.
Sodek J, Ganss B, McKee MD: Osteopontin. Crit Rev Oral Biol Med 2000, 11: 279–303.
Gimba ER, Tilli TM: Human osteopontin splicing isoforms: known roles, potential clinical applications and activated signaling pathways. Cancer Lett 2013, 331: 11–17.
Staines AK, MacRae VE, Farquharson C: The importance of the SIBLING family of proteins on skeletal mineralization and bone remodeling. J Endocrinol 2012, 214: 241–255.
Hunter GK, Kyle CL, Goldberg HA: Modulation of crystal formation by bone phosphoproteins: structural specificity of the osteopontin-mediated inhibition of hydroxyapatite formation. Biochem J 1994, 300: 723–728.
Jono S, Peinado C, Giachelli CM: Phosphorylation of osteopontin is required for inhibition of vascular smooth muscle cell calcification. J Biol Chem 2000, 275: 20197–20203.
He G, Ramachandran A, Dahl T, George S, Schultz D, Cookson D, Veis A, George A: Phosphorylation of phosphophoryn is crucial for its function as a mediator of biomineralization. J Biol Chem 2005, 280: 33109–33114.
Deshpande AS, Fang P-A, Zhang X, Jayaraman T, Sfeir C, Beniash E: Primary structure and phosphorylation of dentin matrix protein 1 (DMP1) and dentin phosphoryn (DPP) uniquely determine their role in biomineralization. Biomacromolecules 2011, 12: 2933–2945.
Hecker A, Testenière O, Marin F, Luquet G: Phosphorylation of serine residues is fundamental for the calcium-binding ability of orchestin, a soluble matrix protein from crustacean calcium storage structures. FEBS Lett 2003, 535: 49–54.
Mann K, Olsen JV, Maček B, Gnad F, Mann M: Phosphoproteins of the chicken eggshell calcified layer. Proteomics 2007, 7: 106–115.
Mann K, Poustka AJ, Mann M: Phosphoproteomes of Strongylocentrotus purpuratus shell and tooth matrix: identification of a major acidic sea urchin tooth phosphoprotein, phosphodontin. Proteome Sci 2010, 8: 6.
Mann K, Edsinger-Gonzales E, Mann M: In-depth proteomic analysis of a mollusk shell: acid-soluble and acid-insoluble matrix of the limpet Lottia gigantea. Proteome Sci 2012, 10: 28.
Marie B, Jackson DJ, Ramos-Silva P, Zanella-Cleon I, Guichard N, Marin F: The shell-forming proteome of Lottia gigantea reveals both deep conservation and lineage-specific novelties. FEBS J 2013, 280: 214–232.
Schwanhäusser B, Busse D, Li N, Dittmar G, Schuchhardt J, Wolf J, Chen W, Selbach M: Global quantification of mammalian gene expression control. Nature 2011, 473: 337–342.
Wisniewski JR, Zougman A, Nagaraj N, Mann M: Universal sample preparation method for proteome analysis. Nat Methods 2009, 6: 359–362.
Larsen MR, Thingholm TE, Jensen ON, Roepstorff P, Jorgensen TJD: Highly selective enrichment of phosphorylated peptides from peptide mixtures using titanium dioxide microcolumns. Mol Cell Proteomics 2005, 4: 873–886.
Zhou H, Low TY, Hennrich ML, Van der Toorn H, Schwendt T, Zou H, Mohammed S, Heck AJR: Enhancing the identification of phosphopeptides from putative basophilic kinase substrates using Ti (IV) based IMAC enrichment. Mol Cell Proteomics 2011, 10: 1–14.
Rappsilber J, Mann M, Ishihama Y: Protocol for micro-purification, enrichment, pre-fractionation and storage of peptides for proteomics using StageTips. Nat Protoc 2007, 2: 1896–1906.
Michalski A, Damoc E, Hauschild J-P, Lange O, Wieghaus A, Makarov A, Nagaraj N, Cox J, Mann M, Horning S: Mass spectrometry-based proteomics using Q Exactive, a high-performance benchtop quadrupole orbitrap mass spectrometer. Mol Cell Proteomics 2011, 10: 1–11.
Thakur SS, Geiger T, Chatterjee B, Bandilla P, Fröhlich F, Cox J, Mann M: Deep and highly sensitive proteome coverage by LC-MS/MS without prefractionation. Mol Cell Proteomics 2011, 10: 1–9.
Cox J, Neuhauser N, Michalski A, Scheltema RA, Olsen JV, Mann M: Andromeda – a peptide search engine integrated into the MaxQuant environment. J Proteome Res 2011, 10: 1794–1805.
Simakov O, Marletaz F, Cho SJ, Edsinger-Gonzales E, Havlak P, Hellsten U, Kuo DH, Larsson T, Lv J, Arendt D, Savage R, Osoegawa K, de Jong P, Grimwood J, Chapman JA, Shapiro H, Aerts A, Otillar RP, Terry AY, Boore JL, Grigoriev IV, Lindberg DR, Seaver EC, Weisblat DA, Putnam NH, Rokhsar DS: Insights into bilaterian evolution from three spiralian genomes. Nature 2013, 493: 526–531.
Cox J, Mann M: MaxQuant enables high peptide identification rates, individualized ppb-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol 2008, 26: 1367–1372.
Neuhauser N, Michalski A, Cox J, Mann M: Expert system for computer-assisted annotation of MS/MS spectra. Mol Cell Proteomics 2012, 11: 1500–1509.
Gnad F, Ren S, Cox J, Olsen JV, Macek B, Oroshi M, Mann M: PHOSIDA (phosphorylation site database): management, structural and evolutionary investigation and prediction of phospho sites. Genome Biol 2007, 8: R250.
Gnad F, Gunawardena J, Mann M: PHOSIDA 2011: the posttranslational modification database. Nucleic Acids Res 2011, 39(Suppl 1): D253–D260.
Amanchy R, Periaswamy B, Mathivanan S, Reddy R, Tattikota SG, Pandey A: A compendium of curated phosphorylation-based substrate and binding motifs. Nat Biotechnol 2007, 25: 285–286.
Goujon M, McWilliam H, Li W, Valentin F, Squizzato S, Paern J, Lopez R: A new bioinformatics analysis tools framework at EMBL-EBI (2010). Nucleic Acids Res 2010, 38(Suppl): W695–W699.
Sievers F, Wilm A, Dineen DG, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, Söding J, Thompson JD, Higgins DG: Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol 2011, 7: 539.
Hunter S, Jones P, Mitchell A, Apweiler R, Attwood TK, Bateman A, Bernard T, Binns D, Bork P, Burge S, de Castro E, Coggill P, Corbett M, Das U, Daugherty L, Duquenne L, Finn RD, Fraser M, Gough J, Haft D, Hulo N, Kahn D, Kelly E, Letunic I, Lonsdale D, Lopez R, Madera M, Maslen J, McAnulla C, McMenamin C, et al.: InterPro in 2011: new developments in the family and domain prediction database. Nucleic Acids Res 2012, 40: D306–D312.
Petersen TN, Brunak S, von Heijne G, Nielsen H: SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods 2011, 8: 785–786.
Gasteiger E, Hoogland C, Gattiker A, Duvaud S, Wilkins MR, Appel RD, Bairoch A: Protein Identification and Analysis Tools on the ExPASy Server. In The Proteomics Protocols Handbook. Edited by: Walker JM. Humana Press; 2005:571–607.
Dosztányi Z, Csizmók V, Tompa P, Simon I: IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics 2005, 21: 3433–3434.
Rost B, Yachdav G, Liu J: The PredictProtein server. Nucleic Acids Res 2004, 32: W321–W326.
Schlessinger A, Punta M, Yachdav G, Kajan L, Rost B: Improved disorder prediction by combination of orthogonal approaches. PLoS One 2009, 4: e4433.
Ishihama Y, Oda Y, Tabata T, Sato T, Nagasu T, Rappsilber J, Mann M: Exponentially modified protein abundance index (emPAI) for estimation of absolute protein amount in proteomics by the number of sequenced peptides per protein. Mol Cell Proteomics 2005, 4: 1265–1272.
Ahrne E, Molzahn L, Glatter T, Schmidt A: Critical assessment of proteome-wide label-free absolute abundance estimation strategies. Proteomics 2013, 13: 2567–2578.
Timmermans LPM: Studies on shell formation in molluscs. Netherlands J Zool 1969, 19: 417–523.
Waite JH: Evidence for the mode of sclerotization in a molluscan periostracum. Comp Biochem Physiol 1977, 58B: 157–162.
Marxen JC, Witten PE, Fincke D, Reelsen O, Rezgaoui M, Becker W: A light- and electron microscopic study of enzymes in the embryonic shell-forming tissue of the freshwater snail, Biomphalaria glabrata. Invertebrate Biol 2003, 122: 313–325.
Hohagen J, Jackson DJ: An ancient process in a modern mollusc: early development of the shell in Lymnaea stagnalis. BMC Dev Biol 2013, 13: 27.
Marin F, Luquet G: Unusually Acidic Proteins In Biomineralization. In Handbook of Biomineralization, Volume 1. Edited by: Bäuerlein E. Weinheim: Wiley-VCH Verlag; 2007:273–290.
Evans JS: Aragonite-associated biomineralization proteins are disordered and contain interactive motifs. Bioinformatics 2012, 28: 3182–3185.
Wojtas M, Dobryszycki P, Ozyhar A: Intrinsically Disordered Proteins in Biomineralization. In Advanced Topics in Biomineralization. Edited by: Jong S. Intech; 2012. Chapter 1. http://www.intechopen.com/books/advanced-topics-in-biomineralization
Furuhashi T, Schwarzinger C, Miksik I, Smrz M, Beran A: Molluscan shell evolution with review of shell calcification hypothesis. Comp Biochem Physiol 2009, 154B: 351–371.
Mann K, Wilt FH, Poustka AJ: Proteomic analysis of sea urchin (Strongylocentrotus purpuratus) spicule matrix. Proteome Sci 2010, 8: 33.
Adomako-Ankomah A, Ettensohn CA: P58-A and P58-B: novel proteins that mediate skeletogenesis in the sea urchin embryo. Dev Biol 2011, 353: 81–93.
Saini V, Marchese A, Majetschak M: CXC chemokine receptor 4 is a cell surface receptor for extracellular ubiquitin. J Biol Chem 2010, 285: 15566–15576.
Fang D, Pan C, Lin H, Lin Y, Xu G, Zhang G, Wang H, Xie L, Zhang R: Ubiquitylation functions in the calcium carbonate biomineralization in the extracellular matrix. PLoS One 2012, 7: e35715.
Wang X, Li L, Zhu Y, Du Y, Song X, Chen Y, Huang R, Que H, Zhang G: Oyster shell proteins originate from multiple organs and their probable transport pathway to the shell formation front. PLoS One 2013, 8: e66522.
Nagaraj N, D’Souza RCJ, Cox J, Olsen JV, Mann M: Feasibility of large-scale phosphoproteomics with higher energy collisional dissociation fragmentation. J Proteome Res 2010, 9: 6786–6794.
Nagaraj N, D’Souza RCJ, Cox J, Olsen JV, Mann M: Correction to feasibility of large-scale phosphoproteomics with higher energy collisional dissociation fragmentation. J Proteome Res 2012, 11: 3506–3508.
Maček B, Mann M, Olsen JV: Global and site-specific quantitative phosphoproteomics: principles and applications. Annu Rev Pharmacol Toxicol 2009, 49: 199–221.
Ramos-Silva P, Marin F, Kaandorp J, Marie B: Biomineralization toolkit: the importance of sample cleaning prior to the characterization of biomineral proteomes. Proc Natl Acad Sci U S A 2013, 110: E2144–E2146.
Mann K, Mann M: The proteome of the calcified layer organic matrix of turkey (Meleagris gallopavo) eggshell. Proteome Sci 2013, 11: 40.
Marie B, Ramos-Silva P, Marin F, Marie A: Proteomics of CaCO3 biomineral-associated proteins: how to properly address their analysis. Proteomics 2013, 13: 3109–3116.
We gratefully acknowledge the support of this study by Matthias Mann (MPI of Biochemistry, Martinsried). We also thank Fred H. Wilt, Department of Molecular and Cell Biology, University of California, Berkeley, for drawing KM’s attention to the Lottia genome project and for bringing KM and EE into contact. Furthermore, we thank Gaby Sowa (MPI) for preparing the capillary columns and Korbinian Mayr and Igor Paron (both MPI) for keeping the mass spectrometers in excellent condition.
The authors declare that they have no competing interests.
KM conceived the study and performed sample preparation and data acquisition. EE collected and mechanically cleaned Lottia shells and helped with database search and annotation. Both authors took part in the design of the study, were critically involved in manuscript drafting, and read and approved the final manuscript.