Hen's egg white has been the subject of intensive chemical, biochemical and food technological research for many decades because of its importance in human nutrition, its value as a source of easily accessible model proteins, and its potential use in biotechnological processes. Recently, the arsenal of tools used to study the protein components of egg white has been complemented by mass spectrometry-based proteomic technologies. Application of these fast and sensitive methods has already enabled the identification of a large number of new egg white proteins. Recent technological advances may be expected to further expand the egg white protein inventory.
Using a dual pressure linear ion trap Orbitrap instrument, the LTQ Orbitrap Velos, in conjunction with data analysis in the MaxQuant software package, we identified 158 proteins in chicken egg white with two or more sequence unique peptides. This group of proteins identified with very high confidence included 79 proteins identified in egg white for the first time. In addition, 44 proteins were identified tentatively.
Our results, apart from identifying many new egg white components, indicate that current mass spectrometry technology is sufficiently advanced to permit direct identification of minor components of proteomes dominated by a few major proteins without resorting to indirect techniques, such as chromatographic depletion or peptide library binding, which change the composition of the proteome.
The avian egg white functions as a shock-absorber, keeps the yolk in place, constitutes an antimicrobial barrier, and provides water, protein and other nutrients to the developing embryo. Besides these biological roles it is an inexpensive source of high quality protein for the food industry, and it contains proteins of pharmaceutical interest as well as proteins that have found widespread use in biomedical research and protein chemistry [1–6]. Therefore, it is no surprise that egg white has been the target of proteomic studies previously. Raikos et al. used 2D electrophoresis to separate the proteins and MALDI-TOF-based peptide mass fingerprinting to analyze the spots. Seven proteins were identified. 2D electrophoresis, peptide mass fingerprinting and LC-MS/MS using a quadrupole-TOF mass spectrometer were used to identify sixteen proteins in a more advanced study. We have reported the high confidence identification of 78 proteins in egg white using a workflow consisting of SDS-PAGE to separate proteins, coupled to LC-MS/MS and MS3 with an LTQ-FT mass spectrometer. The use of combinatorial hexapeptide libraries in conjunction with LC-ESI-IT-MS/MS allowed the identification of 148 egg white proteins, demonstrating the power of this novel technology to detect minor components even in samples dominated by a few major proteins. Bead-coupled peptide libraries are thought to "equalize" the proteome by providing similar numbers of binding sites to each of the different proteins contained in a proteome. However, it was shown recently that, in contrast to the previously proposed mode of action, the beneficial effect of the peptide beads does not appear to be mediated by specific interaction but is instead dominated by simple hydrophobic effects.
Samples such as egg white, where ovalbumin, ovotransferrin and ovomucoid make up approximately 75% of the total protein, are traditionally difficult to analyze in depth by mass spectrometry, because the peptides of these few proteins tend to dominate the full mass spectra and are repeatedly selected for fragmentation by MS/MS. This difficulty has been addressed by the above-mentioned peptide ligand library bead or hydrophobic bead technology [10–12]. However, the peptide library technology has disadvantages: it is only amenable to soluble proteins, and it modifies the composition of the proteome in an unknown and unpredictable way, which makes it impossible to determine the absolute quantity of the proteins. Since the publication of those studies, new developments in instrumentation and peptide identification software have occurred, which raised the possibility that in-depth investigation of the egg white proteome would no longer have to rely on enrichment technologies. In the present report we used a novel dual pressure linear ion trap instrument, the LTQ Orbitrap Velos. This new generation of mass spectrometers has increased sensitivity and scan speed as compared to the LTQ-FT used in our previous study. The LTQ Orbitrap Velos is fast enough to isolate and fragment ten or more peaks simultaneously with the acquisition of one high resolution mass full scan spectrum. For evaluation of spectra and database searches we used the MaxQuant software, which is particularly suited for the use of high-resolution MS data and yields very high mass accuracy and peptide identification rates [14–16].
Materials and methods
Preparation of peptides
Proteins were separated by PAGE with pre-cast 4-12% Novex Bis-Tris gels in MES buffer, using reagents and protocols supplied by the manufacturer (Invitrogen, Carlsbad, CA). The kit sample buffer was modified by adding SDS and β-mercaptoethanol to a final concentration of 5% and 2%, respectively, and the sample was suspended in 40 μl sample buffer/100 μg of egg white protein and boiled for 5 min. Gels were stained with colloidal Coomassie (Invitrogen) after electrophoresis. Three lanes loaded with 100 μg of protein were used in each of three separate experiments. The gels were cut into 24 slices for in-gel digestion with trypsin and the peptides were cleaned with StageTips before mass spectrometric analysis.
LC-MS and data analysis
Peptide mixtures were analyzed by on-line nanoflow liquid chromatography using the EASY-nLC system (Proxeon Biosystems, Odense, Denmark, now part of Thermo Fisher Scientific) with 15 cm capillary columns of an internal diameter of 75 μm filled with 3 μm Reprosil-Pur C18-AQ resin (Dr. Maisch GmbH, Ammerbuch-Entringen, Germany). The gradient consisted of 5-30% acetonitrile in 0.5% acetic acid at a flow rate of 250 nl/min for 85 min, 30-60% acetonitrile in 0.5% acetic acid at a flow rate of 250 nl/min and 60-80% acetonitrile in 0.5% acetic acid at a flow rate of 250 nl/min for 7 min. The eluate was electrosprayed into an LTQ Orbitrap Velos (Thermo Fisher Scientific, Bremen, Germany) through a Proxeon nanoelectrospray ion source. The LTQ Orbitrap Velos was operated in a CID top 10 mode essentially as described. The Orbitrap resolution was 30,000 for one experimental data set and 60,000 for the other two, whereas fragment spectra were read out at low resolution in the LTQ. Ion trap and Orbitrap maximal injection times were set to 25 ms and 500 ms, respectively. The ion target values were 5,000 for the ion trap and 1,000,000 for the Orbitrap. Raw files were processed using version 184.108.40.206 of MaxQuant (http://www.maxquant.org/). For protein identification the ipi.CHICK protein database v3.65 (http://www.ebi.ac.uk/IPI/IPIchicken.html) was combined with the reversed sequences and sequences of widespread contaminants, such as human keratins. Carbamidomethylation was set as fixed modification. Variable modifications were oxidation (M), N-acetyl (protein) and pyro-Glu/Gln (N-term). Initial peptide mass tolerance was set to 7 ppm and fragment mass tolerance was set to 0.5 Da. Two missed cleavages were allowed and the minimal length required for a peptide was seven amino acids. Two unique peptides were required for high-confidence protein identifications. These could also be derived from different experimental data sets.
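Two of the search settings described above, the reversed-sequence decoy database and the 7 ppm initial peptide mass tolerance, can be illustrated with a short Python sketch. This is a minimal illustration and not MaxQuant's implementation: the function names, the `REV__` decoy prefix, the accession and the example sequence and m/z values are all assumptions chosen for the example.

```python
def add_reversed_decoys(targets, prefix="REV__"):
    """Return the target entries plus one reversed-sequence decoy per target.

    `targets` maps accession -> amino acid sequence. The prefix is an
    illustrative label for decoy entries (assumption, not from the text).
    """
    combined = dict(targets)
    for accession, sequence in targets.items():
        combined[prefix + accession] = sequence[::-1]  # reverse the sequence
    return combined


def ppm_error(observed_mz, theoretical_mz):
    """Relative mass deviation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=7.0):
    """True if the absolute deviation does not exceed the ppm tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Hypothetical example entry and masses, for illustration only.
database = add_reversed_decoys({"EXAMPLE1": "MKTAYR"})
print(database["REV__EXAMPLE1"])
print(within_tolerance(500.003, 500.0))  # deviation of about 6 ppm
print(within_tolerance(500.004, 500.0))  # deviation of about 8 ppm
```

Hits to the reversed entries provide an estimate of how many target hits are expected to be false, which is the basis of the false discovery rate thresholds used for peptides and proteins.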
The peptide and protein false discovery rates (FDR) were set to 0.01. The maximal posterior error probability (PEP), which is the probability that a given peptide is a false hit considering identification score and peptide length, was set to 0.01. Proteins identified in two of three experimental data sets were accepted. Tentative identifications with only one unique peptide, or two (or more) unique peptides in only one experimental data set, were manually validated considering the assignment of major peaks, occurrence of uninterrupted y- or b-ion series of at least 3 consecutive amino acids, preferred cleavages N-terminal to proline bonds, the possible presence of a2/b2 ion pairs and mass accuracy. The ProteinProspector MS-Product program (http://prospector.ucsf.edu/) was used to calculate the theoretical masses of fragments of identified peptides for manual validation. The exponentially modified protein abundance index (emPAI) provides an estimate of the absolute abundance of a protein from the ratio of observed to observable peptides and was used to differentiate between major and minor proteins. The emPAI calculation considered the preset modifications, missed cleavages and different charge states. Usually only unique peptides were counted, but in the case of substantial overlap, i.e. almost identical proteins, these were grouped together and the emPAI was calculated for the protein with highest sequence coverage.
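The emPAI is defined by Ishihama et al. as 10 raised to the ratio of observed to observable peptides, minus 1. A minimal Python sketch of this formula is given below. It does not reproduce how observable peptides were counted in this study (preset modifications, missed cleavages and charge states); the function name, protein labels and peptide counts are hypothetical and serve only to show how the index separates major from minor components.

```python
def empai(n_observed, n_observable):
    """Exponentially modified protein abundance index.

    emPAI = 10 ** (observed peptides / observable peptides) - 1
    """
    if n_observable <= 0:
        raise ValueError("number of observable peptides must be positive")
    return 10 ** (n_observed / n_observable) - 1


# Hypothetical (observed, observable) peptide counts, for illustration only.
counts = {
    "protein_A": (40, 20),
    "protein_B": (5, 25),
    "protein_C": (2, 40),
}

# Rank proteins from highest to lowest estimated abundance.
ranked = sorted(counts, key=lambda name: empai(*counts[name]), reverse=True)
for name in ranked:
    print(name, round(empai(*counts[name]), 3))
```

Because the index grows exponentially with the observed fraction, a protein for which many more peptide observations than observable peptides are counted receives a far higher value than one seen with only a few peptides.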
Results and discussion
Egg white proteins were separated by PAGE and gels were cut into 24 sections for in-gel digestion (Figure 1) followed by mass spectrometric analysis of the resulting peptides on a high resolution instrument with fast sequencing speed. Three repetitions of the experiment resulted in seventy-two raw files that yielded a total of approximately 61,500 peptides identified and accepted with a peptide posterior error probability (PEP) of <0.01 and a preset false discovery rate (FDR) of 0.01. Of these, 1,373 peptides were sequence-unique. The average absolute mass deviation was 1.2 ppm. By searching a chicken protein sequence database and accepting only protein identifications with two sequence-unique peptides occurring in at least two of three experimental data sets, 158 proteins were identified (Additional file 1: Egg white proteins identified with two or more unique peptides). If criteria approximately equal to those of the peptide library-based study are applied, that is, if proteins identified with single peptides occurring in at least two experimental data sets and proteins identified by two or more unique peptides in only one experimental data set are also considered, 44 more proteins can be added to the list (Additional file 2: Tentatively identified egg white proteins), resulting in a total of 202 possible identifications. Additional protein data, such as UniProt and RefSeq accession codes, number of identified peptides, sequence coverage, and protein PEP scores for accepted proteins (without contaminants) are provided in Additional file 3: Protein data. These results compare favorably with those obtained with peptide ligand library beads, where 68 proteins were identified with two or more sequence-unique peptides and a total of 148 proteins were obtained by accepting unique single peptide hits from different experiments (Figure 2).
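The counting rules that separate high-confidence from tentative identifications can be summarized in a short Python sketch. This reflects one reading of the criteria stated in the text and covers only the peptide counting; the manual spectrum validation applied to tentative hits is not modeled. The function name, category labels and peptide sequences are hypothetical.

```python
def classify_protein(unique_peptides_per_dataset):
    """Classify a protein from its unique peptides in each experimental data set.

    `unique_peptides_per_dataset` is a list with one set of sequence-unique
    peptides per data set (three data sets in this study).
    """
    all_unique = set().union(*unique_peptides_per_dataset)
    datasets_with_hit = sum(1 for peptides in unique_peptides_per_dataset if peptides)

    # Two or more unique peptides, protein seen in at least two data sets.
    if len(all_unique) >= 2 and datasets_with_hit >= 2:
        return "high-confidence"
    # One unique peptide in at least two data sets, or
    # two or more unique peptides in a single data set.
    if all_unique and (datasets_with_hit >= 2 or len(all_unique) >= 2):
        return "tentative"
    return "rejected"


# Hypothetical peptide sequences, for illustration only.
print(classify_protein([{"AAAAAAK"}, {"CCCCCCR"}, set()]))
print(classify_protein([{"AAAAAAK"}, {"AAAAAAK"}, set()]))
print(classify_protein([{"AAAAAAK", "CCCCCCR"}, set(), set()]))
print(classify_protein([{"AAAAAAK"}, set(), set()]))
```

Applied to the full data set, rules of this kind yield the 158 high-confidence and 44 tentative proteins reported here, 202 in total.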
Furthermore, our study conservatively groups proteins with very similar sequences together and counts them as one "protein group", even when unique peptides pointed at the presence of isoforms or very similar proteins possibly encoded in different genes. Thus, the number of identified proteins is probably higher. A representative example is ovotransferrin, which seemed to represent a mixture of several forms containing many shared and a few unique peptides. Unique peptide data for accepted proteins (without contaminants), such as sequences, PEP scores, and distribution among gel sections are shown in Additional file 4: Peptide data.
Several previously identified proteins were not identified immediately in the new egg white proteome. However, searching the new database version, IPIchick v3.65, with peptide sequences responsible for the previous identification of these proteins indicated that this was in many cases due to changes in the database. Thus, for instance, the ovosecretoglobulin sequence was no longer joined to a channel protein sequence in IPI00575434 but appeared with a new accession number, IPI00847051. Other proteins changed name. Thus, chondrogenesis-associated lipocalin (IPI00600353) is now lipocalin-type prostaglandin synthase D. The only proteins that could not be identified again in the present study were HMG-1 (IPI00595982), a hypothetical protein (IPI00597019), histone H1 (IPI00597019), 60S ribosomal protein L27 (IPI00577674) and poly(ADP-ribosyl) polymerase 1 (IPI00588387). The first two proteins were previously identified predominantly (HMG-1) or exclusively (hypothetical protein) by in-solution tryptic cleavage, which was not performed in the present study. Three of these proteins, HMG-1, histone H1, and poly(ADP-ribosyl) polymerase were, however, confirmed in a recent study. Therefore, the reason for their absence in the present study is not clear, but as these proteins are unlikely to play functional roles in egg white, their inclusion in egg white preparations may vary. Keratins were excluded from our results because they usually shared all or most peptides with common contaminants. Only a few of the new egg white proteins identified using peptide ligand library beads were also detected in the present study. These were nine proteins in the group of identifications with two or more unique peptides (Additional file 1: Egg white proteins identified with two or more unique peptides) and four among the tentatively identified proteins (Additional file 2: Tentatively identified egg white proteins).
Reassuringly, only two new protein identifications were contained among the 30 most abundant egg white proteins (Additional file 1: Egg white proteins identified with two or more unique peptides). This group of proteins contained 79 proteins that were not identified as egg white components previously. The new egg white proteins included several typical major yolk residents, such as apovitellenin-I, vitellogenin-1 to -3 and apolipoprotein B. These proteins are synthesized in the liver, carried to the ovary via the blood circulation, taken up by oocytes via receptor-mediated processes, and incorporated into the globular fraction of egg yolk. Because the egg yolk was not damaged during mechanical separation of egg white and yolk, these proteins do not seem to be simple contaminants. Rather, residual protein not taken up by the egg cell may be liberated from the ovary together with the egg and migrate with the egg into the oviduct, mixing with egg white proteins secreted in the magnum section. In line with this suggestion, apovitellenin-I and vitellogenins have also been identified in the eggshell organic matrix. This indicates that the oviduct fluid in the eggshell gland still contained these proteins. A few representative peptide fragmentation spectra for some of these proteins are shown in Figure 3. However, many of the new proteins present at low abundance are proteins normally found in intracellular compartments (Additional file 1: Egg white proteins identified with two or more unique peptides; Additional file 2: Tentatively identified egg white proteins). Golgi and ER proteins may have reached the oviduct fluid as by-products of the secretion of major egg white proteins. Other intracellular proteins may have come from damaged, leaky cells of the epithelium lining the oviduct, or from organelles, such as lysosomes, which occur in egg white.
Analysis of previously known subcellular locations of proteins identified in egg white shows a decrease in secreted proteins from approximately 64% in the whole proteome to 37% among the new proteins and 18% in tentative identifications (Figure 4). This is accompanied by a similar increase in intracellular proteins, indicating that we have now reached a depth of proteome characterization beyond which it may become difficult to identify functional egg white components. Therefore, minor specific egg white proteins of interest, such as MMP-2, may preferentially be enriched by specific methods before analysis. However, the search for minor components in egg white remains of importance, because very low-abundance proteins, such as bone morphogenetic protein 1 (Additional file 1: Egg white proteins identified with two or more unique peptides), may have a biological role, for instance in early embryonic development.
Our results indicate that current state-of-the-art mass spectrometry technology is sufficiently advanced to permit direct mining of minor components of proteomes dominated by a few major proteins without resorting to broad specificity protein enrichment techniques, such as peptide ligand library tools, that change the proteome and render absolute quantification impossible. In addition, we have significantly expanded the previously known egg white protein inventory.
Abbreviations
FDR: False Discovery Rate
MALDI: Matrix-assisted Laser Desorption Ionization
MES: 2-(N-Morpholino)ethanesulfonic acid
PAGE: Polyacrylamide Gel Electrophoresis
PEP: Posterior Error Probability
SDS: Sodium Dodecyl Sulphate
Stevens L: Egg white proteins. Comp Biochem Physiol B 1991, 100: 1–9. 10.1016/0305-0491(91)90076-P
Raikos V, Hansen R, Campbell L, Euston SR: Separation and identification of hen egg protein isoforms using SDS-PAGE and 2D gel electrophoresis with MALDI-TOF mass spectrometry. Food Chem 2006, 99: 702–710. 10.1016/j.foodchem.2005.08.047
D'Ambrosio C, Arena S, Scaloni A, Guerrier L, Boschetti E, Mendieta ME, Citterio A, Righetti PG: Exploring the chicken egg white proteome with combinatorial peptide ligand libraries. J Proteome Res 2008, 7: 3461–3474.
Olsen JV, Schwartz JC, Griep-Raming J, Nielsen ML, Damoc E, Denisov E, Lange O, Remes P, Taylor D, Splendore M, Wouters ER, Senko M, Makarov A, Mann M, Horning S: A dual pressure linear ion trap-Orbitrap instrument with very high sequencing speed. Mol Cell Proteomics 2009, 8: 2759–2769. 10.1074/mcp.M900375-MCP200
Cox J, Mann M: MaxQuant enables high peptide identification rates, individualized ppb-range mass accuracies and proteome-wide protein quantification. Nature Biotechnol 2008, 26: 1367–1372. 10.1038/nbt.1511
Cox J, Mann M: Computational principles of determining and improving mass precision and accuracy for proteome measurements in an Orbitrap. J Am Soc Mass Spectrom 2009, 20: 1477–1485. 10.1016/j.jasms.2009.05.007
Rappsilber J, Mann M, Ishihama Y: Protocol for micro-purification, enrichment, pre-fractionation and storage of peptides for proteomics using StageTips. Nature Protocols 2007, 2: 1896–1906. 10.1038/nprot.2007.261
Ishihama Y, Oda Y, Tabata T, Sato T, Nagasu T, Rappsilber J, Mann M: Exponentially modified protein abundance index (emPAI) for estimation of absolute protein amount in proteomics by the number of sequenced peptides per protein. Mol Cell Proteomics 2005, 4: 1265–1272. 10.1074/mcp.M500061-MCP200
Réhault-Godbert S, Gautron J, Labas V, Belghazi M, Nys Y: Identification and characterization of the precursor of chicken matrix metalloprotease 2 (pro-MMP-2) in hen egg. J Agric Food Chem 2008, 56: 6294–6303.
The authors declare that they have no competing interests.
KM conceived the study, performed sample preparation and data acquisition. MM supplied methodological expertise. Both authors took part in the design of the study and were critically involved in manuscript drafting. All authors read and approved the final manuscript.
Additional file 2: Tentatively identified egg white proteins. Docx-file showing a list of proteins identified with one unique peptide in two of three experimental sets and proteins identified with 2 or more peptides in only one experimental set. (DOCX 24 KB)
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.