Sample prep for proteomics of breast cancer: proteomics and gene ontology reveal dramatic differences in protein solubilization preferences of radioimmunoprecipitation assay and urea lysis buffers

Background An important step in the proteomics of solid tumors, including breast cancer, consists of efficiently extracting most of the proteins in the tumor specimen. For this purpose, Radio-Immunoprecipitation Assay (RIPA) buffer is widely employed. RIPA buffer's rapid and highly efficient cell lysis and good solubilization of a wide range of proteins are further augmented by its compatibility with protease and phosphatase inhibitors, its ability to minimize non-specific protein binding, leading to a lower background in immunoprecipitation, and its suitability for protein quantitation. Results In this work, the insoluble matter left after RIPA buffer extraction of proteins from breast tumors is subjected to another extraction step, using a urea-based buffer. It is shown that RIPA and urea lysis buffers fractionate breast tissue proteins primarily on the basis of molecular weight. The average molecular weight of proteins that dissolve exclusively in urea buffer is up to 60% higher than in RIPA. Gene Ontology (GO) and Directed Acyclic Graphs (DAG) are used to map the collective biological and biophysical attributes of the RIPA and urea proteomes. The Cellular Component and Molecular Function annotations reveal protein solubilization preferences of the buffers, especially the compartmentalization and functional distributions. It is shown that nearly all extracellular matrix (ECM) proteins in the breast tumors and matched normal tissues are found almost exclusively in the urea fraction, while they are mostly insoluble in RIPA buffer. Additionally, it is demonstrated that cytoskeletal and extracellular region proteins are more soluble in urea than in RIPA, whereas for nuclear, cytoplasmic and mitochondrial proteins, RIPA buffer is preferred. Extracellular matrix proteins are highly implicated in cancer, including their proteinase-mediated degradation and remodelling, tumor development, progression, adhesion and metastasis. Thus, if they are not efficiently extracted by RIPA buffer, important information may be missed in cancer research. Conclusion For proteomics of solid tumors, a two-step extraction process is recommended. First, proteins in the tumor specimen should be extracted with RIPA buffer. Second, the RIPA-insoluble material should be extracted with the urea-based buffer employed in this work.


Background
Over the past few years, proteomics has emerged as a powerful new technology, capable of generating unprecedented details of protein maps in a wide range of cell types and disease processes. Increasingly, however, it is becoming recognized that the success of a proteomic experiment is critically dependent on the sample preparation step. An ideal sample prep protocol should not only isolate as much of the proteins of interest as possible from the biological source, but also preserve optimal sample integrity and morphology. It should also present the entire sample in a form that is compatible with optimum mass spectrometric analysis.
Proteins in their native states are generally embedded in their natural environments, where they are associated with other proteins, biological macromolecules or other matrix materials. They may also be components of multi-protein complexes, or integrated into plasma membranes or organelles. They are generally insoluble in their native states once isolated from their biological environments, and must therefore be denatured in order to bring them into solution. This ultimately entails disrupting the chemical bonds and interactions that hold them together in their native states. These bonds, and appropriate agents or methods for disrupting them [1], include: disulfide bonds (reduction and alkylation), hydrogen bonds (chaotropes), electrostatic interactions (salts, charged detergents, chaotropes), charge-dipole interactions (chaotropes), dipole-dipole interactions (strong dipolar molecules), van der Waals interactions (salts, dipolar molecules, chaotropes), and hydrophobic interactions (salts, dipolar molecules, chaotropes).
Effective sample preparation for proteomics disrupts these associations, and solubilizes as large a subset of the proteins as possible. Sample solubilization buffers typically contain a number of additives (chaotropes, detergents, reducing agents, buffers, salts, and ampholytes). In proteomics, perhaps two of the most effective and widely employed lysis buffers for extracting proteins from cells and tissues are Radio-Immunoprecipitation Assay buffer (RIPA buffer) [2] and urea lysis buffer [1,3,4].
The base ingredients of a typical RIPA buffer include: 50 mM Tris HCl pH 8, 150 mM NaCl, 1% NP-40, 0.5% sodium deoxycholate and 0.1% SDS. Protease and phosphatase inhibitors are added prior to use, depending on the application (these are usually not added when preparing lysates for phosphatase assays). Additional optimization of the lysis procedure, or substitution of the base ingredients, may be required for each specific application; for example, PBS pH 7.4 can substitute for both Tris HCl and NaCl. An example variant of RIPA buffer that contains protease and phosphatase inhibitors consists of: 10 mM Tris, pH 7.4, 100 mM NaCl, 1 mM EDTA, 1 mM EGTA, 1 mM NaF, 20 mM Na4P2O7, 2 mM Na3VO4, 0.1% SDS, 0.5% sodium deoxycholate, 1% Triton-X 100, 10% glycerol, 1 mM PMSF (made from a 0.3 M stock in DMSO) or 1 mM AEBSF (water-soluble version of PMSF), 60 μg/mL aprotinin, 10 μg/mL leupeptin, 1 μg/mL pepstatin (alternatively, a protease inhibitor cocktail may be used).
RIPA buffer's rapid and efficient cell lysis and solubilization of a wide range of proteins, including cytoplasmic, membrane and nuclear proteins, make it a standard for Western blotting. Its versatility is further augmented by its compatibility with protease and phosphatase inhibitors, its stability, and its ability to minimize non-specific protein-binding interactions, leading to low backgrounds in immunoprecipitation. RIPA buffer is also compatible with a myriad of applications, including reporter assays, protein assays, immunoassays and protein purification. When protein quantitation is desired, RIPA buffer is the lysis buffer of choice due to its compatibility with the BCA Protein Assay, although it can denature kinases [5] and can disrupt protein-protein interactions in immunoprecipitation/pull-down assays [6].
Urea buffer is another versatile and efficient cell and tissue lysis buffer, whose typical composition includes: 40 mM Tris base, 7 M urea, 2 M thiourea, 4% NP-40 or CHAPS, and 10 mM DTT. Urea is used at concentrations ranging from 5 to 9 M, often with thiourea at concentrations up to 2 M. The additive thiourea can dramatically enhance the solubility in urea buffer of a wide range of proteins (nuclear, membrane and cytosolic), including even tubulin, which is highly prone to aggregation [1,3,4]. As in RIPA buffer, different buffers and detergents can be substituted for the buffering base and for NP-40 or CHAPS, depending on the application. Urea inactivates proteases that degrade cellular proteins [7]; therefore, there is little need to add protease inhibitors. However, urea and thiourea can hydrolyze to cyanate and thiocyanate, respectively, which can modify amino groups on proteins (e.g. carbamylation of proteins by isocyanate), and this hydrolysis is promoted by heat.
Thus, in cancer proteomics, especially where it is desired to recover as much of the cellular or tissue proteins as possible, it is important not to rely completely on a single lysis buffer that could exclude an entire class of critically needed proteins, especially the low abundance proteins. Indeed, numerous proteomics publications rely on exactly such a single buffer. In breast cancer proteomics, for example, there are articles (e.g. Refs [23,25-30]) in which researchers wanted to extract all proteins present in the breast tumor specimens, but used RIPA buffer as the sole buffer.

Methods
The key steps (Figure 1) include protein extraction from breast tumors and matched normal breast tissues, sample clean-up with GE Healthcare tools, trypsin digestion, desalting with Michrom cartridges, mass spectrometry/2D nano-LC/ESI-MS/MS, database search and protein ID, data processing and bioinformatics.

Protein extraction
Figure 1. Proteomics workflow: flow chart.
Lysates provided by Protein Biotechnologies were extracted by a two-step procedure. First, proteins are extracted with a modified Radio-Immunoprecipitation Assay (RIPA) lysis buffer to yield the soluble fraction. Second, the residual insoluble fraction left after RIPA buffer extraction is subjected to additional extraction using a urea-based buffer to produce a second protein fraction. The compositions of the lysis buffers are as follows:

Cleanup of lysates
In the cleanup step, the proteins are separated from buffers, detergents, salts and other contaminants, using a method that is largely derived from a protocol and 2D clean-up kit provided by Amersham Biosciences (GE Healthcare, Piscataway, NJ). The kit consists of four reagents: a precipitant that precipitates the proteins to form pellets, a co-precipitant that co-precipitates with the proteins and enhances their removal from solution, a wash buffer that removes non-protein contaminants from the protein precipitate, and a wash additive that promotes rapid and complete re-suspension of the sample proteins.
Prior to the beginning of clean-up, the wash buffer was chilled at -20°C for 1 hr. After thawing and spinning down 100 μg aliquots of breast tumor lysates and matched normal breast tissue lysates, 300 μL of the precipitant was added. The mixture was vortexed on an Eppendorf Thermomixer R (Eppendorf North America, Westbury, NY), and then incubated on ice for 15 minutes. Next, 300 μL of co-precipitant was added and the mixture was mixed. The mixture was centrifuged at 12000 × g for 5 minutes to pellet the proteins. The clear supernatant was carefully pipetted out while retaining the protein precipitate at the bottom of the 1.5 mL Eppendorf tube. Without disturbing the pellet layer, 40 μL of co-precipitant was added to the supernatant, through the tilted side of the 1.5 mL Eppendorf tube. The mixture was kept on ice for 5 minutes before being centrifuged again at 12000 × g for another 5 minutes. The pellet was dispersed by adding 25 μL of MilliQ water and centrifuging for 10 minutes. After adding 1 mL of wash buffer chilled to -20°C and 5 μL of wash additive, the mixture was vortexed once every 30 seconds for a total of 35 minutes. At this point, the proteins did not dissolve, but dispersed. The mixture was again centrifuged at 12000 × g for 5 minutes. The supernatant was carefully discarded, and the pellet was dried. The pellets are amorphous.

In-solution digestion
The dried pellet was re-suspended in 20 μL of 8 M urea/100 mM ammonium bicarbonate (ABC). Then, 0.6 μL of 100 mM dithiothreitol (DTT) in 100 mM ABC (i.e. 3 mM DTT) was added, and the mixture was stirred on an Eppendorf Thermomixer R (Eppendorf North America, Westbury, NY) for 1 hr at 29°C. After adjusting to room temperature, 1.5 μL of 200 mM iodoacetamide (IAA) in 100 mM ABC (final concentration of 15 mM IAA) was added. Alkylation was then carried out by incubating the mixture for 45 minutes in a darkroom.
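The DTT and IAA concentrations quoted above can be checked with the dilution relation C1·V1 = C2·V2. The following is a minimal sketch, not part of the original protocol; the function name and the `correct_volume` switch are illustrative assumptions. It compares the nominal concentration (added volume ignored) with the value obtained when the added volume is included in the total.

```python
def final_conc_mM(stock_mM, added_uL, base_uL, correct_volume=False):
    """Concentration (mM) after adding `added_uL` of stock to `base_uL` of solution.

    With correct_volume=False the added volume is ignored (nominal value);
    with correct_volume=True the total volume is base_uL + added_uL.
    """
    total_uL = base_uL + added_uL if correct_volume else base_uL
    return stock_mM * added_uL / total_uL

# DTT: 0.6 uL of 100 mM stock into the 20 uL re-suspended pellet
dtt_nominal = final_conc_mM(100, 0.6, 20)
dtt_corrected = final_conc_mM(100, 0.6, 20, correct_volume=True)

# IAA: 1.5 uL of 200 mM stock; 20.6 uL is already in the tube after the DTT step
iaa_nominal = final_conc_mM(200, 1.5, 20)
iaa_corrected = final_conc_mM(200, 1.5, 20.6, correct_volume=True)

print(f"DTT: nominal {dtt_nominal:.1f} mM, volume-corrected {dtt_corrected:.2f} mM")
print(f"IAA: nominal {iaa_nominal:.1f} mM, volume-corrected {iaa_corrected:.2f} mM")
```

The nominal values correspond to the 3 mM and 15 mM figures given in the text; including the added volumes gives slightly lower concentrations.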

Nanospray
The nanospray is a Paradigm Nanotrap Platform (Michrom BioResources, Auburn, CA) equipped with a Paradigm Metal spray needle. The spray tip is a 7.5 cm long, 30 μm (internal diameter) × 105 μm (outer diameter) surgical stainless steel needle, electrochemically cut and polished, and sheathed by 125 μm PEEK tubing. The needle permits a flow range of 0.5 to 10 μL/min and a voltage range of 1000 to 5000 V. A 1/16" stainless steel Valco nut attaches the spray needle to a 1/16" to 1/16" Valco union, which is mounted on the Nanotrap Platform.

Mass spectrometry
Data-dependent MS and MS/MS spectra are acquired on an LCQ Deca XP Plus (Thermo Fisher Scientific, San Jose, CA).

MS and MS/MS
Five scan events are recorded for each data acquisition cycle. The first scan event is used for full scan MS acquisition from 300-1800 m/z. Data are recorded in the centroid mode only (scan event #1 does not permit profile mode of data acquisition). The remaining four scan events are used for Collisionally Activated Decomposition (CAD): the four most abundant ions in each MS are selected and fragmented to produce product ion mass spectra. All CAD product ions are recorded in the profile mode.
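The precursor selection described above, in which the four most abundant ions of each full scan are chosen for fragmentation, can be sketched as follows. This is a simplified illustration only: the function name and the peak list are hypothetical, and the sketch does not model any other behaviour of the instrument control software.

```python
import heapq

def select_precursors(full_scan, top_n=4, mz_range=(300.0, 1800.0)):
    """Return the m/z values of the `top_n` most intense peaks within `mz_range`.

    `full_scan` is a list of (m/z, intensity) centroid peaks from scan event 1.
    """
    lo, hi = mz_range
    in_range = [(mz, inten) for mz, inten in full_scan if lo <= mz <= hi]
    top = heapq.nlargest(top_n, in_range, key=lambda peak: peak[1])
    return [mz for mz, _ in top]

# Hypothetical centroid peaks (m/z, intensity); values are made up for illustration.
scan = [(250.1, 9e5), (445.2, 3e5), (512.8, 8e5), (733.4, 5e5),
        (901.5, 1e5), (1204.6, 6e5), (1850.0, 7e5)]

precursors = select_precursors(scan)
print(precursors)
```

Peaks outside the 300-1800 m/z window are ignored regardless of intensity, so the two intense peaks at 250.1 and 1850.0 are never selected in this example.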

Filters
Only peptides identified as possessing fully tryptic termini (containing up to two missed internal trypsin cleavage sites), with cross-correlation scores (Xcorr) greater than 1.9 for singly charged peptides, 2.3 for doubly charged peptides and 3.75 for triply charged peptides, are used for peptide identification. In addition, the delta-correlation scores (ΔCn) must be greater than 0.1 for peptide identification. For protein identification, the protein probability must satisfy P(pro) ≤ 0.001.
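The peptide-level thresholds above amount to a simple charge-dependent filter. The sketch below illustrates that logic under stated assumptions: the dictionary keys and the function name are hypothetical and do not correspond to Bioworks field names, and the protein-level probability cut-off is not modeled.

```python
# Minimum Xcorr by precursor charge state, as listed in the text.
XCORR_MIN = {1: 1.9, 2: 2.3, 3: 3.75}
DELTA_CN_MIN = 0.1
MAX_MISSED_CLEAVAGES = 2

def passes_filters(psm):
    """Return True if a peptide-spectrum match meets the stated thresholds.

    `psm` is a dict with hypothetical keys: charge, xcorr, delta_cn,
    fully_tryptic (bool) and missed_cleavages (int).
    """
    threshold = XCORR_MIN.get(psm["charge"])
    if threshold is None:  # charge states without a stated threshold are rejected
        return False
    return (psm["fully_tryptic"]
            and psm["missed_cleavages"] <= MAX_MISSED_CLEAVAGES
            and psm["xcorr"] > threshold
            and psm["delta_cn"] > DELTA_CN_MIN)

# Made-up matches for illustration.
psms = [
    {"charge": 2, "xcorr": 2.8, "delta_cn": 0.25, "fully_tryptic": True, "missed_cleavages": 1},
    {"charge": 3, "xcorr": 3.2, "delta_cn": 0.30, "fully_tryptic": True, "missed_cleavages": 0},
    {"charge": 1, "xcorr": 2.0, "delta_cn": 0.05, "fully_tryptic": True, "missed_cleavages": 0},
    {"charge": 2, "xcorr": 2.5, "delta_cn": 0.20, "fully_tryptic": False, "missed_cleavages": 0},
]

accepted = [p for p in psms if passes_filters(p)]
print(len(accepted), "of", len(psms), "matches accepted")
```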

Bioinformatics
Bioinformatics calculations are carried out using Blast2GO (see [36] for more information on BLAST parameters).
The BLAST server accepts only FASTA-formatted protein sequences as input queries. Although Bioworks 3.2 or later can convert protein sequences into FASTA text files, the protein sequences must be submitted from within the Bioworks browser prior to exiting the initial protein identification step. Thus, batch conversion of protein queries after the Bioworks step is not possible via the Bioworks route. For example, Bioworks would not allow the analysis of only the proteins that occur in both tumor and normal tissue, because these can only be determined after Bioworks protein identification. Another approach is Batch Entrez at the NCBI website (http://www.ncbi.nlm.nih.gov/entrez/batchentrez.cgi?db=Nucleotide), with *Protein database* selected to import the batch file and display all entries in FASTA format.
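As an illustration of the post-identification step described above, the sketch below selects the proteins common to a tumor and a normal protein list and writes them as FASTA text. It is a generic example, not the procedure used in this work; the identifiers and sequences are invented, and `to_fasta` is a hypothetical helper.

```python
def to_fasta(records, width=60):
    """Format (identifier, sequence) pairs as FASTA text, wrapping sequences at `width`."""
    lines = []
    for identifier, sequence in records:
        lines.append(f">{identifier}")
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"

# Hypothetical identifiers and toy sequences, for illustration only.
tumor = {"protA": "MKTAYIAKQR", "protB": "MSDNGPQNQR", "protC": "MAHHHHHHVG"}
normal = {"protB": "MSDNGPQNQR", "protC": "MAHHHHHHVG", "protD": "MGSSHHHHHH"}

# Proteins identified in both tumor and normal tissue
shared_ids = sorted(tumor.keys() & normal.keys())
fasta_text = to_fasta((pid, tumor[pid]) for pid in shared_ids)
print(fasta_text)
```

The resulting text could then be saved to a file and submitted as a batch query.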

Mapping
In the Mapping step, various databases are searched to identify and fetch Gene Ontologies (GO) associated with the hits obtained from NCBI BLAST searches.

Annotation
The annotation procedure selects GO terms from the GO pool obtained in the mapping step and assigns them to the query sequences, using the Annotation Rule. The annotation parameters are: Pre-eValue-Hit-Filter, 6; Pre-Similarity-Hit-Filter, 30; Annotation Cut-Off, 55; GO-Weight, 5.
Annotations are validated and expanded using an annotation expander. The expander, developed by a group at the Norwegian University of Science and Technology (http://www.goat.no), deploys an additional Gene Ontology structure, the Second Gene Ontology Layer, to suggest new Biological Processes and Cellular Components, based on the gene's existing Molecular Function annotations.

Results
The multidimensional protein identification technology (MudPIT) mass spectra of the breast specimens T2-018 Tumor, T2-048 Tumor, T2-048 Normal and T2-029 Tumor are shown in Additional Files 2, 3, 4 and 5, respectively. The 12-cycle spectra at the left of each figure are those of the RIPA buffer fractions, whereas the spectra of the 12 urea buffer fractions are shown at right.
The mass spectra show that, for each of the 60-minute MudPIT runs, the 1D_2 μL and 2D_10 μL spectra produce higher ion currents than the rest of the spectra. This may be partly due to the greater concentrations of peptides in these runs. In the 1D_2 μL run, all 2 μL of approximately 5 μg total peptide digest are introduced into the nano-LC/ESI-MS/MS system via the peptide Nanotrap and analytical column (the SCX column is bypassed for 1D analysis). In the 2D_10 μL run, all 10 μL of peptide digest are deposited on the SCX column, and all unbound peptides are washed into the mass spectrometer after pre-concentration and de-salting on the 150 μm × 50 mm (40 nanoliter volume) peptide nanotrap. Thus, the amounts of sample introduced into the mass spectrometer in these two runs may be responsible for the higher ion currents.
The MudPIT spectra also show that the RIPA-insoluble materials contain a significant amount of protein, as reflected in the highly abundant spectra of urea-soluble proteins. This clearly raises concern about using RIPA buffer as the sole lysis buffer for the proteomics of breast cancer (or of other cancers).
The proteins identified in the database search of the MudPIT mass spectra are shown in Table 2. Here, T2-048T (RIPA) and T2-048N2 (UREA) represent the RIPA-soluble fraction of the tumor sample T2-048 and the urea-soluble fraction of the normal breast tissue sample T2-048, respectively. The column labelled "Default" gives the default number of proteins found by Bioworks 3.2 prior to application of filter functions and protein validation. The final sets of proteins identified are shown in the rightmost column. Again, it is clear that the RIPA-insoluble materials, which dissolve in urea, contain significant amounts of proteins that would otherwise be discarded if RIPA buffer were the only lysis and solubilization buffer used.
The partitioning of the proteins between RIPA and urea buffers in the infiltrating ductal carcinoma case T2-018 TUMOR (Figure 2A) shows that the average molecular weight of the 217 proteins that dissolve exclusively in RIPA buffer is 61604 Da, whereas the 73 proteins that dissolve exclusively in urea buffer have an average molecular weight of 99154 Da. That is, the average molecular weight of proteins that dissolve exclusively in urea is 37551 Da, or 61%, higher than that of proteins that dissolve exclusively in RIPA buffer. Finally, the average molecular weight of the 82 proteins that dissolve in both RIPA and urea buffers is 40490 Da.
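The arithmetic behind these figures can be reproduced with a short calculation. The sketch below uses only the numbers quoted for T2-018 TUMOR; the function and variable names are illustrative.

```python
def percent_higher(value, reference):
    """Percentage by which `value` exceeds `reference`."""
    return (value - reference) / reference * 100.0

# Average molecular weights reported for T2-018 TUMOR (Figure 2A)
ripa_only_mw = 61604
urea_only_mw = 99154

difference = urea_only_mw - ripa_only_mw
increase = percent_higher(urea_only_mw, ripa_only_mw)
print(f"difference = {difference}, increase = {increase:.0f}%")

# Fraction sizes from the same partitioning
ripa_only, urea_only, both = 217, 73, 82
ripa_total = ripa_only + both   # all proteins found in the RIPA fraction
urea_total = urea_only + both   # all proteins found in the urea fraction
print(ripa_total, urea_total)
```

The difference computed from the rounded averages is one unit lower than the quoted 37551, presumably because the quoted value was derived from unrounded averages. The fraction totals agree with the 299 and 155 proteins reported for this specimen in the Gene Ontology section.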
The corresponding data for T2-048 TUMOR, T2-048 NORMAL and T2-029 TUMOR are shown in the Venn diagrams of Figures 2B, 2C and 2D.

Gene Ontology
In an effort to determine the protein groups, and identify the fractionated proteins contained in the compartments shown in Table 2 and Figure 2, Gene Ontology [37,38] (http://www.geneontology.org/) was employed.
The Gene Ontology (GO) project, which began in 1998 as a collaboration between three databases (FlyBase, the Saccharomyces Genome Database and the Mouse Genome Database), has today grown to encompass nearly all major databases, and has become a powerful new tool for mining biological data. Gene Ontology annotates biological data in terms of their Biological Process, Molecular Function and Cellular Component.
Biological Process is an ensemble of biochemical transformations that are accomplished by one or more ordered assemblies of molecular functions. Biological Process may be broad (physiological process, metabolism, etc) or specific (nitric oxide metabolism, oxygen transport, etc).
Molecular Function is the specific, elemental action or task performed by a gene product or assembled complexes of gene products. Examples of broad molecular functions include catalysis, binding, and structural molecular activity, whereas specific molecular functions are exemplified by tetrapyrrole binding, adenylate cyclase activity and calmodulin binding.
Cellular Component is the subcellular location (organelle, nucleus, etc) and the macromolecular complexes where the gene product is located.
In this work, proteins identified by proteomic analyses are submitted to GO analyses, including NCBI BLAST, mapping and annotation. The results are presented as a Directed Acyclic Graph (DAG), which shows the number of annotated sequences and the annotation scores contributing to each node. The nodes are color-coded, and the relative importance of each annotation score is indicated by the intensity of the orange color at that node. There are three types of nodes: a double-edged octagon represents an annotated GO term; a rectangle represents a non-annotated GO term node; and an oval denotes a Gene Ontology term obtained by mapping, which can be directly associated with one or more BLAST hits.
In the Biological Process formulation, functionalities that directly identify the proteins in the RIPA and urea buffer fractions are not explicitly evident. The Molecular Function annotations and, especially, the Cellular Component annotations do, however, provide highly useful information on the protein groups and identities of the protein fractions shown in the Venn diagrams of Figure 2. In the following sections, the Cellular Component annotations will be primarily used to characterize the protein fractions, although the Molecular Function annotations will also be used.

T2-018T (RIPA) and T2-018T (Urea) fractions
The Cellular Component DAGs for specimens T2-018T (RIPA) and T2-018T (Urea) are shown in Figures 3 and 4, respectively. The Cellular Component parent node of the T2-018T (RIPA) DAG shows that mapping found curated Gene Ontologies (Ontologies) for 262 of the original 299 proteins found in this sample (Table 2). Similarly, in T2-018T (Urea), 131 of the 155 proteins have Ontologies available in the GO databases. The high percentages of proteins that have curated Ontologies thus provide the bioinformatics data needed to characterize the RIPA and urea proteomes with a high degree of specificity and reliability.
Comparison of the T2-018T (RIPA) and T2-018T (Urea) DAGs shows that the entire set of extracellular matrix protein nodes in the urea DAG (T2-018T; Figure 4) is almost completely missing from the RIPA DAG (T2-018T; Figure 3) at the indicated node filter settings, suggesting that nearly all extracellular matrix proteins are dissolved in the urea buffer, but not in the RIPA buffer. The node filter is a mechanism for simplifying otherwise complicated DAGs. At a node filter of 28, for example, all nodes whose number of annotated protein sequences is 28 or below are not displayed, in an effort to simplify the DAG. Thus, when the node filter is lowered, previously hidden nodes are displayed, making the chart more crowded.
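The effect of the node filter can be illustrated with a small sketch. The GO terms and sequence counts below are hypothetical and are not taken from the DAGs in this work; the function is an illustration of the rule stated above, not Blast2GO code, and it ignores the parent-child structure of the graph.

```python
def apply_node_filter(node_seqs, node_filter):
    """Return the GO nodes that remain visible at a given node filter.

    Nodes whose annotated-sequence count is at or below `node_filter` are hidden.
    """
    return {term: n for term, n in node_seqs.items() if n > node_filter}

# Hypothetical sequence counts per GO term, for illustration only.
example_nodes = {
    "cellular_component": 120,
    "extracellular region": 40,
    "extracellular matrix": 18,
    "basement membrane": 6,
    "collagen": 3,
}

visible_counts = {}
for setting in (28, 13, 0):
    visible = apply_node_filter(example_nodes, setting)
    visible_counts[setting] = len(visible)
    print(setting, sorted(visible))
```

Lowering the filter from 28 to 13 to 0 progressively reveals the smaller nodes, which is why a lower setting produces a more crowded chart.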
Close-up sections of the extracellular regions (Figure 5) clearly show that extracellular matrix proteins dissolve almost exclusively in urea buffer. The cropped sections are obtained with node filters of 15 and 10 for the RIPA and urea DAGs, respectively.
Interestingly, lowering the node filter for the RIPA DAG did not produce an appreciable change in the number of nodes displayed within the extracellular region shown in Figure 3, whereas even a slight lowering of the node filter in the urea DAG (Figure 4) reveals a large number of previously hidden nodes, shown in Figure 6 for a urea node filter reduced to zero. This further confirms that most of the extracellular matrix proteins are dissolved in the urea buffer.
Comparison also shows that, for extracellular region, RIPA buffer has a greater number of annotated protein sequences and annotation score than urea buffer: [RIPA: (extracellular region, Seqs: [...] mitochondrial proteins are preferentially enriched in the RIPA fractions.
Figure 3. The Cellular Component DAG for the proteome T2-018T (RIPA). Extracellular matrix proteins are not observed, even at a node filter setting of 28.
Selective protein enrichment comparisons based on both Seqs and Scores are shown in Figures 7 and 8.

T2-048T (RIPA) and T2-048T (Urea) fractions
Rather than showing the entire DAGs, Figure 9 presents cropped views of the extracellular regions of the T2-048T (RIPA) and T2-048T (Urea) DAGs, which show that extracellular matrix proteins are present almost exclusively in the urea fraction. Mapping found that nearly 90% of the original proteins present in the T2-048T (RIPA) proteome (i.e. 232 of 261, Table 2) have existing Ontologies, thus providing the bioinformatics information needed for the characterization of the proteins. Similarly, Ontologies are found for 120 of the 143 proteins (84%) of the T2-048T (Urea) proteome.
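The ontology coverage percentages quoted for T2-018T and T2-048T follow directly from the reported counts, as the sketch below shows. The labels are illustrative; in particular, assigning the larger protein set of each specimen to the RIPA fraction is an assumption made here for labeling purposes.

```python
def coverage_percent(annotated, total):
    """Percentage of identified proteins with curated ontologies."""
    return annotated / total * 100.0

# (proteins with curated ontologies, proteins identified), as quoted in the text.
# RIPA/Urea labels are an assumption: the larger set is taken as the RIPA fraction.
counts = {
    "T2-018T (RIPA)": (262, 299),
    "T2-018T (Urea)": (131, 155),
    "T2-048T (RIPA)": (232, 261),
    "T2-048T (Urea)": (120, 143),
}

coverage = {name: coverage_percent(a, t) for name, (a, t) in counts.items()}
for name, pct in coverage.items():
    print(f"{name}: {pct:.1f}%")
```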

T2-029T1 (RIPA) and T2-029T2 (Urea) fractions
The cropped-out extracellular regions, displayed side-by-side (Figure 11), again show that extracellular matrix proteins are found nearly exclusively in urea buffer. Mapping found Ontologies for 84% and 87% of the T2-029T1 and T2-029T2 proteins, respectively. Upon lowering the node filter on the T2-029T2 DAG to zero, a full display of the extracellular matrix proteins is obtained (Additional File 6). Similar lowering of the node filter on the T2-029T1 DAG did not reveal significantly new or relevant structural information.
Figure 4. The Cellular Component DAG for the proteome T2-018T (UREA). At a node filter setting of only 13, extracellular matrix proteins are highly evident. Thus, extracellular matrix proteins are soluble primarily in urea buffer.

Molecular Function
Molecular Function annotations also contain elements that reveal protein solubilization preferences of the RIPA and urea buffers. Thus far, Structural Molecule Activity (SMA) is the only functionality in the Molecular Function annotation that contains nodes that directly relate to the protein solubility preferences. In general, the SMA node of the urea proteome contains a child node, Extracellular Matrix Structural Constituent, which contains the number of annotated protein sequences and an annotation score for extracellular matrix proteins. The SMA node of the RIPA proteome does not contain any descriptors (at the given node filter settings) that suggest the presence of Extracellular Matrix Structural Constituent. Thus, extracellular matrix proteins are observed nearly exclusively in the urea fraction.
The Molecular Function DAGs for T2-018T (RIPA) and T2-018T (UREA) (Figures 12 and 13, respectively) show clearly that Extracellular Matrix Structural Constituents are present only in the urea proteome. The RIPA and urea DAGs are drawn with node filter settings of 30 and 12, respectively.
Figure 5. Close-up views of the extracellular regions of T2-018T (UREA). The views clearly show that extracellular matrix proteins dissolve almost exclusively in urea buffer.
Figure 6. Expanded view of the extracellular region of the cellular component DAG for the proteome T2-018T (UREA). The node filter was reduced to 0 to obtain this complete display. Lowering the DAG node filter for the RIPA DAG did not produce appreciable change in the number of nodes displayed within the extracellular region.
Figure 14(A-B) shows the cropped-out SMA nodes for T2-018T (RIPA) and T2-018T (UREA), with the node filters set to 2 and 1, respectively. Further lowering of the node filters did not produce significant differences in either DAG. Figure 14(C-D) shows the cropped-out SMA nodes for the T2-029T1 (RIPA) and T2-029T2 (UREA) proteomes. Again, extracellular matrix proteins are found nearly exclusively in the urea fractions.
The above trend is consistently maintained in Figure 15

Discussion
The solubility of proteins in RIPA and urea buffers depends on several physicochemical factors, including the characteristics of the proteins and properties of the RIPA and urea buffers.
Physicochemical properties of the proteins that affect their solubility include average charge, determined by the relative numbers of Asp, Glu, Lys, and Arg residues, and the content of turn-forming residues (Asn, Gly, Pro, and Ser) [39]. Insoluble proteins tend to have more hydrophobic stretches longer than 20 amino acid residues, lower glutamine content, fewer negatively charged residues, and higher percentages of aromatic amino acid residues than soluble ones [40]. Indeed, a high content of negatively charged amino acid residues and the absence of hydrophobic patches tend to improve protein solubility [41]. Also, a low percentage of aspartic acid, glutamic acid, asparagine and glutamine residues increases the probability that a protein is insoluble [41].
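The sequence descriptors mentioned above (charged residues and turn-forming residues) can be counted directly from a protein sequence. The sketch below only illustrates how such descriptors might be computed; it does not reproduce the solubility models of [39-41], and the function name, the descriptor definitions and the toy sequence are assumptions.

```python
from collections import Counter

TURN_FORMING = set("NGPS")   # Asn, Gly, Pro, Ser
NEGATIVE = set("DE")         # Asp, Glu
POSITIVE = set("KR")         # Lys, Arg

def composition_features(sequence):
    """Return simple composition descriptors for a one-letter protein sequence."""
    counts = Counter(sequence.upper())
    length = sum(counts.values())
    if length == 0:
        raise ValueError("sequence must not be empty")
    negative = sum(counts[aa] for aa in NEGATIVE)
    positive = sum(counts[aa] for aa in POSITIVE)
    turn = sum(counts[aa] for aa in TURN_FORMING)
    return {
        "net_charge_per_residue": (positive - negative) / length,
        "negative_fraction": negative / length,
        "turn_forming_fraction": turn / length,
    }

# Toy sequence, for illustration only.
features = composition_features("MKDEGSPNRA")
print(features)
```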
Solubility of proteins in lysis buffer also depends highly on the composition and gross physicochemical properties of the lysis buffer. These include [42,43]: the type of buffer, the presence or absence of phosphate, pH, salts, ampholytes, detergents, chaotropic agents, and reducing agents (dithiothreitol (DTT), dithioerythritol (DTE), β-mercaptoethanol, tributyl phosphine (TBP), tris(2-carboxyethyl)phosphine (TCEP)).
Figure 7. Extraction of proteins from breast tumors for proteomic analysis. Nuclear proteins (A), intracellular proteins (C) and protein complexes (D) are more soluble in RIPA buffer than in urea buffer, whereas membrane proteins (B) are slightly more soluble in urea buffer.
Figure 8. Extraction of proteins from breast tumors for proteomics. Proteins of the extracellular region (A) and cytoskeleton (B) are more soluble in urea buffer than in RIPA, whereas for cytoplasmic (C) and mitochondrial (D) proteins, RIPA buffer is preferred.

RIPA and UREA buffers fractionate breast cancer proteins primarily on the basis of molecular weights
RIPA buffer is a versatile and efficient lysis buffer suitable for the recovery of most proteins, including whole cell, nuclear, mitochondrial, membrane receptors, cytoskeletal-associated, and soluble proteins. However, as the data in Figure 2 show, there is a high molecular weight cutoff (≥ 12% higher average molecular weight in urea than RIPA) for a protein's solubility in RIPA buffer. Proteins with molecular weights of around 100 kDa or higher may not dissolve readily, unless they possess unique structural features that enhance their solubility in RIPA buffer. Thus, when a given protein group is subjected to RIPA buffer extraction, the high molecular weight fraction may not dissolve; it is recovered as the RIPA-insoluble fraction that ultimately dissolves in urea buffer. This may explain why such a high percentage of proteins are recovered in the urea fraction after they have resisted solubilization in RIPA buffer. Interestingly, Ignatoski and co-workers [42] have also demonstrated that different lysis buffers solubilize different subsets of cellular proteins (rather than entire proteins), based primarily on molecular weight. Neither RIPA nor urea buffer was, however, used in their kinase assay, because both denature kinases; all buffers that they tested were non-denaturing.

Nearly all extracellular matrix proteins are insoluble in RIPA buffer, but dissolve readily in urea buffer
Perhaps the most important finding in this work is that nearly all extracellular matrix (ECM) proteins are insoluble in RIPA buffer, whereas they dissolve readily in urea buffer. This may be due to ECM proteins having very high molecular weights. Why, then, is RIPA buffer routinely used by researchers to dissolve extracellular matrix proteins, especially in cancer research? Indeed, RIPA buffer does dissolve high molecular weight proteins, but the recovery may be poor. The solubility of a protein depends on many factors beyond the nature of the lysis buffer. If, for example, a high molecular weight protein has structural features that enhance its solubility (a high content of negatively charged amino acid residues and the absence of hydrophobic patches [41]), as discussed above, the protein would dissolve in RIPA buffer. On the other hand, a lower molecular weight protein may surprisingly fail to dissolve in RIPA buffer if it aggregates or possesses structural features that hamper its solubility. Some epigenetic, post-translational, or spontaneous structural changes can also impede a protein's solubility in RIPA buffer. One example is tau, which would normally be soluble in RIPA buffer. When it becomes hyperphosphorylated, for example by endogenously overproduced Aβ protein in Alzheimer's disease, it resists solubilization in RIPA buffer [44].

Selective Enrichments of Protein Groups by RIPA and Urea Buffers
Data in Figure 7A shows that nuclear proteins are somewhat more selectively enriched in RIPA buffer than in urea, consistent with many standard molecular biology laboratory practices: RIPA buffer is one of the recommended (or one of the preferred) buffers for efficient recovery of nuclear proteins [6, 45,46].
Protein complexes (Figure 7D) are also slightly more concentrated in RIPA buffer than in urea. Protein complexes tend to have high molecular weights, and RIPA buffer solubilizes high molecular weight proteins poorly; however, because protein complexes are held together largely by noncovalent bonds, they may dissociate into smaller subunits that dissolve more readily. Detailed Cellular Component DAGs (not shown) indicate that the protein complexes referred to here include: immunoglobulin complex, hemoglobin complex, fibrinogen complex, transcription factor complex, DNA polymerase complex, DNA-directed RNA polymerase II holoenzyme, RNA polymerase complex, nucleosome, myosin, laminin complex, membrane attack complex, mediator complex, tubulin, MHC protein complex and ribonucleoprotein complex.
Mitochondrial proteins also appear to be slightly more favored by RIPA buffer than urea (Figure 8D). Again, RIPA buffer is one of the recommended lysis buffers for the recovery of mitochondrial proteins for Western blot [6].
On the other hand, urea buffer is clearly more efficient in selectively enriching extracellular region and cytoskeletal proteins (Figures 7A and 7B), in addition to the extracellular matrix proteins already discussed. Neither RIPA buffer nor urea buffer is significantly preponderant in selective enrichment of membrane, intracellular, or cytoplasmic proteins (Figures 7B and 7C, and Figure 8C, respectively).
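One way to picture the kind of comparison summarized in this section is sketched below. It assumes that each proteome is summarized as the share of identified proteins annotated to a given GO Cellular Component term, and that a term is assigned to a buffer when the two shares differ by more than a fixed margin. The counts, the totals, the 0.02 margin, and the function names are all hypothetical; the margin is an arbitrary cutoff and not a statistical test.

```python
# Hypothetical counts of identified proteins per GO Cellular Component term.
# Illustrative only; NOT data from this study. A protein may carry several
# annotations, so shares are taken relative to the total number of proteins
# identified in each fraction rather than to the sum of the counts.
ripa_counts = {"nucleus": 120, "mitochondrion": 60, "extracellular matrix": 2,
               "cytoskeleton": 40, "membrane": 100}
urea_counts = {"nucleus": 84, "mitochondrion": 42, "extracellular matrix": 35,
               "cytoskeleton": 63, "membrane": 91}
RIPA_TOTAL = 400  # hypothetical number of proteins identified in the RIPA fraction
UREA_TOTAL = 350  # hypothetical number of proteins identified in the urea fraction


def shares(counts, total):
    """Fraction of identified proteins annotated to each term."""
    return {term: n / total for term, n in counts.items()}


def preferred_buffer(ripa_counts, ripa_total, urea_counts, urea_total, margin=0.02):
    """Label each term 'RIPA', 'urea', or 'neither' from the difference in shares."""
    ripa_share = shares(ripa_counts, ripa_total)
    urea_share = shares(urea_counts, urea_total)
    result = {}
    for term in sorted(set(ripa_share) | set(urea_share)):
        diff = urea_share.get(term, 0.0) - ripa_share.get(term, 0.0)
        if diff > margin:
            result[term] = "urea"
        elif diff < -margin:
            result[term] = "RIPA"
        else:
            result[term] = "neither"
    return result


if __name__ == "__main__":
    labels = preferred_buffer(ripa_counts, RIPA_TOTAL, urea_counts, UREA_TOTAL)
    for term, label in labels.items():
        print(f"{term}: {label}")
```

With these toy numbers, the nuclear and mitochondrial terms are labeled RIPA, the extracellular matrix and cytoskeletal terms are labeled urea, and the membrane term falls within the margin.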

Limitations of RIPA and Urea Buffers
There is no single lysis buffer that would solubilize all classes of proteins, however. Each buffer has its pros and cons. Some of the known limitations of RIPA and urea buffers are highlighted below.

RIPA buffer-induced post-lysis modulation of biochemical pathways
RIPA buffer has been shown to alter some biochemical pathways, leading to experimental results that may be spurious; hence the need to verify data by using other lysis buffers. Two examples are provided here.
In one example, DeSeau and co-workers [47] showed that the level of pp60 c-src kinase activity detected in immune complex protein kinase assays can be substantially modulated by RIPA buffer. They therefore advise that results of pp60 c-src in vitro protein kinase assays from other cellular systems in which only RIPA buffer lysis was used should be interpreted with caution. Specifically, they found that the in vitro protein kinase activity of pp60 c-src molecules derived from RIPA buffer lysates of colon carcinoma cells was elevated five- to sevenfold compared with pp60 c-src from the same cells lysed in a buffer containing only Nonidet P-40. Additionally, they found that in RIPA buffer the difference in specific activity of pp60 c-src between normal colon mucosal cells and colon carcinoma cells is about ten- to thirtyfold, whereas with a lysis buffer containing only Nonidet P-40 as a detergent the difference would be less than three- to fourfold. Thus, had Nonidet P-40 or other lysis buffers not been used to validate the data obtained in RIPA buffer, the conclusions of that work could have been in error.
In another example, abnormally high caspase-3 and -7 activity in stimulated human peripheral blood lymphocytes (PBLs) has been shown to be a spurious side effect of the RIPA buffer used to lyse the activated T-lymphocytes [48]. In contrast, when a lysis buffer containing 2% SDS was used, the caspases remained in their zymogen proforms, and no proteolytic processing of caspase substrates was detected. It was subsequently determined that the release of GraB or similar proteases from cytotoxic granules during the lysis procedure was responsible for the artifactual activation of caspase-3. RIPA may disrupt GraB-containing granules more efficiently than 0.2% Nonidet P-40 or the other lysis buffers used [48].

Many protein groups are insoluble in urea buffer
Although urea buffer has proven very effective in dissolving extracellular matrix proteins and a wide range of other protein groups, it nevertheless has limitations. Granier [49] noted that many membrane proteins are insoluble in urea if extracted without heating. Moreover, as mentioned in the Background section above, heating urea in the presence of proteins would most likely result in covalent modification of the proteins by the hydrolysis products of heated urea (e.g. carbamylation of proteins by isocyanate). Thus, urea does not possess universal solubility for all membrane proteins. Ames and Nikaido [50] solubilized membrane proteins of Salmonella typhimurium with hot SDS when even the most powerful urea buffer, O'Farrell's buffer [51], failed to dissolve them.
A cell surface proteoglycan, with a molecular weight of 450 kDa, was also found to be very insoluble in urea buffer [16].
In general, many of the proteins that have proven insoluble in urea buffer are human lens proteins [52,53], especially cataractous proteins [52,53,54]. Weber and McFadden described a heterogeneous set of urea-insoluble proteins in dividing PC12 pheochromocytoma cells. They found that about 5% of the total cellular proteins synthesized in exponentially dividing PC12 pheochromocytoma cells remained insoluble even in 6 M urea [55].
A major factor that decreases the solubility of proteins in urea is the formation of disulfide cross-bridges, which can be acquired by a protein through aerobic oxidation of thiol groups. Even a small molecular weight protein could become very insoluble in urea upon formation of disulfide bridges. This was the case with the 42 kDa Rec12 (Spo11) meiotic recombinase of fission yeast (Rec12 protein) [56] that was expressed in E. coli. Rec12 protein resisted solubilization in 6 M urea, but was ultimately extracted with 6 M guanidine hydrochloride [56]. Subsequent analyses showed that it has four disulfide bridges that impeded its solubility in urea. Human eye lens proteins acquire disulfide cross-bridges by exposure to hyperbaric oxygen (reviewed in Ref. [52]). The eye lens proteins then become opaque and cataractous, and resist solubilization in urea buffer [52].

Figure 12 Molecular Function DAG for the proteome T2-018T (RIPA). Extracellular matrix structural constituents are not seen, even at a node filter setting of 12.
Another example is the human centrosomal protein which exists as a doublet of 62/64 kDa and is insoluble in even 8 M urea (a condition that would dissolve most known centrosomal proteins) [57].

Preferential solubilization of extracellular matrix proteins in urea lysis buffer: variables
Despite differences in the breast tumors analyzed in this work (Table 1), a common feature remains the preferential solubilization of extracellular matrix proteins in urea lysis buffer. Differences include (Table 1)

Conclusion
This work shows that most extracellular matrix (ECM) proteins in the breast tumors and matched normal tissues analyzed are recovered in the urea buffer fraction; they are mostly insoluble in RIPA buffer. Because ECM proteins are highly important in cancer, including tumor development, progression, adhesion and metastasis, important information may be missed in cancer research if RIPA buffer alone is used, since it does not extract them efficiently.
This work also shows that RIPA and urea lysis buffers fractionate tissue proteins primarily on the basis of molecular weights. The average molecular weight of proteins that dissolve exclusively in urea buffer is higher (up to 60%) than in RIPA.
Protein complexes, nuclear, mitochondrial, cytoplasmic and intracellular proteins are more soluble in RIPA buffer than in urea, whereas membrane, cytoskeletal and extracellular region proteins are more soluble in urea buffer.