Biomarkers of HIV-1 associated dementia: proteomic investigation of sera

Background New, more sensitive and specific biomarkers are needed to support other means of clinical diagnosis of neurodegenerative disorders. Proteomics technology is widely used to discover new biomarkers. In-depth analysis of human plasma/serum faces several difficulties, including the fact that no single proteomic platform can offer complete identification of differences in proteomic profiles. Another set of problems is associated with the heterogeneity of human samples, in addition to the intrinsic variability associated with every step of a proteomic investigation. Validation is the last step of a proteomic investigation, and it is often difficult to validate a potential biomarker with the desired sensitivity and specificity. Even when a differentially expressed protein can be validated, it may not prove to be a valid diagnostic biomarker.
Results In the current study we report results of proteomic analysis of sera from HIV-infected individuals with or without cognitive impairment. Application of SELDI-TOF analysis followed by weak cation exchange chromatography and 1-dimensional electrophoresis led to the discovery of gelsolin and prealbumin as differentially expressed proteins, which were not detected in this cohort of samples when it was previously investigated by 2-dimensional electrophoresis with Difference Gel Electrophoresis technology.
Conclusion Validation using western blot analysis led us to conclude that the relative change in the levels of these proteins in one patient over a period of time might be more informative, sensitive and specific than an average level estimated from an even larger cohort of patients.


Background
HIV-1 penetrates the brain shortly after infection and remains there throughout the entire course of the disease. Approximately 50% of infected individuals develop some form of cognitive impairment, ranging from an asymptomatic form diagnosed during formal testing to the most severe form, HIV-associated dementia (HAD), which leads to death [1]. Although antiretroviral therapy (ART) has a profound effect on slowing disease progression, increasing survival and decreasing the incidence of HAD from 30% to 7%, the rate of HIV-1 infected patients with HIV-associated Neurocognitive Disorders (HAND) remains the same [1,2]. In consequence, the prevalence of HAD has increased due to the increased survival of these individuals [3][4][5][6][7]. These epidemiological data suggest that ART provides only partial protection against neurological damage in HIV-infected people [8].
Despite more than 20 years of research efforts, we still lack good biomarkers supporting the diagnosis of HAND, including its most severe form, HAD [9,10]. Current diagnosis and identification of HAND is based on neuropsychological tests and exclusion of other potential causes such as opportunistic infections and tumors [11]. Laboratory tests of disease progression, although valuable, are not diagnostic, which creates a need for more accurate and reliable markers to monitor the progression of cognitive impairment [12][13][14]. Good and reliable diagnostic biomarkers are also indispensable for the development of new therapeutic strategies. Discovery of biomarkers that could be used to predict dementia and monitor disease progression is important for the development of early and effective treatments designed to maintain normal cognition and quality of life [15,16].
Despite recent technological progress in sample preparation for proteomic analyses, in fractionation techniques and in the sensitivity of mass spectrometers, proteomic analysis of serum/plasma and cerebrospinal fluid (CSF) still poses significant challenges [17][18][19][20][21]. The high complexity and high dynamic range of proteins and peptides circulating in plasma, and the low levels of proteins originating from tissue leakage, are just a few of the most important challenges [22,23]. Immunodepletion of the most abundant proteins from plasma/serum and CSF samples is the most common first step in reducing the complexity of these samples. Although such an approach has proven to be useful, further steps of sample fractionation are desirable [24].
Global proteomic profiling of clinical samples brought high expectations for accelerated discovery of new biomarkers to aid physicians in diagnosing diseases and researchers in understanding their molecular mechanisms. However, the high dynamic range of plasma/serum and CSF proteins created challenges in such analyses. Immunodepletion became a standard first step, yet there is no consensus as to how many of the most abundant proteins need to be removed. We have used IgY based technology for immunodepletion of CSF and sera samples in our previous studies [25,26]. Another challenge is the choice of a single profiling technology platform or a combination of platforms. In our previous studies we used 2-dimensional electrophoresis (2DE) with the Difference Gel Electrophoresis (DIGE) profiling method on immunodepleted CSF or sera from HIV-1 infected individuals with or without HAD [25,27] and demonstrated several differentially expressed proteins that are potential biomarkers.
Although CSF surrounding the brain and spinal cord seems to be the best clinical material to reflect ongoing pathological processes [28][29][30][31][32], evaluations of the CSF proteome are limited by the availability of a sufficient amount of protein, in addition to the high dynamic range of proteins. Alternatively, plasma or serum samples can be used. However, the question remains how closely changes in the proteome profile of blood reflect changes in the brain, which lies behind the blood brain barrier (BBB). Because the BBB is compromised during HIV infection, we posit that proteins from CSF leak into the blood and can be detected as biomarkers. In addition, we expect that the plasma/serum proteome can reflect increased neuroinflammation. Although such surrogate biomarkers are not exclusive to HIV infection, they can be relevant and helpful as auxiliary tests [33].
Our current work indicates that multiple protein profiling approaches as well as multiple sample fractionation schemes are required to assess more completely the changes in proteomes due to pathological changes [34]. Our data also indicate that biomarker levels should be measured relative to the baseline of each individual, to assess relative changes rather than comparing to a set threshold. Although the putative biomarkers discovered during this study are unlikely to be stand-alone measures of disease or suspected disease, they can be part of a broader diagnostic approach including psychological and brain imaging tests [35,36].

Proteomic profiling
Serum samples used in this study were provided by the NeuroAIDS Tissue Consortium and were obtained from HIV-1 infected individuals with or without HAD. Although the introduction of ART resulted in a significant decrease in HAD cases [1], we decided to use samples representing opposite ends of the disease spectrum to maximize the chance of discovering biomarkers of an ongoing neurodegenerative process. Samples were immunodepleted of the 12 most abundant proteins prior to proteomic profiling. For this purpose we used immunoaffinity chromatography with a column based on IgY technology [25,26]. In our previous proteomic investigation of this set of sera samples we used 2DE DIGE as the profiling method and found three differentially expressed proteins: complement C3, ceruloplasmin and afamin [26]. Differential expression of two of them, ceruloplasmin and afamin, was further validated by western blot analysis [26].
In this study we used SELDI-TOF ProteinChip® assays to profile the proteomes of the same immunodepleted serum samples as in the previously published study. We hypothesized that by applying a different profiling approach we would discover differentially expressed proteins that were not found using 2DE DIGE. We chose SELDI-TOF profiling for two reasons. The first was to investigate proteins and/or their processed forms in the low (~4 kDa) to medium-low (~28 kDa) range of molecular weight, for which 2DE DIGE is not an optimal profiling method. The second was the ability of SELDI-TOF ProteinChip® technology to translate chromatographic conditions directly from analytical to preparative mode, as indicated by steps 3 and 4 in Figure 1. We have previously used this approach for identification of differentially expressed proteins [37]. The acquired SELDI-TOF spectra were subjected to rigorous statistical analysis, which resulted in the identification of 50 peaks that potentially represent differentially expressed proteins. Among those 50 peaks, 21 showed statistically significant differences in intensity (p < 0.05) (Additional file 1, Table S1); however, only 4 showed high significance, as illustrated in Table 1. Two peaks, with m/z 4,493 and 25,872, showed increased intensity associated with HAD, and the 2 other peaks showed the opposite trend (Table 1). Considering the highly variable nature of SELDI-TOF spectra, we approached the interpretation of such spectra with caution; some differences showing borderline significance may not be confirmed when a larger number of spectra is generated.

Weak Cation Exchange (WCX) chromatography and 1DE
The first dimension of fractionation was immunodepletion of the 12 most abundant proteins using IgY12 immunoaffinity chromatography. Preparative WCX chromatography was therefore the second dimension of fractionation, reflecting the conditions used in SELDI-TOF profiling. At the same time, it allowed us to obtain amounts of protein sufficient for one more step of sample fractionation and for mass spectrometry based protein identification. Despite two steps of fractionation, samples were still complex enough to compromise detection of differentially expressed proteins. Therefore, we used 1DE in the subsequent step. Because we were specifically interested in low molecular weight proteins, we used a 16% Tricine gel and ran the separation to only half of the gel length. This approach allowed us to separate proteins in the region from 4 to 28 kDa while maintaining undiffused protein bands (Figure 2A, boxes 1 to 8). As expected, we observed differences in the intensities of protein bands based on staining with SyproRuby, reflecting differences in the relative abundance of proteins. The most profound difference was observed in band 3 (Figure 2). As illustrated in Figure 2, we excised the corresponding gel cubes, which were then subjected to trypsin digestion and protein identification based on nano-LC-MS/MS analysis. Table 2 summarizes the identified proteins (corresponds to Figure 2A). In the subsequent experiment we used a 4-12% Bis-Tris gel to separate proteins in the high molecular weight region (Figure 2B). We did not investigate proteins of this size using SELDI-TOF profiling. Our previous experience with this technology platform showed that, in a complex mixture, very few proteins above the 50,000 m/z mark can be detected using our first- and second-pass criteria. Such proteins were usually detected as broad peaks with low intensity and a relatively large difference between theoretical and experimentally measured masses.
Nevertheless, we predicted that the applied scheme of fractionation and profiling would uncover differentially expressed proteins in the high molecular weight range. Results of the 1DE analysis are shown in Table 3 (corresponds to Figure 2B). We observed one protein band that showed a clear difference in abundance and was higher in HAD. Subsequent LC-MS/MS analysis of the tryptic digest resulted in identification of gelsolin (Accession# gi|121116|sp|P06396.1|GELS_HUMAN [121116]).

Western blot validation
Validation of putative biomarkers resulting from proteomic analyses poses a significant challenge. One of the reasons is the high diversity in protein levels within the normal human population. For example, the concentration of gelsolin in CSF ranges from 1.2 to 15.9 μg/ml [38].
Figure 1. Experimental design. Step 1: IgY12 partitioning; Step 2: SELDI-TOF (Table 1); Step 3: WCX2 chromatography; Step 4: 1D SDS-PAGE; Step 5: LC-MS/MS (Table 2).

Among the validation methods available at this time we have chosen western blot analysis [25,26]. This experimental approach is convenient because of its relatively high throughput and the widely available software for quantitative analysis. Another advantage is the small amount of material required for such analysis, which is important when the quantity of sample is limited. Although serum/plasma is usually one of the most abundant clinical materials available for proteomic analysis, the total pool of proteins obtained after immunodepletion of the 12 most abundant proteins contains approximately 4-5% of the initial amount [22]. In this step, we used western blot combined with quantitative densitometry measurements of the resulting bands to analyze levels of gelsolin in all 14 samples individually. Results presented in Figure 3 show a trend toward increased expression of gelsolin in sera samples of patients with HAD; however, no statistically significant difference was found between these two populations of samples. The lack of a significant difference can be attributed to the small number of samples and the presence of outliers. We further hypothesize that, because the baseline level of gelsolin within the normal human population is substantially variable, less profound changes due to pathological conditions may not yield statistical significance when averaged. Therefore, the change observed for any given patient over time might be much more informative. This, however, needs to be investigated in a longitudinal study using samples from three groups of patients: those who showed progression of cognitive impairment, those who showed no change, and those who showed improvement.
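The hypothesis above, that large between-patient variability can mask a modest but consistent within-patient change, can be illustrated with a small numerical sketch. All values below are hypothetical and are not data from this study; the function names are illustrative only. The sketch compares a Welch t statistic, which treats the two time points as independent groups, with a paired t statistic computed on each patient's own change from baseline.

```python
import math
from statistics import mean, stdev, variance

def welch_t(group_a, group_b):
    """Welch t statistic (group_b minus group_a), treating groups as independent."""
    se = math.sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_b) - mean(group_a)) / se

def paired_t(before, after):
    """Paired t statistic on within-patient differences (after minus before)."""
    diffs = [b - a for a, b in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

def relative_change(before, after):
    """Percent change from the patient's own baseline."""
    return (after - before) / before * 100.0

# Hypothetical protein levels (arbitrary units) for five patients.
# Baselines differ widely between patients; each patient rises only slightly.
baseline = [2.0, 5.0, 9.0, 12.0, 15.0]
followup = [2.5, 5.6, 9.4, 12.7, 15.5]

t_groups = welch_t(baseline, followup)   # dominated by between-patient spread
t_paired = paired_t(baseline, followup)  # uses each patient as own control
changes = [relative_change(a, b) for a, b in zip(baseline, followup)]

print(f"Welch t = {t_groups:.2f}, paired t = {t_paired:.2f}")
print("Relative change per patient (%):", [round(c, 1) for c in changes])
```

With these made-up numbers the group comparison gives a t statistic well below 1, while the paired statistic is large, because the paired analysis removes the baseline differences between patients. Whether real longitudinal samples behave this way is exactly what the proposed study would need to establish.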

Conclusion
It has been previously shown that one proteomic platform and/or profiling approach does not uncover all existing differences that can be valuable biomarker candidates [24,39,40,41]. This is also true for the current investigation. Therefore, in this report we present a continuation of our previously published profiling of sera from HIV-1 infected individuals with or without cognitive impairment (CI) [26]. We used the same cohort of samples but a different profiling approach, and found that gelsolin and prealbumin were differentially expressed. The previous approach using 2DE DIGE did not show differences in expression of these two proteins. This could be attributed to the fact that the two proteins were located in the high and low regions of the gels used in 2DE, where spot resolution is less favorable. Validation of our data using western blot analysis did not show a statistically significant difference between average values because of the high variability of expression of these two proteins within the population. There were also a few samples that we classified as outliers. If these samples are removed from the analysis, statistical significance can be achieved. It has to be noted that the only criterion for sample classification was a clinical diagnosis of HAD or the lack of it. Very high or very low levels of proteins (biomarkers) might be attributed to other factors such as opportunistic infections. Nevertheless, the addition of gelsolin as a biomarker candidate to ceruloplasmin and afamin, as previously reported by us [26], reinforced our classification of HAD and non-HAD samples. Figure 4 represents combined results from the previous [26] and current study. Another conclusion that we can draw from this study is that, for future experiments, cohorts of samples need to be assembled based on very careful clinical diagnosis of patients to eliminate variation introduced by other concurrent pathologies, e.g. HCV. This, in turn, may reduce variability within the groups.

Figure 2. 1D electrophoresis (1D SDS-PAGE). ND and HAD pooled samples were first depleted of the 12 most abundant proteins and then processed through a weak cation exchange column; 20 μg of each sample was then loaded on a 16% Tricine gel and stained with SYPRO Ruby. Numbers on the right-hand side of the gels indicate band numbers that correspond to Tables 2 and 3, respectively. ND, non-demented; HAD, demented.
In summary, our main conclusion from this study is that, because the expression of some proteins is highly diverse among individuals, the relative change in any given patient over a time interval, such as a course of treatment, might serve as a more useful biomarker to aid other means of diagnosing cognitive impairment than average levels calculated across large cohorts, which may not generate statistically significant differences. HAND is a relatively slowly progressing disease; therefore, changes measured over intervals of weeks or even months can still be useful as an aid in diagnostics.

Methods
For this study, we used 14 sera samples from HIV-1 infected individuals with (HAD) or without (ND) HIV-1-associated dementia. The samples were previously obtained from the National NeuroAIDS Tissue Consortium (NNTC, https://www.nntc.org/) under Request# R101. Use of sera samples in this study was approved by the UNMC Institutional Review Board (#196-05-EX).

SELDI-TOF
Protein signatures of individual sera samples were obtained by SELDI-TOF ProteinChip® assays (Bio-Rad, Hercules, CA). The chip type selection (WCX2) and washing conditions were previously optimized [37]. Briefly, each chip was pretreated with 10 mM HCl, rinsed with HPLC grade water, then equilibrated with binding buffer (100 mM ammonium acetate, pH 4.0, with 0.1% Triton X-100). Each sample was diluted in binding buffer to a concentration of 0.02 μg/μL, and a total of 1 μg was applied to each spot and incubated at room temperature for 30 min while shaking. Unbound proteins were removed by washing the spots twice with binding buffer followed by washing with HPLC grade water. After each spot was dried, 50% sinapinic acid (SPA) matrix was added to it, air-dried, then reapplied. SPA was prepared as a saturated solution containing 30% acetonitrile (ACN), 15% isopropanol, 0.5% trifluoroacetic acid (TFA) and 0.05% Triton X-100. The ionized proteins and their mass/charge (m/z) ratios were detected using SELDI-TOF. The

HPLC (second dimension fractionation)
HPLC protein fractionation was performed using a liquid chromatography system (Shimadzu, Columbia, MD), which included a pump, system controller, manual injector with 500 uL injection loop, ultraviolet-visual (UV-Vis) detector set at 220 nm and fraction collector. The system was controlled using a Dell computer and EZStart chromatographic software (Shimadzu

Protein Identification
After tryptic digestion, samples were purified using reverse phase C18 Zip Tips® (Millipore, Billerica, MA) according to the manufacturer's procedure and re-suspended in 0.1% formic acid in water prior to LC-MS/MS analysis. Protein identification was performed as described previously using an ESI-LC-MS/MS system (LTQ-Orbitrap, Thermo Scientific, Inc., San Jose, CA) in a nanospray configuration with a microcapillary RP-C18 column (New Objectives, Woburn, MA) for fractionation. The spectra were searched using the Sequest™ search engine in BioWorks 3.2 software (Thermo Scientific Inc., San Jose, CA) with the following parameters: threshold for Dta generation = 10000, precursor ion mass tolerance = 1.4, peptide tolerance = 2.00 and fragment ions tolerance = 1.00. The database NCBI.fasta from http://ftp.ncbi.nih.gov was used with two missed cleavage sites allowed, and at least two peptides were required for protein identification.
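The requirement of at least two peptides per protein can be expressed as a simple post-search filter. The sketch below is an illustration only: it assumes the rule refers to distinct peptide sequences, the input is a hypothetical list of (protein, peptide) pairs rather than actual Sequest or BioWorks output, and the peptide labels are placeholders, not real sequences.

```python
from collections import defaultdict

def proteins_with_min_peptides(matches, min_peptides=2):
    """Return proteins supported by at least `min_peptides` distinct peptides.

    `matches` is an iterable of (protein_id, peptide) pairs, one per
    peptide-spectrum match. Repeated matches to the same peptide count once.
    """
    peptides_by_protein = defaultdict(set)
    for protein_id, peptide in matches:
        peptides_by_protein[protein_id].add(peptide)
    return sorted(
        protein_id
        for protein_id, peptides in peptides_by_protein.items()
        if len(peptides) >= min_peptides
    )

# Hypothetical peptide-spectrum matches (placeholder peptide labels).
example_matches = [
    ("GELS_HUMAN", "PEPTIDE_A"),
    ("GELS_HUMAN", "PEPTIDE_B"),
    ("GELS_HUMAN", "PEPTIDE_A"),  # repeat of the same peptide
    ("PROTEIN_X", "PEPTIDE_C"),   # single-peptide hit, filtered out
]

accepted = proteins_with_min_peptides(example_matches)
print(accepted)
```

Counting distinct peptides rather than spectra is a deliberate choice in this sketch, since several spectra of one peptide do not add independent evidence for the protein.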

Western Blot Assays
1DE was performed on 7 HAD and 7 ND sera samples using the NuPAGE gel system (Invitrogen Corp., Carlsbad, CA) in 4-12% gradient Bis-Tris gels under reducing conditions. For western blot analyses, 2 μg of serum protein immunodepleted on an IgY column was loaded per lane. Proteins were transferred from the gel to Immun-Blot PVDF transfer membrane using Ready Gel™ Blotting Sandwiches (Bio-Rad, Hercules, CA). After blocking with 5% rabbit serum in PBST, the membrane was incubated with mouse anti-gelsolin antibody (BD Transduction Laboratories, San Jose, CA) followed by incubation with horseradish peroxidase-conjugated goat anti-mouse IgG (Jackson ImmunoResearch, West Grove, PA). The chemiluminescent signal was developed using SuperSignal West Pico™ Chemiluminescent Substrate (Pierce, Rockford, IL) and recorded on Blue Lite X-ray film (ISCBioExpress, Kaysville, UT). Images were scanned into Adobe Photoshop, adjusted using "Auto levels" and then analyzed using ImageJ software available through NIH. Each image was inverted and a measurement box of exactly the same size was used for the analysis of each band. All numbers were exported into Microsoft Excel and measurements were normalized between membranes.
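The text does not specify how measurements were normalized between membranes. One common option, shown here purely as an assumption, is to load a shared reference sample on every membrane and to express each band as a ratio to that reference. The lane names, intensities and the `normalize_between_membranes` helper below are hypothetical.

```python
def normalize_between_membranes(membranes, reference_lane="ref"):
    """Divide each band intensity by the reference-lane intensity of its membrane.

    `membranes` maps a membrane name to a dict of {lane_name: intensity}.
    Assumes (hypothetically) that every membrane carries the same reference
    sample in the lane called `reference_lane`.
    """
    normalized = {}
    for membrane_name, lanes in membranes.items():
        reference = lanes[reference_lane]
        if reference <= 0:
            raise ValueError(f"Non-positive reference intensity on {membrane_name}")
        for lane_name, intensity in lanes.items():
            if lane_name != reference_lane:
                normalized[lane_name] = intensity / reference
    return normalized

# Hypothetical densitometry readings (arbitrary units) from two films with
# different exposure, so raw values are not directly comparable.
readings = {
    "membrane_1": {"ref": 1000.0, "HAD_1": 1500.0, "ND_1": 800.0},
    "membrane_2": {"ref": 2000.0, "HAD_2": 3000.0, "ND_2": 1600.0},
}

ratios = normalize_between_membranes(readings)
for lane, ratio in sorted(ratios.items()):
    print(f"{lane}: {ratio:.2f}")
```

In this made-up example the second film is twice as intense overall, yet after normalization the two HAD lanes and the two ND lanes give the same ratios, which is the purpose of a between-membrane correction.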

Statistical Analysis
For SELDI-TOF data analysis, data from Biomarker Wizard were exported for statistical analysis using SAS® software 9.1 (SAS Institute Inc., Cary, NC). Generalized estimating equations (GEE) were used to identify peaks that showed statistically significant differences in the distribution of intensity scores among the various replicates of HIV-infected individuals with or without HAD. The raw intensity values were found to be asymmetrically distributed and were transformed prior to analysis using the arsinh function: $Y = \log_2\left(X + \sqrt{X^2 + 1}\right)$, where X is the observed intensity. This transformation has been used previously to stabilize intensity variance and make data more normally distributed, and it has the advantage over a log-transformation of being able to transform negative values [21]. After the GEE modeling, the Bonferroni correction was used to address the issue of multiple testing.
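The transformation and the multiple-testing correction described above can be sketched in a few lines. This is a minimal illustration, not the SAS code used in the study: the intensities and p-values are hypothetical, and the GEE modeling step itself is not reproduced. Note that the base-2 form given in the text equals the standard inverse hyperbolic sine divided by ln 2.

```python
import math

def arsinh_log2(x):
    """Y = log2(X + sqrt(X^2 + 1)), computed as asinh(x) / ln(2).

    Using math.asinh avoids the loss of precision that the literal formula
    suffers for large negative x, where x + sqrt(x^2 + 1) is close to zero.
    """
    return math.asinh(x) / math.log(2)

def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and whether each is below alpha."""
    n_tests = len(p_values)
    adjusted = [min(1.0, p * n_tests) for p in p_values]
    return adjusted, [p_adj < alpha for p_adj in adjusted]

# Hypothetical baseline-subtracted intensities, including a negative value
# that a plain log-transformation could not handle.
raw_intensities = [-0.75, 0.0, 0.75, 12.0]
transformed = [arsinh_log2(x) for x in raw_intensities]

# Hypothetical per-peak p-values from a model fit.
peak_p_values = [0.0004, 0.02, 0.3]
adjusted_p, significant = bonferroni(peak_p_values)

print([round(y, 3) for y in transformed])
print(adjusted_p, significant)
```

The function is odd-symmetric and passes through zero, so negative and positive intensities are treated alike; for example, an input of 0.75 maps to 1 because 0.75 + sqrt(0.75^2 + 1) = 2.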