 Research
 Open Access
Gaussian process regression model for normalization of LC-MS data using scan-level information
Proteome Science, volume 11, Article number: S13 (2013)
Abstract
Background
Differences in sample collection, biomolecule extraction, and instrument variability introduce bias to data generated by liquid chromatography coupled with mass spectrometry (LC-MS). Normalization is used to address these issues. In this paper, we introduce a new normalization method using a Gaussian process regression model (GPRM) that utilizes information from individual scans within an extracted ion chromatogram (EIC) of a peak. The proposed method is particularly applicable for normalization based on the analysis order of LC-MS runs. Our method uses measurement variabilities estimated from LC-MS data acquired on quality control samples to correct for bias caused by instrument drift. A maximum likelihood approach is used to find the optimal parameters of the fitted GPRM. We review several normalization methods and compare their performance with the GPRM.
Results
To evaluate the performance of different normalization methods, we consider LC-MS data from a study in which a metabolomic approach is utilized to discover biomarkers for liver cancer. The LC-MS data were acquired by analysis of sera from liver cancer patients and cirrhotic controls. In addition, LC-MS runs from a quality control (QC) sample are included to assess the run-to-run variability and to evaluate the ability of various normalization methods to reduce this undesired variability. Also, ANOVA models are applied to the normalized LC-MS data to identify ions with intensity measurements that are significantly different between cases and controls.
Conclusions
One of the challenges in using label-free LC-MS for quantitation of biomolecules is systematic bias in measurements. Several normalization methods have been introduced to overcome this issue, but there is no universally applicable approach at the present time. Each data set should be carefully examined to determine the most appropriate normalization method. We review here several existing methods and introduce the GPRM for normalization of LC-MS data. Using our in-house data set, we show that the GPRM outperforms the other normalization methods considered here in terms of decreasing the variability of ion intensities among quality control runs.
Background
Liquid chromatography coupled with mass spectrometry is one of the most promising high-throughput tools for identification and quantification of biomolecules extracted from serum, plasma, tissue, etc. Analysis of a sample by LC-MS typically generates three pieces of information: a pair consisting of a mass-to-charge ratio (m/z) and a retention time (RT), along with the related ion intensity. Following preprocessing of data from a set of LC-MS runs, a data matrix is created with each row and column representing a feature (RT, m/z) and a sample, respectively. Assuming p features and n samples, we consider in this paper a p × n data matrix.
Normalization of the preprocessed LC-MS data is performed before statistical analysis to decrease undesired bias [1]. The bias can arise from differences in sample collection or biomolecule extraction, or from column separation nonlinearity, ionization variability, etc. [2]. The importance of the sample preparation step for achieving consistent results in different runs of the same experiment was emphasized in recent studies [3].
To the best of our knowledge, only a limited number of studies have investigated the performance of existing normalization methods on real LC-MS data [2, 4]. In these studies, a pooled mixture of multiple samples is utilized to generate replicate QC runs. The QC runs are then used to estimate and correct the bias.
In [5] we reviewed most of the existing methods for normalization of LC-MS data. Some of the methods were modified, and all methods were employed to evaluate their performance on a real data set. In this study, we expand the aforementioned work by introducing a new normalization method that uses Gaussian process regression to capture the variation of ion intensities. We use a maximum likelihood approach to find the parameters of the fitted stochastic process. The learned model is then used to correct for the drift based on analysis order. This approach can be used with either preprocessed data [6] or the raw data (scan-level data). The latter allows us to capture information that may be lost during preprocessing, but also requires dealing with more complex data.
A data set generated from both experimental and QC samples is used here to assess normalization methods. We use the number of ions with significant variation within the QC runs as a measure of the run-to-run variability. From this point of view, a normalization method is assumed to perform better if it decreases this variation for more ions. In addition, cross-validation is used to evaluate the performance of each normalization method. The methods are further compared based on their ability to detect statistically significant ions between cases and controls. In other words, we look into different batches of data from the same experiment, with dependent and independent sets of samples, and compare the normalization methods based on their ability to increase the number of statistically significant ions overlapping among different batches. However, we do not use this criterion to rank the methods, as the ground truth is not available. The variability within the QC runs is utilized to compare the GPRM with the other normalization methods reviewed in this paper, particularly those that use analysis order for normalization.
Results and discussion
We propose a normalization method based on a Gaussian process regression model (GPRM). The method can be used for normalization of either integrated peak intensities or scan-level intensities from EICs. We denote the former GPRM and the latter GPRM-EIC. The performance of GPRM and GPRM-EIC is compared with two different sets of normalization methods: analysis order-based normalization methods (Table 1) and the other normalization methods reviewed in this paper (Table 2). Table 2 shows only the top three normalization methods with the highest performance.
By comparing all the reviewed methods across different batches in the data set, we observed that three methods, TIC, MedScale, and Quantile normalization, consistently showed better performance [5]. As shown in Tables 1 and 2, both GPRM and GPRM-EIC reduce the percentage of ions with $q_\zeta < 0.1$ compared with the other normalization methods or unnormalized data (see Evaluation Method). This indicates that our proposed methods lead to a decrease in the number of features with significant variation across the QC runs.
We also examined whether normalization helps with the detection of new significant differences. This is done based on LC-MS data from multiple experiments with dependent and independent batches of sample sets (see LC-MS Dataset). By looking at within-batch combinations, i.e. dependent sets of samples, we can determine which method is consistent for the same set of samples in terms of finding a larger number of statistically significant features (Figure 1). On the other hand, considering between-batch comparisons enables us to evaluate the methods based on independent samples (Tables 1 and 2). In general, for the same method with different modifications, a decrease in the variance of ζ in equation (20) leads to more ions being selected as statistically significant.
From Table 1 it can be seen that, among analysis order-based normalization methods, our proposed approach is the most efficient in decreasing the variability within the QC runs. Table 1 also shows that these methods outperformed the other normalization methods under the same measure of estimated QC-run variability. We think that the GPRM performs better because it handles the drift with a stochastic model whose parameters are found by optimization. In comparison, other analysis order-based normalization methods work with a limited set of possible parameter values and as a result may not reach the highest possible performance. Also, by taking advantage of the scan-level intensities from EICs, GPRM-EIC is able to achieve better performance than GPRM. However, GPRM-EIC requires appropriate alignment of the scan-level peaks. Finally, by merging information across different scans, 2D-GPRM-EIC showed the best performance.
Comparing Tables 1 and 2 reveals that although some methods show a smaller decrease in within-QC variability, they lead to more ions being selected as statistically significant between cases and controls. For example, the Quantile method reduced the percentage of ions with $q_\zeta < 0.1$ to 4.0% and 3.0% for B1 and B2, respectively, in positive mode (Section: Evaluation Methods). In comparison, GPRM achieved 2.5% and 2.2%. However, the numbers of ions selected as statistically significant between cases and controls are 62 and 37 for Quantile and GPRM, respectively, in positive mode. As pointed out before, the ground truth is not available to evaluate the performance of the normalization methods on the basis of detected differences between cases and controls. Thus, all ions found statistically significant are regarded as potential candidates until subsequent verification determines whether the differences are true or biologically meaningful. However, the LC-MS runs are expected to yield better reproducibility following normalization. In particular, the QC runs in this study are anticipated to have the least variability, at least for a considerable subset of the analytes.
Conclusions
Systematic bias is one of the challenges in quantitative comparison of biomolecules by LC-MS. Various normalization methods have been proposed to address this issue; however, there is no universally applicable solution at the present time. Thus, each LC-MS data set should be carefully inspected to determine the most appropriate normalization procedure. Since most of the evaluation studies have been performed on data from a relatively small sample size without adequate replicates and QC runs, additional investigations on large-scale LC-MS data are needed [5].
We reviewed several existing normalization methods in this paper and introduced a new method for normalization of LC-MS data. The method uses the analysis order information in a Gaussian process model. Compared to other methods that also use analysis order information, our model has some advantages. It can model the bias from instrument drift more efficiently because a statistical approach is used that includes noise in the model and estimates the parameters through optimization. It is therefore more precise in estimating the scale parameter than some analysis order-based methods, which search heuristically for the span parameter of the smoothing algorithm. In addition, we extended this method to perform normalization on the basis of EICs obtained from raw LC-MS data instead of the preprocessed peak list.
We evaluated the performance of the GPRM and other existing normalization methods using our in-house LC-MS data generated from both experimental (cases and controls) and QC samples. The QC runs were used to estimate and correct the drift in the ion intensities. The normalization methods were assessed based on two criteria: (1) the decrease in the within-sample variability; and (2) the number of extra ions selected as statistically significant compared to those obtained without normalization. The first criterion is used to rank the models based on their performance, while the second criterion is used to investigate the effect of normalization in terms of the number of possible candidates with significant differences between cases and controls. Our method showed improvement over existing methods on the first criterion. However, some methods with a lower rank, e.g. Quantile, provide more candidates. It is therefore necessary to conduct a verification experiment to confirm the true differences and discard false positives.
While the performance of the normalization method can be improved by using the scan-level LC-MS data following appropriate alignment, some issues remain. One of these is misalignment of the peaks across the scans. We used a simple approach to align the peaks, but more advanced techniques are available to further improve the alignment. Also, placing prior distributions on the parameters of the GPRM through Bayesian analysis can potentially improve the performance of our method. Future work will focus on addressing these issues.
Methods
Several normalization techniques have been proposed for LC-MS data. As normalization is a well-known concept in the area of genomics, most of the methods have been adapted from techniques developed for gene expression microarray data [4, 7–10]. The usual underlying assumption of these approaches is that the average biomolecule concentrations should be equal for all samples in the same experiment. To examine the performance of these methods, replicate LC-MS runs of a reference sample can be used [5].
In this paper, we introduce a Gaussian process regression model for normalization based on analysis order. We also extend this method to estimate the variability of scan-level ion intensities within an EIC of a peak. For comparison, we investigated the following normalization methods: (i) normalization based on total ion count (TIC), (ii) median scale normalization, (iii) pretreatment methods such as scaling, centering, and transformation, (iv) normalization based on internal standards, (v) quantile normalization, (vi) MA transform linear/local regression normalization, (vii) normalization based on QC consistency, (viii) normalization based on stable features, and (ix) normalization based on analysis order. We implemented these methods and in some cases modified the algorithm [5].
As described in Background, the data matrix $X_{p \times n}$ consists of n samples and p features, where each sample is represented as a column vector $x_j$ for j = 1, ..., n, so $x_j = [x_{1j}\ ..\ x_{pj}]^T$ and:

$$X_{p \times n} = \left[\, x_1\ x_2\ \cdots\ x_n \,\right]$$
Existing normalization methods
Normalization based on TIC is the most common and simplest approach: all ion intensities of an LC-MS run are divided by the area under the TIC curve, where the TIC is the total energy, i.e. the sum of the intensities of all related m/z values at each RT (scan):

$$TIC_j = \sum_{rt} \sum_{m/z} I_j(rt, m/z)$$

and for the normalization factor:

$$\tilde{x}_{ij} = \frac{x_{ij}}{TIC_j}$$

where $I_j$ is the intensity of the pair (rt, m/z) for the j-th sample. Either the TIC of the whole sample or the TIC of the selected ions can be used. We preferred the latter, as the former includes all the noisy ions that have already been removed in the preprocessing step.
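As a concrete illustration, TIC normalization of a p × n intensity matrix can be sketched in a few lines (a minimal sketch; the function name and the rescaling by the mean TIC, which keeps values on the original intensity scale, are our own choices):

```python
import numpy as np

def tic_normalize(X):
    """Divide each sample (column) of a (p, n) intensity matrix by its
    total ion count, then rescale by the mean TIC to keep the original
    intensity scale."""
    tic = X.sum(axis=0)          # TIC_j: total intensity of run j
    return X / tic * tic.mean()
```

After normalization every column has the same total intensity, which is the stated assumption of the method.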
Median scale normalization [7] considers one LC-MS run as a reference; all other LC-MS runs are then scaled by the median of the ratio of their intensities to the reference:

$$\tilde{x}_{ij} = \frac{x_{ij}}{\mathrm{median}_i\left( x_{ij} / x_i^* \right)}$$

where $x_i^*$ is the i-th ion of the reference sample $x^*$. Similarly, $x_{ij}$ is the i-th ion of the j-th sample $x_j$. We modified this method by adopting rules to select the reference. The first option is to select one of the QC runs. In the second scenario, any sample may be selected as the reference. In both cases it is convenient to choose randomly, but we decided to include the option of selecting the reference sample based on the minimum number of missing values/outliers, where the outliers are detected based on projection statistics.
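A minimal sketch of median scale normalization, assuming the reference has already been chosen (here simply a column index; the missing-value/outlier-based reference selection described above is not implemented):

```python
import numpy as np

def median_scale(X, ref_idx=0):
    """Scale each run (column) of a (p, n) matrix by the median ratio
    of its intensities to the reference run's intensities."""
    ref = X[:, ref_idx]
    factors = np.median(X / ref[:, None], axis=0)  # one factor per run
    return X / factors
```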
Pretreatment methods, comprising three classes (class I, II, III), have been proposed for normalization of LC-MS data [11]. The choice of a pretreatment method depends on the properties of the data set and the biological problem to be explored. Class I is centering: the average of each ion is subtracted from all intensities of that ion to remove the offset. This is not enough when the data come from different distributions. Class II consists of five different scaling approaches (auto, range, Pareto, vast, and level scaling). Autoscaling is applied after centering by normalizing the data with a scale measure $s_i$, such as the standard deviation. Range scaling is defined as:

$$\tilde{x}_{ij} = \frac{x_{ij} - \bar{x}_i}{\max_j x_{ij} - \min_j x_{ij}}$$
Pareto scaling tries to reduce the relative importance of large values while keeping the data structure partially intact, and is calculated as $\tilde{x}_{ij} = (x_{ij} - \bar{x}_i)/\sqrt{s_i}$. The disadvantage of this method is its high sensitivity to outliers. If we are interested in ions with small fluctuations we may use vast scaling:

$$\tilde{x}_{ij} = \frac{x_{ij} - \bar{x}_i}{s_i} \cdot \frac{1}{CV_i}$$

where the coefficient of variation (CV) is defined as $CV_i = s_i / \bar{x}_i$. Level scaling focuses on relative responses and is suited for the identification of biomarkers:

$$\tilde{x}_{ij} = \frac{x_{ij} - \bar{x}_i}{\bar{x}_i}$$
The drawback of this method is inflation of the measurement errors. In class III, usually the centered log magnitude or the square root of the intensities is used to reduce the effect of different data distributions and to make skewed distributions more symmetric, but this class has difficulties with zero values and large variances.
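The class II scalings above can be illustrated with a small helper operating row-wise on a p × n matrix (a sketch; the `method` names are our own shorthand):

```python
import numpy as np

def scale(X, method="auto"):
    """Row-wise class II pretreatment scalings on a (p, n) matrix:
    'auto'   : (x - mean) / std
    'pareto' : (x - mean) / sqrt(std)
    'range'  : (x - mean) / (max - min)
    'level'  : (x - mean) / mean
    """
    m = X.mean(axis=1, keepdims=True)
    s = X.std(axis=1, ddof=1, keepdims=True)
    if method == "auto":
        return (X - m) / s
    if method == "pareto":
        return (X - m) / np.sqrt(s)
    if method == "range":
        rng = X.max(axis=1, keepdims=True) - X.min(axis=1, keepdims=True)
        return (X - m) / rng
    if method == "level":
        return (X - m) / m
    raise ValueError(method)
```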
Normalization based on internal standards is another popular approach for LC-MS data [12]. In this method, one or more internal standards with controlled concentrations are inserted, and normalization is performed based on the variation of these landmarks. If only one standard is available, one sample is considered the reference, and all samples are scaled by the ratio of the standard's intensity in the reference to the standard's intensity in each sample. This approach can be modified by selecting a robust value for the reference to avoid accepting outliers as standard ions. If there is more than one internal standard, two approaches are feasible. First, using a distance measure, we can find the closest standard to each ion and apply the previous method with one standard. Alternatively, it is possible to fit a regression model for the variation of the standards versus the order of ions and apply the result to all intensities. The drawback of this method is its cost, because internal standards with precise concentrations must be added in the sample preparation phase. This approach performs well when the correlation between an ion and the internal standard is not high; otherwise it is not meaningful to use it for normalization.
Quantile normalization is a nonparametric technique that was first introduced for normalization of Affymetrix gene expression arrays [10]. In [4, 13], this method was applied to proteomic data. The aim of the method is to make the distribution of intensities the same across all samples. It can be implemented in three steps by creating a mapping between ranks and values:

(1) Find the smallest value of each vector or array. Save the average or median of these values.

(2) Similarly, find the second smallest values, and so on up to the largest values of each vector or array. Save the averages or medians of these values.

(3) For each vector or array, replace the sorted actual values with these averages, and then restore the original order.
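The three steps can be sketched as follows (using the mean of the sorted columns as the reference distribution; ties are simply left to `argsort` order in this minimal version):

```python
import numpy as np

def quantile_normalize(X):
    """Quantile normalization of a (p, n) matrix: every column is given
    the same empirical distribution (the mean of the sorted columns),
    with values placed back in each column's original rank order."""
    order = np.argsort(X, axis=0)            # rank positions per column
    means = np.sort(X, axis=0).mean(axis=1)  # reference distribution
    Xn = np.empty_like(X, dtype=float)
    for j in range(X.shape[1]):
        Xn[order[:, j], j] = means           # restore original order
    return Xn
```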
MA transform linear regression normalization assumes that the systematic bias depends linearly on ion intensities. First, MA-transformed scatter plots are constructed as in Figure 2, with M and A defined as in [10]:

$$M = \log_2 x_{ij} - \log_2 x_{ik}$$

and

$$A = \frac{1}{2}\left( \log_2 x_{ij} + \log_2 x_{ik} \right)$$

where j ≠ k and j, k ∈ {1, .., n}. Two choices are then possible to proceed with normalization: (i) consider one sample, or the mean/median of all samples, as the reference; (ii) run the algorithm pairwise over all samples until convergence, i.e. until the change of the overall maximum intensity of the data set is less than a certain threshold in (7) and (8). It is common to define $x_k$ as the average (or, for a robust estimate, the median) of all samples. The algorithm then normalizes the log magnitude of the ion intensities by subtracting the peak ratios obtained from a least squares regression of M on A:

$$\log_2 \tilde{x}_{ij} = \log_2 x_{ij} - \hat{m}_{ij}$$

where $\hat{m}$ is the vector estimated from the regression model. The main idea of this method is to force the average log intensity difference of all peaks to zero.
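A minimal sketch of the linear variant, normalizing one run against a reference vector (both assumed strictly positive so the logarithms are defined; `ma_normalize` is a hypothetical name, not the authors' implementation):

```python
import numpy as np

def ma_normalize(x, ref):
    """MA-transform linear regression normalization of one run against a
    reference: fit a least-squares line m_hat(A) to the M-versus-A plot
    and subtract it from log2(x)."""
    lx, lr = np.log2(x), np.log2(ref)
    M = lx - lr
    A = (lx + lr) / 2.0
    b1, b0 = np.polyfit(A, M, 1)       # fitted bias m_hat = b0 + b1 * A
    return 2.0 ** (lx - (b0 + b1 * A))
```

For a run that is exactly a constant multiple of the reference, the fitted bias absorbs that factor and the normalized run matches the reference.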
MA transform local regression normalization is the same as MA transform linear regression, but instead of a linear model it applies piecewise linear or other nonlinear models, such as higher order polynomials and splines, to find the baseline curve of intensities. The locally weighted polynomial regression (LOESS) technique [14] has also been used to smooth the log magnitude of relative ion intensities versus the reference or in a pairwise manner (Figure 2).
Normalization based on QC consistency uses a different rationale for normalization of LC-MS data [15]. If a data set includes QC runs of a sample (e.g. a reference sample or a mixture of pooled samples) and the QC runs are interspersed among the other analyses, it is reasonable to expect less variation in ion intensities for the QCs compared to other samples. The consistency of each ion is then evaluated based on two criteria. Feature cleaning accepts an ion if it meets both criteria; otherwise it is rejected and not used in subsequent processing levels, including normalization:
where $CV_i^{QC}$ and $CV_i^{S}$ are the CVs of the i-th ion for the QCs and for all samples except the QCs, respectively (Figure 3). However, it is not clear how to select $TH_1$ and $TH_2$; they should be determined for each data set separately.
Normalization based on stable features [16] tries to find a subset of stable ions and then normalizes based on the variation of those ions in each sample. Taking $T_{CV}$ as the maximum acceptable threshold, the stable features are found as:

$$S = \left\{\, i : CV_i < T_{CV} \,\right\}$$
These features are used for normalization as follows:
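The selection step can be sketched as below. Note that the scaling rule used here (dividing each run by the median intensity of the stable ions relative to their overall median) is our own assumption for illustration, not necessarily the exact rule of [16]:

```python
import numpy as np

def stable_feature_normalize(X, qc_cols, t_cv=0.2):
    """Sketch of stable-feature normalization. Ions whose CV across the
    QC runs is below t_cv are taken as stable; each run is then divided
    by the median intensity of those ions relative to their grand median
    (this scaling rule is an assumption made for illustration)."""
    qc = X[:, qc_cols]
    cv = qc.std(axis=1, ddof=1) / qc.mean(axis=1)
    stable = cv < t_cv
    factors = np.median(X[stable], axis=0) / np.median(X[stable])
    return X / factors
```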
Normalization based on analysis order is one of the most recent approaches [2]. The main idea of this method is to model the variation of intensities versus the sample's run (injection) order in the experiment, and to remove this variation by applying a smoothing regression technique. This approach needs a set of reference samples to model the variation versus analysis order based on their deviation from the expected intensity values. However, this method has not been examined with a large data set; only a set of technical replicates was used in the work reported in [2]. Also, only animal samples with a small number of known proteins were used in the study. The authors reported that their normalization method outperformed all other existing methods in their experiment.
This method can be expanded to include all samples in an experiment by applying the smoothing algorithm without any reference ions or samples. However, it is difficult to decide how much smoothing is enough for a particular ion. For example, it is not clear how the proper choice of the span parameter can be determined for the LOESS algorithm (Figure 4). The authors proposed to use a certain range of span parameters in [0, 1] and to find the best value by visual inspection or by using quantitative measures such as the coefficient of variation (CV). In [15], cross-validation methods such as leave-one-out (LOO) are recommended to select the best value of the span parameter (Figure 5). It is suggested to consider a discrete subset of all possible values in [0, 1] and then perform cross-validation to find the best value only on this subset, which is not optimal.
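To make the span-selection idea concrete, here is a self-contained sketch: a simple tricube-weighted local linear smoother standing in for LOESS, with the span chosen by leave-one-out error over a discrete grid (all function names are ours, and the smoother is a simplification of full LOESS):

```python
import numpy as np

def loess_fit(x, y, x0, span):
    """Tricube-weighted local linear fit at x0, using the nearest
    ceil(span * n) points (a simple stand-in for LOESS)."""
    k = max(3, int(np.ceil(span * len(x))))
    d = np.abs(x - x0)
    idx = np.argsort(d)[:k]
    w = (1.0 - (d[idx] / d[idx].max()) ** 3) ** 3   # tricube weights
    b1, b0 = np.polyfit(x[idx], y[idx], 1, w=np.sqrt(w))
    return b0 + b1 * x0

def best_span(x, y, spans=(0.3, 0.5, 0.7, 0.9)):
    """Leave-one-out selection of the span over a discrete grid."""
    def loo_error(span):
        return sum((y[i] - loess_fit(np.delete(x, i), np.delete(y, i),
                                     x[i], span)) ** 2
                   for i in range(len(x)))
    return min(spans, key=loo_error)
```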
Proposed normalization method
One way to address the issue of selecting the smoothing parameter of the LOESS algorithm is to use a stochastic model and learn the parameters by optimization. We propose a stochastic model that corrects for drift using the analysis order in which the LC-MS data were generated. This method uses preprocessed data to perform normalization [6]. We then extend the method to apply normalization using scan-level data.
Normalization based on Gaussian process (GPRM)
Considering preprocessed data, we assume the intensities of an ion across different runs are observations from a Gaussian random process:

$$x^{(i)}(t) \sim \mathcal{GP}\left( \mu^{(i)}(t),\ k^{(i)}(t_{j_1}, t_{j_2}) \right) \qquad (11)$$

for $t \in \{t_1\, ..\, t_n\}$, the analysis order index, where the index i denotes the i-th ion. To use the model in (11), all we need is the covariance matrix of the process, $\Sigma^{(i)}$.
For a given ion with intensities x(t), we consider a parametric model for estimation of the covariance matrix using a kernel function k:

$$\Sigma^{(i)}_{j_1 j_2} = k\left( t_{j_1}, t_{j_2} \right)$$
It is reasonable to assume that the drift is highly correlated for adjacent runs, as it is caused by instrument variations in ion count measurements, and that drifts in ion intensities are less similar for distant runs. As the covariance matrix is positive definite, the kernel function must have certain properties to generate a valid covariance matrix as an estimate of the real covariance matrix. In [17] the Gaussian process $\left(\mathcal{G}\mathcal{P}\right)$ is used for regression and a series of valid kernel functions is introduced. We selected the Matern kernel, which enables us to control the desired smoothness by changing its parameter ν:

$$R(\tau) = \frac{2^{1-\nu}}{\Gamma(\nu)} \left( \frac{\sqrt{2\nu}\,|\tau|}{\ell} \right)^{\nu} K_{\nu}\!\left( \frac{\sqrt{2\nu}\,|\tau|}{\ell} \right)$$

where $\tau = t_{j_1} - t_{j_2}$, $\Gamma(\cdot)$ is the Gamma function, and $K_\nu(\cdot)$ is the modified Bessel function (Figure 6), so that $k(t_{j_1}, t_{j_2}) = \sigma^2 R(\tau)$. Here $\ell$ is the scale parameter, which determines the degree of correlation between runs based on their distance in terms of analysis order, and usually $\nu \in \{1/2, 3/2, 5/2\}$. To account for noise in our regression model, we include an additional term in the kernel function:

$$\tilde{k}(t_{j_1}, t_{j_2}) = k(t_{j_1}, t_{j_2}) + \sigma_{\epsilon}^2\,\delta_{j_1 j_2}$$

where $\sigma_{\epsilon}^2$ is the variance of the zero-mean Gaussian noise and $\delta_{j_1 j_2}$ is the Kronecker delta.
For a given ion, we assumed the intensities are observations of a $\mathcal{G}\mathcal{P}$; thus they follow a multivariate normal distribution, i.e. if $\mathbf{x} = [x(t_1)\ ..\ x(t_n)]^T$ then $\mathbf{x} \sim \mathcal{N}(\mu, \Sigma)$. It is reasonable to take $\mu(t) = \mu_0 + \mu_1 t$ as a linear function of time, which accounts for the linear component of the drift, while the nonlinear part is modelled by $\Sigma$. Therefore for each ion we have:

$$\mathbf{x} \sim \mathcal{N}\left( \mu(\theta), \Sigma(\theta) \right)$$

where $\theta = [\ell\ \sigma\ \sigma_{\epsilon}\ \mu_0\ \mu_1]^T$ is the vector of model parameters. We can estimate the parameters of the kernel from the covariance matrix by the maximum likelihood method. To maximize the likelihood function of θ, we minimize the negative log-likelihood $\mathcal{L}$:

$$\mathcal{L}(\theta) = \frac{1}{2}\log\left|\Sigma\right| + \frac{1}{2}\left(\mathbf{x} - \mu\right)^T \Sigma^{-1} \left(\mathbf{x} - \mu\right) + \frac{n}{2}\log 2\pi$$

Any optimization method can be used to find the maximum likelihood estimator, θ*. For example, using a gradient descent approach: $\theta(r+1) = \theta(r) - \lambda \nabla_{\theta}\mathcal{L}$, where r is the iteration index and 0 < λ < 1.
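As an illustration, the maximum likelihood fit can be sketched with a Matern ν = 3/2 kernel and a general-purpose optimizer (a sketch under our own choices, not the authors' implementation: Nelder-Mead is used instead of gradient descent, the first three parameters are optimized on a log scale to keep them positive, and a small jitter stabilizes the linear algebra):

```python
import numpy as np
from scipy.optimize import minimize

def matern32(t, ell, sigma):
    """Matern covariance with nu = 3/2 over analysis-order indices t."""
    a = np.sqrt(3.0) * np.abs(t[:, None] - t[None, :]) / ell
    return sigma ** 2 * (1.0 + a) * np.exp(-a)

def neg_log_lik(params, t, x):
    # ell, sigma, sigma_eps are kept positive via a (clipped) log transform
    ell, sigma, sig_eps = np.exp(np.clip(params[:3], -10.0, 10.0))
    mu0, mu1 = params[3], params[4]
    K = matern32(t, ell, sigma) + (sig_eps ** 2 + 1e-8) * np.eye(len(t))
    r = x - (mu0 + mu1 * t)          # residual after the linear drift term
    _, logdet = np.linalg.slogdet(K)
    return 0.5 * (logdet + r @ np.linalg.solve(K, r)
                  + len(t) * np.log(2.0 * np.pi))

def fit_gprm(t, x):
    """Maximum likelihood estimate of theta = [ell, sigma, sigma_eps, mu0, mu1]."""
    p0 = np.array([np.log(5.0), np.log(x.std()), np.log(0.1 * x.std()),
                   x.mean(), 0.0])
    res = minimize(neg_log_lik, p0, args=(t, x), method="Nelder-Mead",
                   options={"maxiter": 2000})
    return np.concatenate([np.exp(res.x[:3]), res.x[3:]])
```

The fitted mean and covariance can then be used to evaluate the drift curve at every analysis-order index, as described above for the QC runs.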
Once $\theta_{QC}^{*}$ is derived for $t_{QC} = \{t_1, .., t_{n_{QC}}\}$ for the $n_{QC}$ quality control samples, we can estimate the drift for t = 1, .., n. The estimated regression curve, which models the variability of the QCs, is used to correct the drift for the other samples, similar to what is done with the results of the LOESS algorithm. A concern here is the risk of overfitting, due to the presence of several parameters in the model while only the QC runs are used as the reference. To address this concern, we use regularization. The regularization term is defined as a function of the scale parameter of the kernel to reflect the consistency between measurements from the QC runs and those from the experimental samples. Therefore, we first find the scale parameter for the experimental samples using the same approach. Regularization is then performed by applying a constraint on the scale parameters, i.e. $\ell_{QC} > \ell_S$, and the Karush-Kuhn-Tucker (KKT) approach is used to find $\theta_{QC}^{*}$:

$$\theta_{QC}^{*} = \arg\min_{\theta}\ \mathcal{L}(\theta) \quad \text{subject to} \quad \ell_{QC} > \ell_S$$
So far, we explained the algorithm for a given ion. As we have multiple ions, we repeat the procedure for each ion separately to estimate $\theta^{(i)} = [\ell^{(i)}\ \sigma^{(i)}\ \sigma_{\epsilon}^{(i)}\ \mu_0^{(i)}\ \mu_1^{(i)}]^T$ for $x^{(i)} \sim \mathcal{N}(\mu^{(i)}, \Sigma^{(i)})$.
Gaussian Process Regression Model - Extracted Ion Chromatogram (GPRM-EIC)
All the methods introduced so far use a single peak intensity derived from each EIC (typically the area under the EIC curve). GPRM-EIC, in contrast, takes advantage of the scan-level ion intensities within an EIC (Figure 7). Specifically, GPRM-EIC looks into individual scans to monitor the variations based on analysis order (Figure 8). This models the changes observed in the scan-level intensities of the QC samples, which are then used to correct the ion intensities of the experimental samples. To apply the correction, we use the integral of the differences of the ion intensities before and after correction.
We therefore take the model in (11) and update it to include scan-level information:

$$x^{(i,s)}(t) \sim \mathcal{GP}\left( \mu(t),\ k(t_{j_1}, t_{j_2}) \right) \qquad (18)$$

for $t = t_1, .., t_n$, where the index i denotes the i-th ion and s the scan number. To use the model in (18), as with the previous model, all we need is the covariance matrix of the process, Σ. Here the $\ell$ and µ parameters are scalars.
To summarize, we first find the base peaks at the corresponding mass of each ion to form the EIC. This can be done with XCMS2 by finding regions of interest (ROI) [18] for the ions selected in preprocessing; alternatively, segmentation along the mass axis can be used as in the original XCMS [19]. Thereafter, looking into the raw data, each individual scan is used to model the drift based on analysis order, and the model is used to correct for the variation. Finally, the normalized peak intensities are used to recalculate the area under the EIC curve and update the ion intensities. One issue here is misalignment of the peaks: the drift in each scan may be partly due to retention time differences across samples. To correct for this, we simply align the first peak in each spectrum to match the scans across different samples.
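The alignment and area-update steps can be sketched as follows (a simplification of the procedure: `np.roll` wraps intensities around the ends of the EIC, and the per-scan drift vector is assumed to come from a model fitted on the QC runs):

```python
import numpy as np

def align_first_peak(eics):
    """Shift each EIC so its highest-intensity scan lines up with the
    first run's peak (the simple alignment described above)."""
    peaks = [int(np.argmax(e)) for e in eics]
    target = peaks[0]
    return [np.roll(e, target - p) for e, p in zip(eics, peaks)]

def corrected_area(eic, drift):
    """Subtract the per-scan drift, clip at zero, and recompute the peak
    intensity as the trapezoidal area under the EIC (unit scan spacing)."""
    y = np.maximum(eic - drift, 0.0)
    return float(np.sum((y[1:] + y[:-1]) / 2.0))
```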
2D Gaussian Process Regression Model - Extracted Ion Chromatogram (2D-GPRM-EIC)
In the method above, we analyzed each scan separately. It is also possible to consider a two-dimensional $\mathcal{G}\mathcal{P}$ that combines the information from the different scans of each peak across all QCs:

$$x^{(i)}(\mathbf{z}) \sim \mathcal{GP}\left( \mu(\mathbf{z}),\ k(\mathbf{z}_{j_1}, \mathbf{z}_{j_2}) \right)$$

where $\mathbf{z} = [t\ s]^T$ for analysis order $t = t_1, .., t_{n_{QC}}$ and scan $s = s_1^{(i)}, .., s_{S(i)}^{(i)}$. Here S(i) is the number of scans for ion i. With this model, we consider two different scales along the analysis order and scan time axes and use a 2D Gaussian process to model the variability.
LC-MS data
We used an in-house LC-MS data set to evaluate the normalization methods described in the previous sections. The data set is derived from three types of samples: cases, controls, and QCs. The samples were collected from adult patients at Tanta University Hospital, Tanta, Egypt. The participants consist of 40 hepatocellular carcinoma (HCC) cases and 50 patients with liver cirrhosis. A single blood sample was drawn by peripheral venepuncture into 10 mL BD Vacutainer sterile vacuum tubes without anticoagulant. The blood was immediately centrifuged at 1000 × g for 10 min at room temperature. The serum supernatant was carefully collected and centrifuged at 2500 × g for 10 min at room temperature. After aliquoting, serum was kept frozen at −80 °C until use.
The samples were run in two different batches, B1 and B2, and each batch included two sets of experiments, the "forward" (F) and the "reverse" (R) experiments (Figure 9). The forward order experiment includes all the samples in the batch, and the reverse order experiment includes the same samples, but with the run order reversed. Each batch includes 10 QC samples, 20 cases, and 25 controls. For each batch, the QCs were made by pooling case samples from the same batch, and a QC was injected after every fifth run. For cases and controls, injections alternated between samples from each group to reduce bias [20].
The data were acquired using ultra-performance liquid chromatography (UPLC) coupled with quadrupole time-of-flight (Q-TOF) MS in both positive and negative ionization modes. The raw data were preprocessed with the XCMS package [19]. Details of the data set can be found in [21].
Evaluation criteria
The main assumption of any normalization method is reproducibility of the experiment. Here we assume that at least a considerably large subset of the measured values is reproducible. A measured value refers to a single peak representing an ion's intensity across different runs. Since the QC runs are expected to be identical, their measurements can be used to estimate instrument variability and to assess the reproducibility of the experiment.
The coefficient of variation (CV) is commonly used as a measure of performance; it is defined as the ratio of the standard deviation to the absolute mean for the i th ion: CV _{ i } = σ_{i}/|µ_{i}|. As there is more than one ion to evaluate, the median CV for each group of samples, or the number of ions with CV below a certain threshold, is typically used to evaluate reproducibility. Based on this definition, we expect a decrease in CV, at least for QC runs, when applying any normalization method. However, using CV to evaluate normalization methods is tricky, as a smaller CV does not necessarily imply better normalization due to the presence of differentially abundant ions. Thus a method that ensures efficiency in finding true biological differences is desired. In this study, we evaluate the impact of a normalization method in terms of the decrease in within-QC variation in each batch of data. In addition, we use the number of statistically significant ions between cases and controls as a metric to compare normalization methods. We consider this measure only when the increase in the number of detected ions is consistent between batches with dependent and independent sets of samples. As we have duplicate experiments of the same phenotypes, we evaluated the performance of the normalization methods by cross validation between different experiments. A one-way repeated measures ANOVA model is used to investigate the performance of different methods based on the variation of QCs across different experiments:

$${x}_{ijk}^{QC}={\mu}_{ij}+{\zeta}_{ik}+{\epsilon}_{ijk},$$

where ${x}_{ijk}^{QC}$ is the intensity of the k th QC run for the i th ion in batch j, ${\mu}_{ij}$ is the mean intensity of ion i in batch j, ${\epsilon}_{ijk}$ is the error term, and ζ is the random effect so that $\forall i:{\mathbb{E}}_{k}\left[{\zeta}_{ik}\right]=0.$
A normalization method is evaluated on the basis of the number of ions with reduced variance of ζ _{ ik } . We evaluate this using the F test for the ratio of the sum of squares from ζ to the sum of squares of ε, the unexplained variation (error). To correct for multiple testing, we use q _{ ζ } < 0.1, where q is the FDR-adjusted p-value estimated using the Storey method [22].
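The per-ion CV criterion described above can be sketched as follows, using a synthetic QC intensity matrix (hypothetical values, not the study's data) with roughly 10% multiplicative noise; the 0.15 threshold is likewise illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n_ions, n_qc = 200, 10

# Synthetic QC intensity matrix: rows are ions, columns are QC runs
mu = rng.uniform(1e3, 1e5, size=(n_ions, 1))
X = mu * (1.0 + 0.1 * rng.standard_normal((n_ions, n_qc)))

# CV_i = sigma_i / |mu_i|, computed across QC runs for each ion
cv = X.std(axis=1, ddof=1) / np.abs(X.mean(axis=1))

median_cv = float(np.median(cv))          # summary over all ions
n_below = int(np.sum(cv < 0.15))          # ions under an illustrative threshold
```

A normalization method would be applied to `X` first, and the drop in `median_cv` (or the rise in `n_below`) across QC runs compared against the raw data.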
Furthermore, the number of statistically significant ions is compared for each data set before and after normalization. A two-way repeated measures ANOVA is used to analyze the combined forward and reverse experiments (e.g. Exp 1F & Exp 1R):

$${x}_{ijkl}={\mu}_{i}+{\alpha}_{ij}+{\beta}_{ik}+{\gamma}_{ijk}+{\xi}_{il}+{\epsilon}_{ijkl},$$

for ion i in sample l from group j in batch k. The group effect α and the batch effect β are included in the model, as well as the possible group-batch interaction γ, while ξ models the within-sample variation. Also, a two-way ANOVA is used to analyze the between-batch combinations (e.g. Exp 1F & Exp 2F):

$${x}_{ijkl}={\mu}_{i}+{\alpha}_{ij}+{\beta}_{ik}+{\gamma}_{ijk}+{\epsilon}_{ijkl}.$$

Ions with a significant group-batch interaction, i.e. q _{γ} < 0.1, were removed from the analysis, where q is the FDR-adjusted p-value estimated using the Storey method [22]. Significant ions are selected based on q _{ i,α } ≤ 0.1.
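The Storey q-value used for the FDR thresholds above can be sketched as follows. This is a minimal version with a single fixed λ = 0.5 for the π0 estimate, whereas Storey's full procedure (and the qvalue package) tunes λ, e.g. with a smoother over a grid of values.

```python
import numpy as np

def storey_qvalues(p, lam=0.5):
    """Minimal Storey (2002) q-values with a single tuning parameter lam.

    pi0 (the fraction of true nulls) is estimated from the p-values above
    lam; q-values are then step-up values scaled by pi0, made monotone.
    """
    p = np.asarray(p, dtype=float)
    m = p.size
    pi0 = min(1.0, float(np.mean(p > lam)) / (1.0 - lam))
    order = np.argsort(p)
    ranked = p[order]
    q = pi0 * m * ranked / np.arange(1, m + 1)
    # enforce monotonicity from the largest p-value downward
    q = np.minimum.accumulate(q[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.minimum(q, 1.0)
    return out
```

An ion would then be kept or dropped by comparing its q-value for the relevant effect (ζ, γ, or α) against the 0.1 cutoff.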
References
1. Listgarten J, Emili A: Statistical and computational methods for comparative proteomic profiling using liquid chromatography-tandem mass spectrometry. Mol Cell Proteomics 2005, 4:419–434. 10.1074/mcp.R500005-MCP200
2. Kultima K, Nilsson A, Scholz B, Rossbach UL, Fälth M, Andrén PE: Development and Evaluation of Normalization Methods for Label-free Relative Quantification of Endogenous Peptides. Molecular and Cellular Proteomics 2009, 8(10):2285–2295. 10.1074/mcp.M800514-MCP200
3. Tuli L, Ressom HW: LC-MS Based Detection of Differential Protein Expression. Journal of Proteomics and Bioinformatics 2009, 2(10):416–438. 10.4172/jpb.1000102
4. Callister SJ, Barry RC, Adkins JN, Johnson ET, Qian WJ, Webb-Robertson BJM, Smith RD, Lipton MS: Normalization Approaches for Removing Systematic Biases Associated with Mass Spectrometry and Label-Free Proteomics. J Proteome Res 2006, 5(2):277–286. 10.1021/pr050300l
5. Nezami Ranjbar M, Zhao Y, Tadesse M, Wang Y, Ressom H: Evaluation of normalization methods for analysis of LC-MS data. Bioinformatics and Biomedicine Workshops (BIBMW), 2012 IEEE International Conference on, 2012, 610–617.
6. Nezami Ranjbar M, Tadesse M, Wang Y, Ressom H: Normalization of LC-MS data using Gaussian process. Genomic Signal Processing and Statistics (GENSIPS), 2012 IEEE International Workshop on, 2012, 187–190.
7. Yang YH, Dudoit S, Luu P, Lin DM, Peng V, Ngai J, Speed TP: Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Research 2002, 30(4):e15. 10.1093/nar/30.4.e15
8. Huber W, von Heydebreck A, Sültmann H, Poustka A, Vingron M: Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics 2002, 18(Suppl 1):S96–S104.
9. Anderle M, Roy S, Lin H, Becker C, Joho K: Quantifying reproducibility for differential proteomics: noise analysis for protein liquid chromatography-mass spectrometry of human serum. Bioinformatics 2004, 20(18):3575–3582. 10.1093/bioinformatics/bth446
10. Bolstad BM, Irizarry RA, Åstrand M, Speed TP: A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 2003, 19(2):185–193. 10.1093/bioinformatics/19.2.185
11. van den Berg R, Hoefsloot H, Westerhuis J, Smilde A, van der Werf M: Centering, scaling, and transformations: improving the biological information content of metabolomics data. BMC Genomics 2006, 7:142. 10.1186/1471-2164-7-142
12. Sysi-Aho M, Katajamaa M, Yetukuri L, Oresic M: Normalization method for metabolomics data using optimal selection of multiple internal standards. BMC Bioinformatics 2007, 8:93. 10.1186/1471-2105-8-93
13. Higgs RE, Knierman MD, Gelfanova V, Butler JP, Hale JE: Comprehensive label-free method for the relative quantification of proteins from biological samples. Journal of Proteome Research 2005, 4(4):1442–1450. 10.1021/pr050109b
14. Cleveland WS: Robust Locally Weighted Regression and Smoothing Scatterplots. Journal of the American Statistical Association 1979, 74(368):829–836. 10.1080/01621459.1979.10481038
15. Dunn WB, Broadhurst D, Begley P, Zelena E, Francis-McIntyre S, Anderson N, Brown M, Knowles JD, Halsall A, Haselden JN, Nicholls AW, Wilson ID, Kell DB, Goodacre R: Procedures for large-scale metabolic profiling of serum and plasma using gas chromatography and liquid chromatography coupled to mass spectrometry. Nature Protocols 2011, 6(7):1060–1083. 10.1038/nprot.2011.335
16. Kamleh MA, Ebbels TMD, Spagou K, Masson P, Want EJ: Optimizing the Use of Quality Control Samples for Signal Drift Correction in Large-Scale Urine Metabolic Profiling Studies. Analytical Chemistry 2012, 84(6):2670–2677. 10.1021/ac202733q
17. Rasmussen CE, Williams CKI: Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning series). The MIT Press; 2005.
18. Benton HP, Wong DM, Trauger SA, Siuzdak G: XCMS2: Processing Tandem Mass Spectrometry Data for Metabolite Identification and Structural Characterization. Analytical Chemistry 2008, 80(16):6382–6389. [PMID: 18627180] 10.1021/ac800795f
19. Smith CA, Want EJ, O'Maille G, Abagyan R, Siuzdak G: XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification. Analytical Chemistry 2006, 78(3):779–787. 10.1021/ac051437y
20. Nezami Ranjbar MR, Wang Y, Ressom HW: Quality assessment of LC-MS metabolomic data. Bioinformatics and Biomedicine Workshops (BIBMW), 2011 IEEE International Conference on, 2011, 1034–1036.
21. Xiao JF, Varghese RS, Zhou B, Nezami Ranjbar MR, Zhao Y, Tsai TH, Di Poto C, Wang J, Goerlitz D, Luo Y, Cheema AK, Sarhan N, Soliman H, Tadesse MG, Ziada DH, Ressom HW: LC-MS Based Serum Metabolomics for Identification of Hepatocellular Carcinoma Biomarkers in Egyptian Cohort. Journal of Proteome Research 2012, 11(12):5914–5923.
22. Storey JD: A direct approach to false discovery rates. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 2002, 64(3):479–498. 10.1111/1467-9868.00346
Acknowledgements
This work was supported in part by the National Institutes of Health Grant R01CA143420. The LC-MS data presented in the manuscript were generated through support from the Proteomics and Metabolomics Shared Resource at the Lombardi Comprehensive Cancer Center.
Declarations
The publication costs for this article were funded by the corresponding author.
This article has been published as part of Proteome Science Volume 11 Supplement 1, 2013: Selected articles from the IEEE International Conference on Bioinformatics and Biomedicine 2012: Proteome Science. The full contents of the supplement are available online at http://www.proteomesci.com/supplements/11/S1.
Author information
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
MRNR designed the study, analyzed and interpreted the data, participated in the statistical analysis and drafted the manuscript. YZ participated in the statistical analysis. YW and MGT drafted the manuscript. HWR guided the study and drafted the manuscript. All authors read and approved the final manuscript.
Rights and permissions
About this article
Published
DOI
Keywords
 Extracted ion chromatogram (EIC)
 Evaluation
 Gaussian process
 Liquid chromatography-mass spectrometry (LC-MS)
 Normalization
 Quality control (QC)
 Scan-level data