Title: | Multi-Dimensional MA Normalization for Plate Effect |
---|---|
Description: | Normalize data to minimize the difference between sample plates (batch effects). For given data in a matrix and grouping variable (or plate), the function 'normn_MA' normalizes the data on MA coordinates. More details are in the citation. The primary method is 'Multi-MA'. Other fitting functions on MA coordinates can also be employed e.g. loess. |
Authors: | Mun-Gwan Hong |
Maintainer: | Mun-Gwan Hong <[email protected]> |
License: | GPL-3 |
Version: | 0.8.0 |
Built: | 2024-10-31 19:47:23 UTC |
Source: | CRAN |
Normalize data to minimize the differences among subgroups of samples that arise from an experimental factor such as multiple plates (batch effects).
- The primary method is Multi-MA, but other fitting functions (denoted f in the manuscript, e.g. loess) are also available.
This method is based on the assumptions stated below:
The geometric mean of the samples in each subgroup (or plate) for a single target is ideally the same as those of the other subgroups.
The subgroup (or plate) effects that influence those mean values across multiple observed targets depend on the values themselves (intensity-dependent effects).
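The two assumptions can be illustrated with a small base-R sketch. It is not the package's implementation: the simulated data, the names `X_log`, `rep_val`, `A` and `M`, and the simple "average over plates, deviation per plate" form of the MA coordinates are all assumptions made for illustration.

```r
# Simulated data: 3 plates x 20 samples, 5 targets (all values hypothetical)
set.seed(1)
n_per_plate  <- 20
plate        <- factor(rep(paste0("plate", 1:3), each = n_per_plate))
target_level <- c(6, 7, 8, 9, 10)          # log-scale level of each target
X_log <- sapply(target_level,
                function(mu) rnorm(length(plate), mean = mu, sd = 0.3))
colnames(X_log) <- paste0("target", 1:5)

# Add an intensity-dependent plate effect: the shift grows with target level
slope <- c(plate1 = -0.3, plate2 = 0, plate3 = 0.3)
X_log <- X_log + outer(unname(slope[as.character(plate)]), target_level - 8)

# Representative value of each target within each plate: the mean of the
# log values, i.e. the log of the geometric mean (rows = plates)
rep_val <- apply(X_log, 2, function(x) tapply(x, plate, mean, na.rm = TRUE))

# MA-style coordinates (assumed form): A is the average over plates for each
# target, M holds each plate's deviation from that average
A <- colMeans(rep_val)
M <- sweep(rep_val, 2, A)
```

Under the first assumption every entry of `M` would ideally be zero. Under the second, each row of `M` varies smoothly with `A`, which is what a fitting function on MA coordinates is meant to capture.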
normn_MA(mD, expGroup, represent_FUN= function(x) mean(x, na.rm= T), fitting_FUN= NULL, isLog= TRUE)
mD |
a |
expGroup |
a |
represent_FUN |
a |
fitting_FUN |
|
isLog |
TRUE or FALSE, if the normalization should be conducted after log-transformation. The affinity proteomics data from suspension bead arrays is recommended to be normalized using the default, |
The data after normalization in a matrix
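The examples in this manual pass `fitting_FUN` as a function of `(m_j, A)` that returns fitted values. The sketch below defines a hypothetical helper, `fit_quadratic`, with that signature and checks it on synthetic vectors only. It does not call `normn_MA`, and how the package consumes the returned values is an assumption drawn from those examples.

```r
# Hypothetical helper: quadratic fit of one plate's M values on A.
# The signature mirrors the fitting_FUN examples: function(m_j, A).
fit_quadratic <- function(m_j, A) {
  fit <- lm(m_j ~ A + I(A^2))
  unname(fitted(fit))        # one fitted value per target, in the order of A
}

# Synthetic check (values are made up for illustration)
set.seed(2)
A        <- seq(5, 12, length.out = 50)
m_j      <- 0.02 * (A - 8)^2 - 0.1 + rnorm(50, sd = 0.01)
fitted_m <- fit_quadratic(m_j, A)
```

With the package loaded, such a helper could presumably be supplied as `fitting_FUN= fit_quadratic`, in the same way as the loess and linear-regression examples.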
Mun-Gwan Hong <[email protected]>
Hong M-G, Lee W, Pawitan Y, Schwenk JM (201?) Multi-dimensional normalization of plate effects for multiplexed applications unpublished
data(sba)
B <- normn_MA(sba$X, sba$plate)    # Multi-MA normalization

# MA-loess normalization
B <- normn_MA(sba$X, sba$plate, fitting_FUN= function(m_j, A) loess(m_j ~ A)$fitted)

# weighted linear regression normalization
B <- normn_MA(sba$X, sba$plate, fitting_FUN= function(m_j, A) {
    beta <- lm(m_j ~ A, weights= 1/A)$coefficients
    beta[1] + beta[2] * A
})

# robust linear regression normalization
if(any(search() == "package:MASS")) {   # executable only when the MASS package is loaded
    B <- normn_MA(sba$X, sba$plate, fitting_FUN= function(m_j, A) {
        beta <- rlm(m_j ~ A, maxit= 100)$coefficients
        beta[1] + beta[2] * A
    })
}
Data that resemble suspension bead array data.
data(sba)
A list that consists of "plate", which is a factor of plate numbers, and "X", which contains the measured values, where columns are targets and rows are samples (or observations).
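For readers who want to experiment without the packaged data, the layout described above can be mimicked with a hypothetical stand-in. The object `toy`, its dimensions (4 plates of 96 samples, 10 targets) and the log-normal values are assumptions for illustration, not the contents of `sba`.

```r
# Hypothetical stand-in with the same layout as `sba`: a list holding a
# factor `plate` and a numeric matrix `X` (rows = samples, columns = targets)
set.seed(3)
n_plates <- 4
n_per_plate <- 96
n_targets <- 10
toy <- list(
  plate = factor(rep(seq_len(n_plates), each = n_per_plate)),
  X = matrix(exp(rnorm(n_plates * n_per_plate * n_targets, mean = 7, sd = 0.5)),
             nrow = n_plates * n_per_plate,
             dimnames = list(NULL, paste0("target", seq_len(n_targets))))
)

# Basic layout checks: one plate label per row of X
stopifnot(is.factor(toy$plate), is.matrix(toy$X),
          nrow(toy$X) == length(toy$plate))

# Geometric mean of every target within every plate (rows = plates)
gm <- apply(toy$X, 2, function(x) {
  tapply(x, toy$plate, function(v) exp(mean(log(v))))
})
```

Because `toy` has no simulated plate effect, the rows of `gm` should differ only by sampling noise.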
data(sba)
# plot to check difference of geometric mean of every target between plates
sba_gm <- by(sba$X, sba$plate, apply, 2, function(x) exp(mean(log(x))))
par(mfrow= c(2, 3))
apply(combn(4, 2), 2, function(ea) {
    plot(sba_gm[[ea[1]]], sba_gm[[ea[2]]],
         xlab= names(sba_gm)[ea[1]], ylab= names(sba_gm)[ea[2]],
         log= "xy", asp= 1)
    abline(0, 1, col= "cadetblue")
})

# show first 10 observations in plate 1 and plate 2
print(sba$X[c(1:10, 97:106), 1:10])