| RPA.iteration {RPA} | R Documentation |
Finds a mode for the model parameters d (estimated true signal underlying probe-level observations), and sigma2 (probe-specific variances).
RPA.iteration(S, sigma2.guess, epsilon = 10^(-3),
alpha = NULL, beta = NULL,
sigma2.method = "var", d.method = "fast")
S |
Matrix of probe-level observations for a single probeset: samples x probes. |
sigma2.guess |
Initial values for probe-specific variances. |
epsilon |
Convergence tolerance. The iteration is deemed converged when the change in all parameters is < epsilon. |
alpha, beta |
Vectors that specify priors for inverse Gamma distribution of probe-specific variances. Noninformative prior is obtained with alpha, beta -> 0. Used only when sigma2.method = "basic". |
sigma2.method |
Optimization method for sigma2 (probe-specific variances).
"basic": optimization using user-specified alpha, beta priors. Default: alpha, beta = 1e-6. "var": utilizes the fact that the cost function converges to variance with large sample sizes. Default method. |
d.method |
Method to optimize d.
"basic": optimization scheme to find a mode used in Lahti et al. TCBB/IEEE; relatively slow; this is the preferred method with small sample sizes. "fast": weighted mean over the probes, weighted by probe variances The solution converges to this with large sample size. Default method. |
Assuming data set S with P observations of signal d with Gaussian noise that is specific for each observation (specified by a vector sigma2 of length P), this method gives a point estimate of d and sigma2. Note that probe-level variance priors alpha, beta are used only when sigma2.method = "basic". The sigma2.method = "var" assumes non-informative priors. The d.method = "fast" is the preferred method for point computing point estimates when sample size is large. It computes the average over probe-level observations, weighted by the inverse probe-specific variances, and is expected to be more robust and faster than d.method="basic" that finds point estimate for d by directly optimizing the posterior distribution.
A list with the following elements:
d |
A vector. Estimated 'true' signal underlying the noisy probe-level observations. |
sigma2 |
A vector. Estimated variances for each measurement (or probe). |
Leo Lahti <leo.lahti@tkk.fi>
Probabilistic Analysis of Probe Reliability in Differential Gene Expression Studies with Short Oligonucleotide Arrays. Lahti et al., TCBB/IEEE, to appear. See http://www.cis.hut.fi/projects/mi/software/RPA/
require(affydata)
data(Dilution) # load affybatch
# select array against which differential expression is computed the
# method has appeared to be robust for the choice of cind
cind <- 1
# Preprocess probe-level data
Smat <- RPA.preprocess(Dilution,cind)
# Pick probe-level data for one probe set
set <- "1000_at"
pmindices <- pmindex(Dilution, set)[[1]]
S <- t(Smat$fcmat[pmindices, ])
# Set initial values for probe-specific variances
sigma2.guess <- rep(10,ncol(S))
# assume non-informative prior and large sample size
# and se fast optimization
res <- RPA.iteration(S, sigma2.guess,
epsilon = 10^(-3),
sigma2.method = "var",
d.method = "fast")
par(mfrow=c(1,2))
barplot(res$d,names.arg=paste("Array",seq(ncol(exprs(Dilution)))[-cind]),
main="d",xlab="Arrays/samples",ylab="Differential expression",las=2)
barplot(res$sigma2,las=2,names.arg=paste(set,seq(ncol(S))),
main="sigma2",xlab="Probes",ylab="Variance",las=2)
# Generate toy data where the ground truth is known
# 300 arrays x 11 probes
set.seed(24)
d.true <- rnorm(300,0,4)
sigma2.true <- c(1,1,1,1,2,2,3,3,3,4,6)
S <- NULL
for (s2 in sigma2.true) {
S <- cbind(S,rnorm(length(d.true),mean=d.true,sd=sqrt(s2)))
}
# Initial guess for sigma2
sigma2.guess <- rep(1,ncol(S))
# assuming non-informative priors and large sample size
res <- RPA.iteration(S, sigma2.guess,
epsilon = 10^(-3),
sigma2.method = "var",
d.method = "fast")
barplot(res$sigma2,ylab="Estimated var", xlab="Probes",main="sigma2")