specificVar {drCCA}    R Documentation

Data-specific variation retained in the combined drCCA representation

Description

A function for estimating the amount of data-set specific variation (i.e. variation that is not present in any of the other data sets) retained in the combined data set of given dimensionality.

Usage

specificVar(datasets,regcca,dim,pca=FALSE)

Arguments

datasets A list containing the data matrices to be combined. Each matrix needs to have the same number of rows (samples), but the number of columns (features) can differ. Each row needs to correspond to the same sample in every matrix.
regcca Output of regCCA function, containing the solution of the generalized CCA.
dim The number of dimensions of the projected data to be used.
pca A logical variable with default value FALSE. If the value is TRUE, the data-specific variation will also be calculated for the PCA projected data, where PCA is performed on the columnwise concatenation of the given data sets.
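
The input requirements and the PCA comparison described above can be sketched in base R. This is only an illustration under stated assumptions: the toy matrices `x1` and `x2` and the variable `n_dim` are hypothetical, and `prcomp` with centering but no scaling is an assumed stand-in, since the preprocessing that `specificVar` applies internally is not documented here.

```r
# Hypothetical toy data: 20 samples, two feature sets of different width.
set.seed(1)
x1 <- matrix(rnorm(20 * 5), nrow = 20)
x2 <- matrix(rnorm(20 * 3), nrow = 20)
datasets <- list(x1, x2)

# Every matrix must have the same number of rows (samples).
n_rows <- vapply(datasets, nrow, integer(1))
stopifnot(length(unique(n_rows)) == 1)

# Columnwise concatenation of the data sets, as described for pca = TRUE.
concatenated <- do.call(cbind, datasets)

# PCA projection onto the first n_dim components.
# Assumption: centering without scaling; the package may differ.
n_dim <- 2
pca_fit <- prcomp(concatenated, center = TRUE, scale. = FALSE)
pca_projection <- pca_fit$x[, seq_len(n_dim), drop = FALSE]
```

The row check fails early when the matrices disagree on the number of samples. It cannot verify that rows refer to the same sample in every matrix, so that remains the caller's responsibility.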

Details

The function estimates the amount of data-specific information retained in a previously calculated drCCA solution. It uses SVD to estimate the variance of each data set in the drCCA projection of the given dimensionality. Data-specific variance is defined as the sum of the singular values of the covariance matrix of a data set. The value is normalized so that the variation of each original data set is 1. The average of the data-specific variances in the projection is also calculated.

For a solution that truly focuses on the dependencies, this value usually grows roughly linearly as the number of dimensions increases. The function can also estimate the same quantity for a simple PCA projection of the concatenation of the data sets, which serves as a comparison value. For details, see the reference (Tripathi et al., 2008).
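
The variation measure and its normalization can be illustrated with a minimal base R sketch. The helper names `total_variation` and `retained` are hypothetical, the data are random, and the reduced representation is a rank-k reconstruction of a single matrix from its own principal components. It is not the drCCA projection and is not claimed to match the package's internal computation.

```r
# Sum of the singular values of the covariance matrix of a data set.
total_variation <- function(x) sum(svd(cov(x))$d)

# Hypothetical data: 50 samples, 4 features.
set.seed(1)
x <- matrix(rnorm(50 * 4), nrow = 50)

# A covariance matrix is symmetric positive semi-definite, so this sum
# equals its trace, i.e. the sum of the column variances.
stopifnot(isTRUE(all.equal(total_variation(x), sum(apply(x, 2, var)))))

# Stand-in for a reduced representation (assumption): rank-k reconstruction
# of x, with the retained variation normalized by that of the original data.
retained <- function(x, k) {
  xc <- scale(x, center = TRUE, scale = FALSE)
  s <- svd(xc)
  idx <- seq_len(k)
  xk <- s$u[, idx, drop = FALSE] %*%
    diag(s$d[idx], nrow = k) %*%
    t(s$v[, idx, drop = FALSE])
  total_variation(xk) / total_variation(x)
}

# Normalized variation retained for k = 1, ..., 4 dimensions.
ratios <- vapply(1:4, function(k) retained(x, k), numeric(1))
```

With this normalization the ratio reaches 1 once all dimensions are kept, which corresponds to the convention that the variation of each original data set is 1.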

Value

The function returns a list with the following components:

cc Data-specific variation for a drCCA projection of the given number of dimensions
pc A vector containing the data-specific variations for a PCA projection of the given number of dimensions; returned if pca = TRUE
mcca Mean of the data-specific variations for a drCCA projection
mpca Mean of the data-specific variations for a PCA projection; returned if pca = TRUE

Author(s)

Abhishek Tripathi, Arto Klami

References

Tripathi A., Klami A., Kaski S. (2008), Simple integrative preprocessing preserves what is shared in data sources, BMC Bioinformatics.

See Also

sharedVar

Examples


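       # Load the two example data sets
       data(expdata1)
       data(expdata2)

       # Solve the generalized CCA for the two data sets
       r <- regCCA(list(expdata1,expdata2))

       # Data-specific variation retained in a 4-dimensional drCCA projection
       specificVar(list(expdata1,expdata2),r,4)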
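
As a possible extension of the example, the sketch below requests the PCA comparison and collects the mean data-specific variation over several dimensionalities, to inspect the roughly linear growth mentioned in Details. It uses only the calls and return components documented on this page. The variable names are hypothetical, the range 1 to 4 is an assumption based on the example above, and the code is skipped when the drCCA package is not installed.

```r
mean_specific <- NULL

if (requireNamespace("drCCA", quietly = TRUE)) {
  library(drCCA)
  data(expdata1)
  data(expdata2)
  datasets <- list(expdata1, expdata2)
  r <- regCCA(datasets)

  # Also compute the PCA comparison values (components pc and mpca).
  res <- specificVar(datasets, r, 4, pca = TRUE)
  print(res$mcca)
  print(res$mpca)

  # Mean data-specific variation of the drCCA projection for 1 to 4 dimensions.
  mean_specific <- sapply(1:4, function(d) specificVar(datasets, r, d)$mcca)
  print(mean_specific)
}
```

Comparing `res$mcca` with `res$mpca` shows how much data-specific variation the drCCA projection retains relative to plain PCA at the same dimensionality.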


[Package drCCA version 1.0 Index]