README for GFAtoxgen

Author: Tommi Suvitaival, tommi.suvitaival@aalto.fi
16.4.2014


LICENSE

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.


CITING

When you use this pipeline, please cite the following paper:

Tommi Suvitaival, Juuso A. Parkkinen, Seppo Virtanen, and Samuel Kaski. Cross-organism toxicogenomics with group factor analysis. Submitted. 

When you use group factor analysis (GFA), please cite the publication referred to in the CCAGFA package, http://research.ics.aalto.fi/mi/software/CCAGFA/ . 


CONTENTS

demo-exploratory_analysis.R - Demo script that runs group factor analysis on a toxicogenomic data set and analyzes the result
COPYING.txt - License
README.txt - This file
sGFA.R - Group factor analysis source code
script-load_camda13_data.R - Pipeline script for loading the toxicogenomic data set


DOCUMENTATION

This package implements group factor analysis (GFA) with element-wise sparsity for factors and factor loadings, and provides a pipeline for cross-organism toxicogenomics. The package can be downloaded from http://research.ics.aalto.fi/mi/software/GFAtoxgen/ .

To run the cross-organism toxicogenomics demo:


1) Download and decompress the following data files from the CAMDA 2013 website:

   A) "TGP drug info and pathological findings (CSV, EXCEL format)". Available at: http://www.bioinf.jku.at/research/camda2013/tgp_info.zip

   B) "Study – rat in vivo single (CSV format), Collapsed replicates (19MB) 2088 samples, 12088 genes". Available at: http://www.bioinf.jku.at/research/camda2013/rat_invivo_single_collapsed_farms.zip

   C) "Study – rat in vitro single (CSV format), Collapsed replicates (13MB) 1570 samples, 18988 genes". Available at: http://www.bioinf.jku.at/research/camda2013/rat_in_vitro_collapsed_farms.zip

   D) "Study – human in vitro (CSV format), Collapsed replicates (8MB) 714 samples, 18988 genes". Available at: http://www.bioinf.jku.at/research/camda2013/human_in_vitro_collapsed_farms.zip
 
2) Run the data preparation script 'script-load_camda13_data.R' provided in this package.

   A) Set the working directory of R to the folder that contains the downloaded data files.
   B) Save the output variable 'data.camda.collapsed' into the subfolder 'data' of this package.

3) Run the demo 'demo-exploratory_analysis.R'. 

   A) Set the working directory of R to the folder of this package.
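
The three steps above can be sketched in R roughly as follows. This is a minimal sketch under several assumptions, not part of the package. The helper names 'fetch.camda', 'prepare.camda' and 'run.demo' are hypothetical. The output file name 'data.camda.collapsed.RData' is a guess, so check the demo script for the name it actually loads. The sketch also assumes that each archive unpacks directly into the data folder and that the loading script can be sourced without further configuration.

```r
# Sketch of steps 1-3. All function names, paths and the output file name
# below are assumptions for illustration only.

# URLs of the four CAMDA 2013 archives listed in step 1.
camda.urls <- c(
  "http://www.bioinf.jku.at/research/camda2013/tgp_info.zip",
  "http://www.bioinf.jku.at/research/camda2013/rat_invivo_single_collapsed_farms.zip",
  "http://www.bioinf.jku.at/research/camda2013/rat_in_vitro_collapsed_farms.zip",
  "http://www.bioinf.jku.at/research/camda2013/human_in_vitro_collapsed_farms.zip"
)

# Step 1: download each archive into 'data.dir' and decompress it there.
fetch.camda <- function(urls, data.dir) {
  dir.create(data.dir, showWarnings = FALSE, recursive = TRUE)
  for (url in urls) {
    zip.path <- file.path(data.dir, basename(url))
    if (!file.exists(zip.path)) {
      download.file(url, destfile = zip.path, mode = "wb")
    }
    unzip(zip.path, exdir = data.dir)
  }
  invisible(list.files(data.dir))
}

# Step 2: source the loading script with 'data.dir' as the working directory,
# then save 'data.camda.collapsed' into the 'data' subfolder of the package.
prepare.camda <- function(package.dir, data.dir,
                          out.file = "data.camda.collapsed.RData") {
  package.dir <- normalizePath(package.dir)
  old.wd <- setwd(data.dir)
  on.exit(setwd(old.wd), add = TRUE)  # restore the previous working directory
  env <- new.env()                    # the script's variables are created here
  source(file.path(package.dir, "script-load_camda13_data.R"), local = env)
  out.dir <- file.path(package.dir, "data")
  dir.create(out.dir, showWarnings = FALSE)
  out.path <- file.path(out.dir, out.file)
  save(list = "data.camda.collapsed", envir = env, file = out.path)
  invisible(out.path)
}

# Step 3: source the demo with the package folder as the working directory.
run.demo <- function(package.dir) {
  old.wd <- setwd(package.dir)
  on.exit(setwd(old.wd), add = TRUE)
  source("demo-exploratory_analysis.R")
}

# Example usage (the paths are placeholders):
# fetch.camda(camda.urls, "/path/to/camda13")
# prepare.camda("/path/to/GFAtoxgen", "/path/to/camda13")
# run.demo("/path/to/GFAtoxgen")
```

Each helper restores the previous working directory when it exits, so the calls can be made from any location. If the loading script expects to run in the global environment, sourcing it directly after 'setwd' may be the safer choice.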