README for GFAtoxgen

Author: Tommi Suvitaival, tommi.suvitaival@aalto.fi
16.4.2014


LICENSE

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.


CITING

When you use this pipeline, please cite the following paper:

Tommi Suvitaival, Juuso A. Parkkinen, Seppo Virtanen, and Samuel Kaski.
Cross-organism toxicogenomics with group factor analysis. Submitted.

When you use group factor analysis (GFA), please cite the publication
referred to in the CCAGFA package,
http://research.ics.aalto.fi/mi/software/CCAGFA/ .


CONTENTS

demo-exploratory_analysis.R - Demo script for running group factor analysis for a toxicogenomic data set and for analyzing the result
COPYING.txt - License
README.txt - This file
sGFA.R - Group factor analysis source code
script-load_camda13_data.R - Pipeline script for loading the toxicogenomic data set


DOCUMENTATION

This package implements the group factor analysis (GFA) with element-wise
sparsity for factors and factor loadings, and presents a pipeline for
cross-organism toxicogenomics.

The package can be downloaded from
http://research.ics.aalto.fi/mi/software/GFAtoxgen/ .

To run the cross-organism toxicogenomics demo:

1) Download and decompress the following data files from the CAMDA 2013 website:
   A) "TGP drug info and pathological findings (CSV, EXCEL format)".
      Available at: http://www.bioinf.jku.at/research/camda2013/tgp_info.zip
   B) "Study – rat in vivo single (CSV format), Collapsed replicates (19MB) 2088 samples, 12088 genes".
      Available at: http://www.bioinf.jku.at/research/camda2013/rat_invivo_single_collapsed_farms.zip
   C) "Study – rat in vitro single (CSV format), Collapsed replicates (13MB) 1570 samples, 18988 genes".
      Available at: http://www.bioinf.jku.at/research/camda2013/rat_in_vitro_collapsed_farms.zip
   D) "Study – human in vitro (CSV format), Collapsed replicates (8MB) 714 samples, 18988 genes".
      Available at: http://www.bioinf.jku.at/research/camda2013/human_in_vitro_collapsed_farms.zip
2) Run the data preparation script 'script-load_camda13_data.R' provided in this package.
   A) Define the path of the acquired data files as the working directory of R.
   B) Save the output variable 'data.camda.collapsed' into the subfolder 'data' of this package.
3) Run the demo 'demo-exploratory_analysis.R'.
   A) Define the path of this package as the working directory of R.
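

EXAMPLE

Steps 1-3 above can be sketched in R roughly as follows. This is only an
illustration, not part of the package: the helper names 'fetch.camda.files'
and 'run.toxgen.demo' are hypothetical, as is the output file name
'data.camda.collapsed.RData' (this README names only the subfolder 'data',
so check the demo script for the file name it actually loads). The sketch
also assumes that the preparation script reads the decompressed files
directly from the working directory and leaves 'data.camda.collapsed' in
the global workspace.

```r
# URLs of the four CAMDA 2013 archives, copied from step 1 above.
camda.urls <- c(
  info          = "http://www.bioinf.jku.at/research/camda2013/tgp_info.zip",
  rat.invivo    = "http://www.bioinf.jku.at/research/camda2013/rat_invivo_single_collapsed_farms.zip",
  rat.invitro   = "http://www.bioinf.jku.at/research/camda2013/rat_in_vitro_collapsed_farms.zip",
  human.invitro = "http://www.bioinf.jku.at/research/camda2013/human_in_vitro_collapsed_farms.zip"
)

# Step 1 (hypothetical helper): download each archive into 'dir.data'
# and decompress it there. Archives already present are not re-downloaded.
fetch.camda.files <- function(urls, dir.data) {
  dir.create(dir.data, showWarnings = FALSE, recursive = TRUE)
  for (url in urls) {
    path.zip <- file.path(dir.data, basename(url))
    if (!file.exists(path.zip)) {
      download.file(url, destfile = path.zip, mode = "wb")
    }
    unzip(path.zip, exdir = dir.data)
  }
  invisible(file.path(dir.data, basename(urls)))
}

# Steps 2 and 3 (hypothetical helper): prepare the data, save it into the
# 'data' subfolder of the package, and run the demo.
run.toxgen.demo <- function(dir.package, dir.data) {
  dir.package <- normalizePath(dir.package, mustWork = TRUE)
  dir.data <- normalizePath(dir.data, mustWork = TRUE)
  dir.old <- setwd(dir.data)   # step 2A: data folder as working directory
  on.exit(setwd(dir.old))      # restore the original working directory
  # source() evaluates in the global environment by default, so the output
  # variable is assumed to end up there.
  source(file.path(dir.package, "script-load_camda13_data.R"))
  # Step 2B: the output file name below is an assumption.
  dir.out <- file.path(dir.package, "data")
  dir.create(dir.out, showWarnings = FALSE)
  save(list = "data.camda.collapsed", envir = globalenv(),
       file = file.path(dir.out, "data.camda.collapsed.RData"))
  setwd(dir.package)           # step 3A: package folder as working directory
  source("demo-exploratory_analysis.R")
}

# Example calls (paths are placeholders):
# fetch.camda.files(camda.urls, "path/to/camda13")
# run.toxgen.demo("path/to/GFAtoxgen", "path/to/camda13")
```

The example calls are left commented out because they need network access
and the full data set.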