Description:
-------------
This software package implements the Discriminative Component Analysis 
(DCA) method described in the paper Discriminative Components of Data, 
except that conjugate gradient optimization is used here instead of the 
original stochastic gradient optimization. When you use the code, 
please cite the following paper:

   Jaakko Peltonen and Samuel Kaski. Discriminative Components of
   Data. IEEE Transactions on Neural Networks, 16:68-83, 2005.

This is experimental software provided as is; we welcome any comments 
and corrections but cannot give any guarantees about the code.

If you have any questions, suggestions, or bug reports, please direct 
them to: 
   Jaakko Peltonen
   Aalto University School of Science and Technology,
     Department of Information and Computer Science,
     P.O. Box 15400, FI-00076 Aalto, FINLAND
   email jaakko.peltonen@tkk.fi 
   webpage http://www.cis.hut.fi/jtpelto/

Copyright Jaakko Peltonen, Janne Sinkkonen and Samuel Kaski.

This code uses an implementation of conjugate gradient based on the
description in J. Shewchuk, An Introduction to the Conjugate Gradient
Method Without the Agonizing Pain, 1994.
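For readers unfamiliar with the method, the core conjugate gradient
update from Shewchuk's tutorial can be sketched in a few lines of
Python. Note that this toy solver handles the linear case A x = b only;
the DCA code applies the nonlinear variant with secant-method line
searches, so this sketch illustrates the iteration, not the package's
actual implementation:

```python
import numpy as np

def conjugate_gradient(A, b, n_iter=50, tol=1e-10):
    """Solve A x = b for symmetric positive definite A (linear CG)."""
    x = np.zeros_like(b)
    r = b - A @ x          # residual
    d = r.copy()           # search direction
    delta = r @ r
    for _ in range(n_iter):
        if delta < tol:
            break
        q = A @ d
        alpha = delta / (d @ q)           # exact step length along d
        x += alpha * d
        r -= alpha * q
        delta_new = r @ r
        d = r + (delta_new / delta) * d   # conjugate direction update
        delta = delta_new
    return x
```

In exact arithmetic the iteration converges in at most as many steps as
the problem has dimensions, which is why it suits the gradient-based
optimization used here.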


License:
---------
This software package is licensed under the GNU Lesser General Public 
License version 2.1; see the file LICENSE for the full license terms.


Requirements:
--------------
This software package is implemented as a Python-language script which
calls a C-language library.

To use this software package, you will need a C compiler, an
installation of Python (available from www.python.org), and the
Numeric numerical package (available from http://numpy.scipy.org;
Numeric has since been succeeded by NumPy).


Installation:
--------------
Compile the C-language library by typing 'make' at the command line.

The makefile is meant for Linux/Unix systems, but it should be
possible to adjust the compilation for a Windows environment.



Usage:
-------
The program is run from the command line as follows (all on one line):

    python learn_dca.py input_datafile n_input_dim n_classes output_filename n_output_dim random_seed n_of_iterations n_of_secants pretransformation init_projection sigma use_angle_reparameterization

Here the parameters are:
    input_datafile    A text file containing the data points as a matrix
                      of space separated values. Each row is one data
                      point. The first columns are the input dimensions
                      (n_input_dim columns); the last columns are binary
                      class indicators (n_classes columns) where each column
                      indicates whether the data point belongs to that class.
    n_input_dim       How many dimensions the data has, not counting the
                      class indicator columns.
    n_classes         How many different classes there are in the data.
    output_filename   The projection matrix will be written to this text file.
                      The output will be a matrix of size (n_input_dim rows,
                      n_output_dim columns), with space separated values.
    n_output_dim      How many dimensions the projected data should have.
    random_seed       Value of a random seed. Currently not used.
    n_of_iterations   Number of conjugate gradient iterations to use in the
                      optimization.
    n_of_secants      Number of secant steps to use in each conjugate gradient
                      iteration.
    pretransformation A text file containing an initial transformation
                      matrix that is applied to the data before
                      optimization: this must be a full square
                      (n_input_dim by n_input_dim) matrix.
    init_projection   A text file containing the initial projection matrix
                      used as the starting point of the optimization. The
                      matrix should have n_input_dim rows and n_output_dim
                      columns, with space separated values.
    sigma             Sigma (Gaussian standard deviation) used in the
                      nonparametric class density estimate. This should be
                      a positive value: roughly speaking, larger values yield
                      'softer' estimates that change more slowly between
                      points. See the Peltonen and Kaski 2005 paper.
    use_angle_reparameterization   Whether to learn an orthogonal projection 
                      (give value 1 for this parameter) or an unrestricted 
                      projection (give value 0).
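
As an illustration of the input_datafile layout, the following sketch
writes a small random data set with n_input_dim data columns followed by
one-hot class indicator columns. The file name and sizes are examples
only, not part of the package:

```python
import numpy as np

n_points, n_input_dim, n_classes = 100, 5, 3
rng = np.random.default_rng(0)

X = rng.standard_normal((n_points, n_input_dim))    # input dimensions
labels = rng.integers(0, n_classes, size=n_points)  # class of each point
indicators = np.zeros((n_points, n_classes))
indicators[np.arange(n_points), labels] = 1         # one-hot class columns

# Each row: n_input_dim data columns followed by n_classes binary class
# indicator columns, space separated, as expected in input_datafile.
np.savetxt("mydata_example.txt", np.hstack([X, indicators]), fmt="%g")
```

Each row of the resulting file has exactly one '1' among its last
n_classes columns, marking the class of that data point.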

The script learn_dca.sh gives an example of how to run the DCA code. It
uses a simple toy data set with 10 classes in which only input
dimensions 3 and 4 affect the conditional class probabilities. The file
mydata.txt contains the data, initial_pretransformation.txt and
initial_projection.txt contain the initializations, and make_toydata.m
is a Matlab script used to generate the data set and the initialization
files.
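
Once learn_dca.py has written the projection matrix to output_filename,
applying it to the data is ordinary matrix multiplication. A minimal
sketch, using synthetic stand-ins in place of actual files (the helper
name and shapes here are illustrative, not part of the package):

```python
import numpy as np

def project_data(datafile_rows, projection):
    """Apply a learned projection W (n_input_dim x n_output_dim) to data.

    datafile_rows: array in the input_datafile layout, i.e. n_input_dim
    data columns followed by class indicator columns (dropped here).
    """
    n_input_dim = projection.shape[0]
    X = datafile_rows[:, :n_input_dim]
    return X @ projection

# With real files this would be, e.g.:
#   rows = np.loadtxt('mydata.txt'); W = np.loadtxt(output_filename)
# Here we use synthetic stand-ins of matching shapes instead.
rows = np.random.default_rng(1).standard_normal((20, 5 + 10))  # 5 dims, 10 classes
W = np.eye(5)[:, 2:4]   # picks out dimensions 3 and 4, as in the toy example
Y = project_data(rows, W)
```

Here Y has one row per data point and n_output_dim columns; with the
identity-like W above, it simply recovers input dimensions 3 and 4.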

