cd-CAP

System Requirements

  • make (version 3.81 or higher)
  • g++ (GCC version 4.1.2 or higher)
  • IBM ILOG CPLEX Optimization Studio

Compiling cd-CAP

In the Makefile, set CPLEXROOT to the path of your root CPLEX folder.

The Makefile is set up for GCC 6.2. If you are using GCC version 4.x, add the -std=gnu++0x flag to CCC in the Makefile.

Run make in the root cd-CAP folder to build the executables.

If CPLEX is not available on your system, run make only_mcsi instead. This compiles only the mcsi binary (the single-network mode of cd-CAP).

Running mcsc and mcsi

Usage:

./mcsc -n [network] -l [alteration profiles] -c [chromosome information; optional] -r [min number of colors in subnetwork] -x [exclude genes; optional] -s [maximum subnetwork size] -t [minimum subgraph recurrence] -k [number of subnetworks] -e [error; optional] -f [output folder] -d [threads] -h [time limit in seconds]
./mcsi -p (for p-value simulation; optional) -n [network] -l [alteration profiles] -r [color options in subnetwork] -s [maximum subnetwork size] -t [minimum subgraph recurrence] -e [error; optional]

    Parameter  Description for MCSC                         Description for MCSI
    -n         network file                                 network file
    -l         alterations file                             alterations file
    -c         (optional) gene-to-chromosome map            N/A
    -x         (optional) excluded genes                    N/A
    -f         output folder name                           N/A (mcsi has a single output file)
    -r         minimum number of colors in each subnetwork  color requirement of the maximum subnetwork
    -s         maximum subnetwork size                      maximum subnetwork size
    -t         minimum sample recurrence                    minimum sample recurrence
    -k         number of resulting subnetworks              N/A
    -e         (optional) allowed extension error rate      (optional) allowed extension error rate
    -d         number of threads used for ILP solver        N/A
    -h         time limit in seconds for ILP solver         N/A
    -p         N/A                                          (optional, without arguments) p-value simulation mode

-n :    This parameter is an edge list file in which each row represents one edge as two node names separated by whitespace. All edges are treated as undirected. There is no header row. For example:

    A1BG    CRISP3
    A1CF    APOBEC1
    A2M     ABCA1
    ...
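
As a quick sanity check before running mcsc or mcsi, the edge file format described above can be validated with a short script. This is a minimal sketch and not part of cd-CAP: the function name `read_edges` is hypothetical, and it assumes that "undirected" means `A B` and `B A` denote the same edge.

```python
def read_edges(lines):
    """Parse edge rows ("NodeA NodeB", whitespace-separated, no header).

    Returns a set of frozensets so that "A B" and "B A" collapse into a
    single undirected edge. Raises ValueError if a non-blank row does not
    contain exactly two fields.
    """
    edges = set()
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:  # ignore blank lines
            continue
        if len(fields) != 2:
            raise ValueError(
                f"line {lineno}: expected 2 node names, got {len(fields)}"
            )
        edges.add(frozenset(fields))
    return edges


# Rows taken from the example above, plus one reversed duplicate.
sample = [
    "A1BG    CRISP3",
    "A1CF    APOBEC1",
    "A2M     ABCA1",
    "CRISP3  A1BG",
]
edges = read_edges(sample)
print(len(edges))  # 3
```

To check a real file, pass an open file handle instead of the list, e.g. `read_edges(open(path))`.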

-l :    This parameter is a file describing the alterations in all input samples, one “SampleID Gene AltType” row per alteration. Up to 64 different alteration types (the third column) are currently supported. There is no header row. For example:

    T294    CCNL2   SNV
    T294    PTCHD2  SNV
    T294    COL16A1 SNV
    ...
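
The alteration file can be checked in the same spirit, including the limit of 64 alteration types. Again this is only a sketch under stated assumptions: `read_alterations` and `MAX_ALT_TYPES` are hypothetical names, and the script only mirrors the format described above rather than cd-CAP's own parser.

```python
from collections import defaultdict

MAX_ALT_TYPES = 64  # limit on distinct alteration types stated above


def read_alterations(lines):
    """Parse "SampleID Gene AltType" rows (whitespace-separated, no header).

    Returns a dict mapping each sample ID to a set of (gene, alt_type)
    pairs. Raises ValueError on malformed rows or when more than
    MAX_ALT_TYPES distinct alteration types appear in the third column.
    """
    by_sample = defaultdict(set)
    alt_types = set()
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:  # ignore blank lines
            continue
        if len(fields) != 3:
            raise ValueError(
                f"line {lineno}: expected 3 columns, got {len(fields)}"
            )
        sample_id, gene, alt_type = fields
        by_sample[sample_id].add((gene, alt_type))
        alt_types.add(alt_type)
    if len(alt_types) > MAX_ALT_TYPES:
        raise ValueError(
            f"{len(alt_types)} alteration types found; at most "
            f"{MAX_ALT_TYPES} are supported"
        )
    return dict(by_sample)


# Rows taken from the example above.
rows = [
    "T294    CCNL2   SNV",
    "T294    PTCHD2  SNV",
    "T294    COL16A1 SNV",
]
alterations = read_alterations(rows)
print(sorted(alterations))        # ['T294']
print(len(alterations["T294"]))   # 3
```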

-c :    This optional parameter is a gene-chromosome map, allowing for more information in the output.

    Gene    Chromosome      KaryotypeBand
    ADAM30  1       p12
    HAO2    1       p12
    HMGCS2  1       p12
    ...

-x :    This optional parameter represents a file containing a list of genes whose colors should be removed after reading the input.

    Gene1
    Gene2
    Gene3
    ...

-f :    This parameter is the name of the output folder in which all output files of the run are stored. The values of the parameters below are appended to this folder name.

-r :    This integer parameter controls the minimum required number of colors among the nodes of each resulting subnetwork, i.e. how “colorful” a subnetwork must be. Keep the value set to 1 for the default configuration.

-s :    This integer parameter controls the maximum subnetwork size. When running the program on a new dataset for the first time, 10 is a reasonable starting value.

-t :    This integer parameter controls the minimum required sample recurrence of each resulting subnetwork.

-k :    This integer parameter controls the number of subnetworks to detect.

-e :    This optional floating-point parameter controls the maximum allowed error rate when extending subnetworks before the optimization. If not specified, it defaults to 0.

-d :    This integer parameter specifies the number of threads used for the optimization.

-h :    This integer parameter specifies the number of seconds that the optimization step is allowed to take before returning a solution.

-p :    This optional flag takes no argument and switches mcsi to p-value simulation mode.

Example

./mcsc -n ../data/STRING10_HiConf_PPI.edges -l ../data/alteration_status_COAD_20171108.tsv -c ../data/string10_node_chromosome_map.tsv -r 1 -s 10 -t 138 -k 100 -e 0 -d 32 -h 36000 -f TCGA_COAD
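
When launching many runs from a script, the same command line can be assembled programmatically. The following is a minimal sketch, not a wrapper shipped with cd-CAP: `build_command` is a hypothetical helper, and it uses only the flags documented above, skipping optional ones that are set to None.

```python
import shlex


def build_command(binary, options):
    """Turn an ordered {flag: value} mapping into an argv list.

    Flags whose value is None are skipped, which is convenient for the
    optional parameters (-c, -x, -e).
    """
    argv = [binary]
    for flag, value in options.items():
        if value is None:
            continue
        argv += [flag, str(value)]
    return argv


# Same settings as the example command above.
argv = build_command("./mcsc", {
    "-n": "../data/STRING10_HiConf_PPI.edges",
    "-l": "../data/alteration_status_COAD_20171108.tsv",
    "-c": "../data/string10_node_chromosome_map.tsv",
    "-x": None,  # optional: no exclusion list in this run
    "-r": 1,
    "-s": 10,
    "-t": 138,
    "-k": 100,
    "-e": 0,
    "-d": 32,
    "-h": 36000,
    "-f": "TCGA_COAD",
})
print(shlex.join(argv))
```

The resulting list can be passed directly to `subprocess.run(argv, check=True)` once the binaries have been built.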