Code for CMF (contrast motif finder) can be found at http://www.stat.ucla.edu/~zhou/CMF/

CMF was created for Unix machines but should work on Linux and Windows systems too.
To compile CMF, ensure that the files cmf.cpp and cmf.h are in the same folder and type:
g++ cmf.cpp -o cmf  
 
After compiling, make sure cmf is executable: check with "ls -l cmf" and use "chmod" (for example, "chmod +x cmf") if necessary.

RUNNING CMF:
CMF is run from the command line with the following options provided (just type ./cmf to see these options):
w   the length of the motif seed, default = 7
m   number of mismatches in seed, default = 2
F   the FDR level for determining the LR cutoff for identifying TF binding sites, default = 0.667
t   the number of top seeds to test, default = 10
i1  first set of sequences, should be fasta formatted
i2  second set of sequences, should be fasta formatted
d   1: enrichment only in i1 (traditional scenario), 2: enrichment in both datasets (contrasting scenario), default = 1
l   lower bound on length of motifs, default = 5
u   upper bound on length of motifs, default = 20
o   output of seed statistics
f   folder for all other output
c   c/g content filter; if > 0, filter out seeds based on c/g content (e.g. -c 4 filters out seeds with more than 4 C's or G's), default = 0
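
The -c rule described above can be illustrated with a short Python sketch. This is not CMF's own code (CMF is written in C++); the function names cg_count and passes_cg_filter are hypothetical, and the sketch assumes only what the option description states: a value of 0 disables the filter, and otherwise seeds with more than c C's or G's are dropped.

```python
def cg_count(seed: str) -> int:
    """Count the C and G characters in a seed (case-insensitive)."""
    return sum(1 for base in seed.upper() if base in "CG")


def passes_cg_filter(seed: str, c: int) -> bool:
    """Return True if the seed would be kept under a -c style threshold.

    Assumption: c <= 0 disables the filter; otherwise a seed is dropped
    when it contains more than c C's or G's.
    """
    if c <= 0:
        return True
    return cg_count(seed) <= c


# Illustrative seeds of length 7 (made up for this example).
seeds = ["ATGCAAA", "GGGCGCC", "CCGCATA"]
kept = [s for s in seeds if passes_cg_filter(s, 4)]
```

With a threshold of 4, a seed containing exactly 4 C's or G's is kept, since the rule removes only seeds with more than 4.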

Input files must be in the FASTA format. Repeat regions should be masked with "N" if desired, since lower case nucleotides are converted to
upper case. If the input files differ greatly in c/g content, we recommend considering the -c option (this should be done with care since some
motifs are c/g rich, e.g. Klf4).
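
If the repeats in the input are marked only by lower case letters (a common soft-masking convention, assumed here and not something CMF requires), they can be replaced with "N" before running CMF. The Python sketch below is one way to do that; the function name hard_mask_fasta and the example records are hypothetical.

```python
def hard_mask_fasta(lines):
    """Yield FASTA lines with lower case nucleotides replaced by 'N'.

    Header lines (starting with '>') are passed through unchanged.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield line
        else:
            yield "".join("N" if ch.islower() else ch for ch in line)


# Made-up example record: the lower case stretches stand in for repeats.
example = [">seq1 example", "ACGTacgtACGT", "ttAA"]
masked = list(hard_mask_fasta(example))
```

To process a file, the same generator can be fed an open file handle and its output written line by line to a new file, which is then passed to -i1 or -i2.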

Below is an example of how to find the Oct4 motif in the traditional scenario (i.e. enriched in a bound set of sequences as compared to a control):
./cmf -w 7 -m 2 -d 1 -t 50 -i1 youngOct4Bound.txt -i2 youngOct4Control.txt -o seedsInfo.txt -f outputFolder/ 

To contrast two sets of bound sequences, simply change the -d option to 2 and use appropriate sequences (see below):
./cmf -w 7 -m 2 -d 2 -t 50 -i1 youngOCT4wSOX2.txt  -i2 youngOCT4woSOX2.txt  -o seedsInfo.txt -f outputFolder/ 

OUTPUT:
Output will be in the folder specified by option 'f' in a file called "output.txt".
For each seed successfully updated into a motif, the file contains:

The consensus motif
The initial seed
The m flexible positions in the seed
The likelihood threshold
The t-score
The enrichment (log2)
The PWM

Additionally, positive and negative motif sites are given. For each site, the
1) sequence name
2) location in sequence
3) likelihood ratio score
are given in a tab-delimited format. The next line gives the actual w-mer at that site.
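
The site records described above can be read with a few lines of Python. This is a minimal sketch under stated assumptions: it is given only the site lines (not the whole of output.txt), each record is exactly three tab-separated fields followed by one line holding the w-mer, and the likelihood ratio score parses as a number. The names Site and parse_sites and the sample lines are hypothetical and not taken from real CMF output.

```python
from typing import Iterable, List, NamedTuple


class Site(NamedTuple):
    name: str       # sequence name
    location: str   # location in sequence, kept as text (exact format assumed unknown)
    score: float    # likelihood ratio score
    wmer: str       # the w-mer found at the site


def parse_sites(lines: Iterable[str]) -> List[Site]:
    """Parse pairs of lines: 'name<TAB>location<TAB>score', then the w-mer."""
    sites = []
    it = iter(lines)
    for line in it:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # not a site record under the assumed layout
        try:
            score = float(fields[2])
        except ValueError:
            continue  # third field is not numeric, so skip the line
        wmer = next(it, "").strip()  # the following line holds the w-mer
        sites.append(Site(fields[0], fields[1], score, wmer))
    return sites


# Made-up lines that follow the assumed layout.
sample = ["seqA\t120\t8.5\n", "ATGCAAAT\n", "seqB\t42\t6.25\n", "TTGCATAT\n"]
sites = parse_sites(sample)
```

Lines that do not match the assumed three-field layout are skipped, so the parser may need adjusting once checked against an actual output file.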