Instructions for using the heterogeneity test

I am currently distributing only the original Unix version of this software, because the other implementations depended on old versions of Mac OS X.

The first step in running the program is to save the 'make_trees.txt' file you downloaded as 'make_trees.pl'.

To run the program, you need to supply four simple numbers:

1. Number of sampled sequences/chromosomes (designated "s")
2. Number of segregating sites in your putatively neutral class (synonymous mutations are the obvious choice, but you could use others, such as non-binding sites in promoters; designated "mut_1")
3. Number of segregating sites in your selected class (usually nonsynonymous mutations, but could be e.g. binding sites; designated "mut_2")
4. Difference in D values (designated "observed"). The Perl program will run with either Tajima's D or Fu and Li's D. You need to calculate that statistic separately for your neutral and selected classes; the difference is then (Dneutral - Dselected).
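
If you do not already have D values from another package, the "observed" number for the Tajima's D case can be sketched as below. This is only an illustration and not part of make_trees.pl: it uses the standard Tajima (1989) formula, and the function names, the "pi" argument (mean number of pairwise differences for a class) and all example numbers are assumptions made up for the sketch.

```python
from math import sqrt


def tajimas_d(n, seg_sites, pi):
    """Tajima's D for one class of sites.

    n         -- number of sampled sequences ("s" in these instructions)
    seg_sites -- number of segregating sites in the class
    pi        -- mean number of pairwise differences in the class (assumed input)
    """
    if n < 2 or seg_sites < 1:
        raise ValueError("need at least 2 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


def observed_difference(d_neutral, d_selected):
    """The "observed" input: D of the neutral class minus D of the selected class."""
    return d_neutral - d_selected


# Hypothetical pi values, for illustration only.
d_neutral = tajimas_d(n=10, seg_sites=10, pi=4.0)
d_selected = tajimas_d(n=10, seg_sites=5, pi=1.2)
print("observed =", round(observed_difference(d_neutral, d_selected), 4))
```

Whatever tool you use, compute both D values with the same statistic that you later pass to "-method".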

All of this was spelled out in the old Mac OS X version, but you need to know these tags to run the program at the command line.

A simple command line looks like this ("i" is the number of iterations you want it to run):

perl make_trees.pl  -s 10  -mut_1 10  -mut_2 5  -observed 1.2  -i 1000  -method tajimaD
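
If you want to run the test for many genes, it may help to assemble the command line in a script. The following is a minimal sketch that uses only the tags shown above; the helper name build_command and its argument names are hypothetical, and it assumes make_trees.pl is in the current directory.

```python
def build_command(s, mut_1, mut_2, observed, iterations,
                  method="tajimaD", script="make_trees.pl"):
    """Return the argument list for one run of the heterogeneity test."""
    if method not in ("tajimaD", "fuD"):
        raise ValueError("method must be 'tajimaD' or 'fuD'")
    return [
        "perl", script,
        "-s", str(s),
        "-mut_1", str(mut_1),
        "-mut_2", str(mut_2),
        "-observed", str(observed),
        "-i", str(iterations),
        "-method", method,
    ]


cmd = build_command(s=10, mut_1=10, mut_2=5, observed=1.2, iterations=1000)
print(" ".join(cmd))

# To actually run it (requires perl and the script to be present):
#   import subprocess
#   result = subprocess.run(cmd, capture_output=True, text=True)
#   print(result.stdout)
```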

The other method is called "fuD".  The output line can be a bit confusing at first.  The number you want is the "index".  This tells you where in the distribution your observed value falls: it is the position of the simulated value that your observed value is closest to (actually, just smaller than).  One output line from the above input looks like this:

index 941 value=1.2099 out of 1000 total (obs=1.2000)
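
If you are collecting results from many runs, the fields of that line can be pulled out with a regular expression. This is a rough sketch that assumes the output always has exactly the layout of the example line above; the pattern and the function name parse_output_line are my own and are not part of the program.

```python
import re

# Matches lines such as: index 941 value=1.2099 out of 1000 total (obs=1.2000)
OUTPUT_PATTERN = re.compile(
    r"index\s+(\d+)\s+value=(-?\d+(?:\.\d+)?)\s+out of\s+(\d+)\s+total\s+"
    r"\(obs=(-?\d+(?:\.\d+)?)\)"
)


def parse_output_line(line):
    """Return index, value, total and observed from one output line."""
    match = OUTPUT_PATTERN.search(line)
    if match is None:
        raise ValueError("line does not look like make_trees.pl output: %r" % line)
    return {
        "index": int(match.group(1)),
        "value": float(match.group(2)),
        "total": int(match.group(3)),
        "observed": float(match.group(4)),
    }


fields = parse_output_line("index 941 value=1.2099 out of 1000 total (obs=1.2000)")
print(fields["index"], fields["total"])
```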

So the p-value for a one-tailed test is [(total-index)/total] (= 0.059 here). Or, if you get a value at the other extreme (i.e. the reported "index" is close to zero), the p-value would be [index/total].
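
The arithmetic above can be written as a small helper. This is just a sketch of the two formulas; the function name and the "upper"/"lower" labels for the two tails are hypothetical.

```python
def one_tailed_p(index, total, tail="upper"):
    """One-tailed p-value from the reported index and total.

    tail="upper": observed is near the top of the distribution, p = (total - index) / total
    tail="lower": observed is near the bottom (index close to zero), p = index / total
    """
    if total <= 0 or not 0 <= index <= total:
        raise ValueError("need 0 <= index <= total and total > 0")
    if tail == "upper":
        return (total - index) / total
    if tail == "lower":
        return index / total
    raise ValueError("tail must be 'upper' or 'lower'")


# Numbers from the example output line above.
print(one_tailed_p(941, 1000))
```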
 

Make sure you read the paper and know what the program does before running it!