The following program estimates under-methylated regions in a user-supplied DNA sequence, as described in Straussman et al.'s paper (Algorithm-3). It slides a moving window (whose size can be specified by the user) along the sequence and assesses the likelihood that each window is under-methylated. As a final step, windows passing the threshold (as specified in Algorithm-3 in the manuscript) are joined to form contiguous under-methylated regions.
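The window-and-join procedure described above can be illustrated with a minimal Python sketch. It is not the program's actual code: the function names, the step size, the threshold value, and the placeholder CpG-fraction score are all assumptions made for illustration, whereas Algorithm-3 scores windows using motif hit frequencies.

```python
from typing import Callable, List, Tuple


def find_regions(
    sequence: str,
    score_window: Callable[[str], float],
    window_size: int = 200,
    step: int = 1,            # assumed step; the real program may differ
    threshold: float = 0.5,   # assumed cutoff, not the Algorithm-3 value
) -> List[Tuple[int, int]]:
    """Return merged (start, end) half-open intervals of passing windows."""
    # 1. Slide the window and keep those whose score reaches the threshold.
    passing = []
    for start in range(0, len(sequence) - window_size + 1, step):
        window = sequence[start:start + window_size]
        if score_window(window) >= threshold:
            passing.append((start, start + window_size))

    # 2. Join overlapping or adjacent windows into contiguous regions.
    regions: List[Tuple[int, int]] = []
    for start, end in passing:
        if regions and start <= regions[-1][1]:
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


def cpg_fraction(window: str) -> float:
    """Placeholder score (hypothetical): CG dinucleotides per possible position."""
    return window.upper().count("CG") / max(len(window) - 1, 1)


if __name__ == "__main__":
    toy = "AT" * 10 + "CG" * 10 + "AT" * 10
    print(find_regions(toy, cpg_fraction, window_size=10))  # [(20, 40)]
```

In this toy run only the windows fully inside the CG-rich stretch pass, and joining them recovers the stretch as a single region.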
Parameters:
fastaFileName The name of the DNA sequence file. The file should be in FASTA format. An example file can be found here.
windowSize The size of the moving window (default is 200, which was used in the paper).
motifFileName The name of the motif file. The file specifies the motif PSSMs used by the program to calculate the motif hit frequencies in the sequence. In addition, for each motif it lists the weight assigned to that motif in the simpleLogistic formula. To use Algorithm-3 as described in the manuscript, do not change the file content.
Other parameters and their defaults are explained inside the parameters file.
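As a rough illustration of how motif hit frequencies and per-motif weights might combine into a window score, the following Python sketch counts PSSM hits in a window and feeds the frequencies into a standard logistic function. Everything here is an assumption: the PSSM representation (one base-to-score dictionary per motif position), the hit threshold, the intercept, and the function names are hypothetical and do not reflect the actual motif file format or the weights used by the program.

```python
import math
from typing import Dict, List, Sequence


def pssm_hit_frequency(
    window: str,
    pssm: List[Dict[str, float]],  # assumed layout: one {base: score} per position
    hit_threshold: float,          # assumed cutoff for calling a motif hit
) -> float:
    """Fraction of window positions where the summed PSSM score is a hit."""
    positions = len(window) - len(pssm) + 1
    if positions <= 0:
        return 0.0
    hits = 0
    for i in range(positions):
        score = sum(col.get(window[i + j], 0.0) for j, col in enumerate(pssm))
        if score >= hit_threshold:
            hits += 1
    return hits / positions


def logistic_score(
    frequencies: Sequence[float],
    weights: Sequence[float],
    intercept: float = 0.0,        # hypothetical bias term
) -> float:
    """Standard logistic function of the weighted sum of hit frequencies."""
    z = intercept + sum(w * f for w, f in zip(weights, frequencies))
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    toy_pssm = [{"C": 1.0}, {"G": 1.0}]  # toy two-position motif, illustration only
    freq = pssm_hit_frequency("ACGT", toy_pssm, hit_threshold=2.0)
    print(freq, logistic_score([freq], weights=[3.0], intercept=-1.0))
```

A score of this kind could then be compared against a threshold for each window; the real formula and its weights are those listed in the motif file.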
Output:
After running the application, the following files will be generated by the software:
“fastaFileName”.dens The results of the analysis, written to a file with a ".dens" suffix. For each sequence in the input file, the regions predicted to be under-methylated are listed. An example output file, corresponding to the example input file given above, can be found here.
Log.txt A log file of the application. Used for debugging purposes.
When using this software please cite:
Straussman R, et al. “Developmental programming of CpG island methylation profiles in the human genome.” Nat Struct Mol Biol. 2009;16(5):564-71.
For questions and support please contact Israel Steinfeld (israels AT cs.technion.ac.il).