Finding approximate palindromes in genomic sequences.

by Tlalit Freund and  Noga Engel.

October 10, 2003, under guidance of Doron Lipson and Zohar Yakhini

 


Background

In this section we defined different types of palindromes and gave an example for each type.

We also found important biological roles, in which palindromes take place.

palindrome background -   

 


Problem

        Implementation of an algorithm for finding approximate palindromes in gnomic sequences. 

        Usage of the algorithm for purposes of creating palindrome "fingerprints".

   Develop methods for testing the significance of specific approximate palindromes.


Implementation

The program goes through the DNA sequence, searching for all palindromes in a specific length, a range of allowed mismatches and a range of allowed gaps.

The Algorithm -  

Running example  -    

We build a statistical model to calculate our results expectation on random (non gnomic) sequence

By using these statistical models we can evaluate the significance of the palindromes we have found.

Statistical models  -  


Results

In order to check if the algorithm we wrote is accurate, and works according to our demands, we compared our formula expectation, with the average result of a large bunch of random sequences.

Testing our statistical models -     

We compared our formulas expectation with our program results on different types of gnomic sequences and evaluate these results.

Genome from different Viruses -  

Genome from different Bcteriophages -  

Different organisms -  

Histones H4 family -  

 


Downloads

Palindrome application - Find pal's in a nucleotide sequence:   last_ver_pal.c

Expectation calculating - formula a:  palindrome.m (matlab file)

Expectation calculating - formula b: pal_acgt.m (matlab file)

Presentation: presentation.ppt  

 


                                

Run our program in the following link -  

 


References