Project in
Computational Biology
Clusters of Orthologous Genes
(COGs)
Technion, Israel, Spring 2001
or, as we call it, "Suslik"
Prepared by:
                    Finkel Evgeny - maestro@cs.technion.ac.il
                    Pozniansky Eli - kollega@cs.technion.ac.il
Under supervision of:
                    Prof. Benny Chor - benny@cs.technion.ac.il
And assistance of:
                    Esti Yager Lotem - estiy@cs.technion.ac.il

All the information was taken from:
http://www.ncbi.nlm.nih.gov/COG
The data was downloaded from:
ftp://ncbi.nlm.nih.gov/pub/COG
The description of the projects appears at:
http://www.cs.technion.ac.il/Labs/cbl/courses/236503/projects.html

Description of the Project:
* Genomic comparison has become a common tool for drug design. A drug developed against a particular bacterium should act specifically against that bacterium without harming the host. It should also not harm other, beneficial bacteria. To this end, a tool is needed that can find genes that are common to one set of genomes (X) but do not appear in another set of genomes (Y).
The project's first part is to build a tool that accepts two lists of genomes, and searches for genes that are common to genomes on the first list but are missing from genomes on the second list.
* A related question is how to define genomic profiles for groups of genes. For each gene a vector is built whose length is the number of known genomes. Entry i in the vector answers the question: "Does the gene appear in genome i?". The vectors of different genes can then be compared.
The objective of the second part of the project is to cluster genes according to their vectors, so that genes with identical vectors reside in the same cluster. The biological motivation for clustering is the assumption that genes with identical vectors probably have related functions.
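
Both parts of the project can be illustrated with a minimal Python sketch. It is not the project's actual implementation: the gene identifiers, the genome codes, and the function names (select_genes, profile, cluster_by_profile) are hypothetical and chosen only for illustration. The sketch assumes the data has already been reduced to a mapping from each gene to the set of genomes in which it appears.

```python
from collections import defaultdict

# Hypothetical toy data: gene id -> set of genome codes in which it appears.
# These ids and codes are made up; they are not taken from the COG database.
PRESENCE = {
    "COG_A": {"g1", "g2", "g3"},
    "COG_B": {"g1", "g2"},
    "COG_C": {"g1", "g2"},
    "COG_D": {"g3"},
}
GENOMES = ["g1", "g2", "g3"]  # fixed genome order used for the vectors


def select_genes(presence, include, exclude):
    """Part 1: genes found in every genome of `include` and in none of `exclude`."""
    include, exclude = set(include), set(exclude)
    return sorted(
        gene
        for gene, genomes in presence.items()
        if include <= genomes and not (exclude & genomes)
    )


def profile(presence, gene, genomes):
    """Build the 0/1 vector: entry i is 1 if the gene appears in genomes[i]."""
    return tuple(1 if g in presence[gene] else 0 for g in genomes)


def cluster_by_profile(presence, genomes):
    """Part 2: group genes whose presence vectors are identical."""
    clusters = defaultdict(list)
    for gene in sorted(presence):
        clusters[profile(presence, gene, genomes)].append(gene)
    return dict(clusters)


if __name__ == "__main__":
    # Genes common to g1 and g2 but missing from g3.
    print(select_genes(PRESENCE, include=["g1", "g2"], exclude=["g3"]))
    # One line per cluster: the shared vector and the genes that have it.
    for vector, genes in cluster_by_profile(PRESENCE, GENOMES).items():
        print(vector, genes)
```

Using the vector itself as a dictionary key makes the grouping exact: two genes land in the same cluster only when every entry of their vectors matches, which is the criterion described above.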

Notes:
1) A lot of useful information is accessible through tooltips.
2) The letters that appear in square brackets next to genes (e.g. '[M]') represent the genes' biological functions, as specified in the legend (available through the 'Legend' button).
3) For more details, please read the full final report of the project.

The following items are available for download: