# Comparison of Metabolic Pathways

## by Silberklang Dana and Karp Amit.

### Problem

Our goal is to compare a metabolic pathway of a certain organism against the same metabolic pathways in Each enzyme is characterized by the reaction it catalyzes, or compare a metabolic pathway against other metabolic pathways in the same organism.

We represent a metabolic pathway by a graph whose nodes correspond to enzymes that catalyze the pathway's reactions, and the edges connect two nodes if for the corresponding enzymes the product of one serves as the substrate of the other. Thus, the problem is: Given a pattern tree P and a Text Tree T, find a subtree of T which is isomorphic to P or decide that there is no such tree.

When looking for the best match, we take into account both label similarity and topology. We are permited to delete only degree 2 vertexes from the text tree. And we  are NOT permited to delete vertexes from the pattern tree.

A previous algorithm to solve the problem ONLY for tree-like graphs already exists today.

### Implementation

We used the Modified decision Tree algorithm in order to solve the given problem.

The idea is to perform a decision tree search over the space {u,v½uÎV1; vÎV2 È {f} }, when Each node in the search tree represents a vertex in V1 that has either been matched with a vertex in V2 or with f (deletion). Each pathway from root to leaves in the search tree, represents a match of all vertices in V1 to vertices in V2 (or f).

The search tree advances from root to leaves, trying to find a new G1, G2 match: A vertex (V1,V2) is added to the tree if it V1 can be matched to V2 , when all pathway matches have already been made. In order to allow deletions, we store all simple paths of length up to a predefined length in a preprocessing procedure.

A cost is assigned to the matching between every two vertices and the cost of a partial matching consists of the sum of the costs of the matched vertices minus the deletion penalty. The tree search is expanded only to states which have the potential to overcome the score of the final states already found in the search tree.

The Decision Tree advances in a DFS style, causing it to store only one tree path (only one match between the pattern and tree graphs) at each step. Thus the space complexity of the algorithm is O(m), when 'm' is the number of nodes in a graph.

Example of a decision tree:

### Results

When running the program on a variety of Pattern and Text graphs, which have from 1 to 20 nodes, some containing inner circles and some don't, we found out the actual time results are fast (faster than the existing algorithm) , although the time complexity of the algorithm is high (exponential). The reason for this is that most of the Decision tree paths are not executed at all, causing the actual tree to be sparse.

We tested the Decision tree algorithm against the existing algorithm on many tree-like graphs and found the results to be the same (except some minor bugs found in the existing algorithm).

The final test was done by taking 80 E.Coli metabolic pathways (some of them include inner circles) and trying to match them against each other. The 6400 results include for each Pattern and Text graph the size of the graphs, the algorithm running time, and the 5 best matches found for them.

The full results file can be found here.

Here is a selection of the results, which have the worst timings:

 Text File: Pattern File: Text Nodes Num Pattern Nodes Num Time [sec] score1 score2 score3 score4 score5 yeast64.grp yeast57.grp 20 14 0.344 yeast64.grp yeast65.grp 20 17 0.344 yeast64.grp yeast64.grp 20 20 0.297 0 yeast64.grp yeast17.grp 20 15 0.281 yeast64.grp yeast79.grp 20 10 0.25 -4.143 0 -6.917 -6.917 -6.917 yeast65.grp yeast79.grp 17 10 0.25 -63.017 -61.779 -63.779 -63.017 -65.017 yeast64.grp yeast77.grp 20 9 0.219 yeast64.grp yeast31.grp 20 10 0.203 -74.517 -74.116 -74.517 -74.116 -74.517 yeast64.grp yeast80.grp 20 7 0.203 yeast65.grp yeast31.grp 17 10 0.203 -75.66 -75.66 -75.66 -75.66 -75.66 yeast65.grp yeast65.grp 17 17 0.187 0 yeast57.grp yeast79.grp 14 10 0.172 yeast64.grp yeast9.grp 20 9 0.172 -65.714 -65.714 -65.714 -65.714 -66.952 yeast64.grp yeast56.grp 20 6 0.172 yeast65.grp yeast17.grp 17 15 0.172 yeast65.grp yeast64.grp 17 20 0.172 yeast65.grp yeast80.grp 17 7 0.172 yeast65.grp yeast77.grp 17 9 0.171 yeast65.grp yeast2.grp 17 9 0.141 -62.1 -62.1 -62.1 -62.1 -62.1 yeast65.grp yeast56.grp 17 6 0.141 -46.11 -46.11 -46.11 -46.11 -46.11 yeast17.grp yeast79.grp 15 10 0.14 yeast64.grp yeast1.grp 20 9 0.14 -24.908 -67.613 -30.417 -29.176 -29.176 yeast17.grp yeast65.grp 15 17 0.125 yeast42.grp yeast52.grp 6 3 0.125 -24.652 -24.652 -24.652 -24.652 -24.652