Haplotyping By MPE Query ## by Sofia Grinberg and Adi Mano.

### Implementation

The implementation is based on SuperLink's implementation of the SuperMLink query. One of the main differences from the SuperMLink query is that maximization is done where summation had place before. Meaning, that in the elimination variable process, we select the best values for the variable under the constrains of other variables in its function. These values are collected during the variable elimination and this is the "Backwards" stage of the algorithm presented before. In the next stage we analyze the collected information and this is the "Forward" stage.

Our program can also retrieve K different solutions to the Haplotyping problem. That is K solutions (if exist) with the same maximum probability (maximum LOD score). In order to add this ability to the project (without making major changes) we create an array in size of K that holds for each K the MPE_arg obtained in the backward phase for this Ks solution. The extension of the solution is done upon variable conditioning (variable breaking), where different sub trees are created for each value of the break node. If the maximum score can be obtained by more then one sub tree then an extension of the solution can be performed.

We also allow to calculate K results within a predefined interval from the highest score that can be obtained by the query. For example if the highest score is -1031.35 then assignments with the score of -1033.55 can be also obtained (for some given epsilon).

When ever a new solution is found we calculate a delta between the new solution and the best one found so far. If this delta is within the boundaries of the defined interval the solutions is saved. For the solutions with lower probabilities (could be the old or new solutions) the delta is kept. At the forward phase the solution are bubble sorted in order to be printed in order of their probability.

### Results

We tested our project in several ways. At first we checked the correctness of the flow on a small pedigree. That way we could compute and compare the exact calculation during the all process. In the second stage, we tested our program on 6 sets of different pedigrees examples, which appear on the SuperLink's web.

The second stage of testing consists of two groups: Comparison of the Haplotyping query vs. SuperMLink program on the performance issue. Comparison of the our Haplotyping query vs. Ma'ayan's version. In this group of tests we compared the MPE probability and  performance.

We also tested our program on the time consumption for different parameters, like the numbers of solution required (K) or the size of the interval that allowed to hold the solution (epsilon).

The Results Summary Full report (.doc) Source code (.exe) 