Time comparison between the GenHunter
algorithm and Sparse tree algorithm
· Running Environment : Windows XP pro, P3 1200, 1 GB RAM
File name |
Number of loci |
Number of people |
Number of Founders |
Number of |
percentages of genotyped |
runtime of sparse tree algorithm (STA) in milliseconds |
2^f |
Runtime prediction of STA with founders reduction In milliseconds (STA / 2^f) |
Runtime GH In milliseconds |
10 |
6 |
2 |
4 |
50 |
0 |
4 |
0 |
0 |
|
10 |
8 |
4 |
4 |
50 |
0 |
16 |
0 |
0 |
|
10 |
12 |
6 |
6 |
50 |
0 |
64 |
0 |
0 |
|
10 |
15 |
4 |
11 |
50 |
100 |
16 |
6.25 |
20 |
|
10 |
6 |
2 |
4 |
80 |
0 |
4 |
0 |
0 |
|
10 |
8 |
3 |
5 |
80 |
10 |
8 |
1.25 |
0 |
|
10 |
12 |
4 |
8 |
80 |
30 |
16 |
1.875 |
10 |
|
10 |
15 |
5 |
10 |
80 |
61 |
32 |
1.90625 |
10 |
|
50 |
6 |
2 |
4 |
50 |
70 |
4 |
17.5 |
20 |
|
50 |
8 |
4 |
4 |
50 |
20 |
16 |
1.25 |
10 |
|
50 |
12 |
4 |
8 |
50 |
180 |
16 |
11.25 |
20 |
|
50 |
15 |
5 |
10 |
50 |
1442 |
32 |
45.0625 |
80 |
|
50 |
6 |
3 |
3 |
80 |
10 |
8 |
1.25 |
0 |
|
50 |
8 |
3 |
5 |
80 |
20 |
8 |
2.5 |
10 |
|
50 |
12 |
4 |
8 |
80 |
80 |
16 |
5 |
20 |
|
50 |
15 |
4 |
11 |
80 |
1893 |
16 |
118.3125 |
210 |
|
100 |
6 |
3 |
3 |
50 |
0 |
8 |
0 |
0 |
|
100 |
8 |
3 |
5 |
50 |
0 |
8 |
0 |
10 |
|
100 |
12 |
5 |
7 |
50 |
931 |
32 |
29.09375 |
60 |
|
100 |
15 |
5 |
10 |
50 |
1452 |
32 |
45.375 |
210 |
|
100 |
6 |
3 |
3 |
80 |
10 |
8 |
1.25 |
10 |
|
100 |
8 |
3 |
5 |
80 |
30 |
8 |
3.75 |
20 |
|
100 |
12 |
4 |
8 |
80 |
481 |
16 |
30.0625 |
150 |
|
100 |
15 |
5 |
10 |
80 |
12368 |
32 |
386.5 |
1842 |
|
200 |
6 |
2 |
4 |
50 |
20 |
4 |
5 |
0 |
|
200 |
8 |
3 |
5 |
50 |
190 |
8 |
23.75 |
20 |
|
200 |
12 |
4 |
8 |
50 |
440 |
16 |
27.5 |
60 |
|
200 |
15 |
5 |
10 |
50 |
3375 |
32 |
105.46875 |
360 |
|
200 |
6 |
2 |
4 |
80 |
20 |
4 |
5 |
10 |
|
200 |
8 |
3 |
5 |
80 |
250 |
8 |
31.25 |
70 |
|
200 |
12 |
3 |
9 |
80 |
7662 |
8 |
957.75 |
1732 |
|
200 |
15 |
5 |
10 |
80 |
9124 |
32 |
285.125 |
631 |
|
50 |
15 |
5 |
10 |
According the template |
139411 |
32 |
4356.59 |
4827 |
|
200 |
15 |
5 |
9 |
10 |
180830 |
32 |
5650 |
4206 |
|
200 |
15 |
5 |
9 |
20 |
88096 |
32 |
2753 |
3615 |
Conclusions:
Ø The runtime of the sparse tree algorithm
(STA), is longer than the runtime of the GH algorithm, however when predicting
the runtime of STA with founders reduction, the runtime of STA is usually much
shorter than the runtime of GH.
The
pedigree considered in the following table correspond to the data appears in
pedfile32, datafile32 in the previous table (last line) |
|||||||
Datafile name |
Pedfile name |
Total number of homozygous positions over all loci and persons |
runtime of sparse tree algorithm (STA) in milliseconds |
2^f |
Runtime prediction of STA with founders reduction In milliseconds (STA / 2^f) |
Runtime GH In milliseconds |
Runtime ratio |
0 |
125840 |
32 |
3932.5 |
16634 |
4.229879 |
||
100 |
91591 |
32 |
2862.21 |
12608 |
4.404988 |
||
200 |
71273 |
32 |
2227.28 |
11066 |
4.968392 |
||
300 |
56492 |
32 |
1765.375 |
7481 |
4.237627 |
||
400 |
46627 |
32 |
1457.09 |
7050 |
4.838411 |
||
500 |
26578 |
32 |
830.56 |
5208 |
6.270468 |
||
600 |
23603 |
32 |
737.59 |
4586 |
6.217546 |
Runtime comparison between GH and
Sparse tree algorithms
Conclusions:
Ø
The more homozygous positions over
all markers the faster the algorithms run (GH and STA). This occurs due to
symmetry nodes
(as explained in section 7-logic-i).
Ø
The ratio between the runtime of
STA and GH seems to increase with the increase of the homozygous marker
positions, due to the tree sparseness.
The
pedigree considered in the following table correspond 15 people, 5 founders,
100 loci, 80% genotyped. |
|||||||
Datafile name |
Pedfile name |
Number of typed founders |
runtime of sparse tree algorithm (STA) in milliseconds |
2^f |
Runtime prediction of STA with founders reduction In milliseconds (STA / 2^f) |
Runtime GH In milliseconds |
Runtime ratio |
0 |
105232 |
32 |
3288.5 |
5368 |
1.632355 |
||
1 |
51244 |
32 |
1601.375 |
3996 |
2.495356 |
||
2 |
26428 |
32 |
825.875 |
2984 |
3.613138 |
||
3 |
10826 |
32 |
338.3125 |
1682 |
4.971735 |
||
4 |
12308 |
32 |
384.625 |
1802 |
4.685083 |
||
5 |
13289 |
32 |
415.2813 |
1009 |
2.429679 |
Conclusions:
Ø
The more genotyped founders the
faster the algorithms run (GH and STA).
This occurs due to restrictions imposed on the founder’s alleles
assignment, which increases the sparseness of the tree.
The
pedigree considered in the following table correspond 10 people, 4 founders,
200 loci |
|
|||||||
Datafile name |
Pedfile name |
Percentages of genotyped people per generation. |
runtime of sparse tree algorithm (STA) in milliseconds |
2^f |
Runtime prediction of STA with founders reduction In milliseconds (STA / 2^f) |
Runtime GH In milliseconds |
Runtime ratio |
|
0 |
10 |
16 |
0.625 |
10 |
16 |
|||
10 |
90 |
16 |
5.625 |
10 |
1.777778 |
|||
20 |
20 |
16 |
1.25 |
10 |
8 |
|||
30 |
510 |
16 |
31.875 |
40 |
1.254902 |
|||
40 |
190 |
16 |
11.875 |
30 |
2.526316 |
|||
50 |
550 |
16 |
34.375 |
80 |
2.327273 |
|||
60 |
300 |
16 |
18.75 |
70 |
3.733333 |
|||
70 |
811 |
16 |
50.6875 |
160 |
3.156597 |
|||
80 |
470 |
16 |
29.375 |
60 |
2.042553 |
|||
90 |
280 |
16 |
17.5 |
80 |
4.571429 |
|||
100 |
270 |
16 |
16.875 |
70 |
4.148148 |
|||
Conclusions:
Ø
The runtime seems to be shorter at
the high end and low end of the graph, i.e. when the percentage of typed people
is high or low,
this can be explained by the following reasons:
1. High percentages of genotyped people impose many restrictions on the founder’s alleles assignments, i.e. many impossible gene flows and many symmetry nodes, these shorten the calculations periods. (As explained in section 7-logic-i,ii).
2.
Low percentages genotyped people means
low number on calculations, when searching the tree non-genotyped person means no
contribution to the marker probability calculations, i.e. skipping to the next
node without calculations.
The
pedigree considered in the following table correspond 10 people, 4 founders,
80% genotyped |
|
|||||||
Datafile name |
Pedfile name |
Number of loci |
runtime of sparse tree algorithm (STA) in milliseconds |
2^f |
Runtime prediction of STA with founders reduction In milliseconds (STA / 2^f) |
Runtime GH In milliseconds |
Runtime ratio |
|
50 |
490 |
16 |
30.625 |
50 |
1.632653 |
|||
100 |
641 |
16 |
40.0625 |
100 |
2.4961 |
|||
150 |
911 |
16 |
56.9375 |
140 |
2.458836 |
|||
200 |
961 |
16 |
60.0625 |
180 |
2.996878 |
|||
250 |
1302 |
16 |
81.375 |
230 |
2.826421 |
|||
300 |
1422 |
16 |
88.875 |
290 |
3.26301 |
|||
350 |
1762 |
16 |
110.125 |
320 |
2.905789 |
|||
400 |
1872 |
16 |
117 |
370 |
3.162393 |
|||
Conclusions:
Ø
The run time of both algorithms is
linear with the number of loci.
1.
For GH the ratio is about: 0.9
2.
For SPA (predicted with founders
reduction) the ratio is about: 0.25.
Ø
The runtime ratio between GH and
SPA (considering founders reduction) tend to grow with the number of loci
growth.