Time comparison between the GenHunter algorithm and Sparse tree algorithm

 

         Running Environment :    Windows XP pro, P3 1200,1 GB RAM

 

 

File name

Number of loci

 

 

 

Number of people

 

 

 

Number of Founders

 

 

Number of
Non-founders

 

percentages of genotyped

 

 

runtime of sparse tree algorithm (STA)

in milliseconds

 

2^f

 

 

 

 

Runtime prediction of STA with founders reduction

In milliseconds

(STA / 2^f)

Runtime GH

In milliseconds

 

pedfile1 , datafile1

10

6

2

4

50

0

4

0

0

pedfile2 , datafile2

10

8

4

4

50

0

16

0

0

pedfile3 , datafile3

10

12

6

6

50

0

64

0

0

pedfile4 , datafile4

10

15

4

11

50

100

16

6.25

20

pedfile5 , datafile5

10

6

2

4

80

0

4

0

0

pedfile6 , datafile6

10

8

3

5

80

10

8

1.25

0

pedfile7 , datafile7

10

12

4

8

80

30

16

1.875

10

pedfile8 , datafile8

10

15

5

10

80

61

32

1.90625

10

pedfile9 , datafile9

50

6

2

4

50

70

4

17.5

20

pedfile10 , datafile10

50

8

4

4

50

20

16

1.25

10

pedfile11 , datafile11

50

12

4

8

50

180

16

11.25

20

pedfile12 , datafile12

50

15

5

10

50

1442

32

45.0625

80

pedfile13 , datafile13

50

6

3

3

80

10

8

1.25

0

pedfile14 , datafile14

50

8

3

5

80

20

8

2.5

10

pedfile15 , datafile15

50

12

4

8

80

80

16

5

20

pedfile16 , datafile16

50

15

4

11

80

1893

16

118.3125

210

pedfile17 , datafile17

100

6

3

3

50

0

8

0

0

pedfile18 , datafile18

100

8

3

5

50

0

8

0

10

pedfile19 , datafile19

100

12

5

7

50

931

32

29.09375

60

pedfile20 , datafile20

100

15

5

10

50

1452

32

45.375

210

pedfile21 , datafile21

100

6

3

3

80

10

8

1.25

10

pedfile22 , datafile22

100

8

3

5

80

30

8

3.75

20

pedfile23 , datafile23

100

12

4

8

80

481

16

30.0625

150

pedfile24 , datafile24

100

15

5

10

80

12368

32

386.5

1842

pedfile25 , datafile25

200

6

2

4

50

20

4

5

0

pedfile26 , datafile26

200

8

3

5

50

190

8

23.75

20

pedfile27 , datafile27

200

12

4

8

50

440

16

27.5

60

pedfile28 , datafile28

200

15

5

10

50

3375

32

105.46875

360

pedfile29 , datafile29

200

6

2

4

80

20

4

5

10

pedfile30 , datafile30

200

8

3

5

80

250

8

31.25

70

pedfile31 , datafile31

200

12

3

9

80

7662

8

957.75

1732

pedfile32 , datafile32

200

15

5

10

80

9124

32

285.125

631

pedfileGB0, datafileGB0

50

15

5

10

According the template

139411

32

4356.59

4827

pedfileGB010, datafileGB

200

15

5

9

10

180830

32

5650

4206

pedfileGB020, datafileGB

200

15

5

9

20

88096

32

2753

3615

 

Conclusions:

Ø      The runtime of the sparse tree algorithm (STA), is longer than the runtime of the GH algorithm, however when predicting the runtime of STA with founders reduction, the runtime of STA is usually much shorter than the runtime of GH.

 

back to main

 

 

The pedigree considered in the following table correspond to the data appears in pedfile32, datafile32 in the previous table (last line)

Datafile name

 

 

 

 

Pedfile name

 

 

 

 

Total number of homozygous positions over all loci and persons

 

 

runtime of sparse tree algorithm (STA)

in milliseconds

 

2^f

 

 

 

Runtime prediction of STA with founders reduction

In milliseconds

(STA / 2^f)

Runtime GH

In milliseconds

 

Runtime ratio

 

datafile32

pedfile32_0

0

125840

32

3932.5

16634

4.229879

datafile32

pedfile32_1

100

91591

32

2862.21

12608

4.404988

datafile32

pedfile32_2

200

71273

32

2227.28

11066

4.968392

datafile32

pedfile32_3

300

56492

32

1765.375

7481

4.237627

datafile32

pedfile32_4

400

46627

32

1457.09

7050

4.838411

datafile32

pedfile32_5

500

26578

32

830.56

5208

6.270468

datafile32

pedfile32_6

600

23603

32

737.59

4586

6.217546

 

 

 

 

 

 

Runtime comparison between GH and
Sparse tree algorithms

 
Text Box: Run time in milliseconds

 

Conclusions:

Ø      The more homozygous positions over all markers the faster the algorithms run (GH and STA). This occurs due to symmetry nodes
(as explained in section 7-logic-i).

Ø      The ratio between the runtime of STA and GH seems to increase with the increase of the homozygous marker positions, due to the tree sparseness.

 

back to main

 

The pedigree considered in the following table correspond 15 people, 5 founders, 100 loci, 80% genotyped.

Datafile name

 

 

 

 

Pedfile name

 

 

 

 

Number of typed founders

 

runtime of sparse tree algorithm (STA)

in milliseconds

 

2^f

 

 

 

Runtime prediction of STA with founders reduction

In milliseconds

(STA / 2^f)

Runtime GH

In milliseconds

 

Runtime ratio

 

Datafile24

pedfile24_0

0

105232

32

3288.5

5368

1.632355

Datafile24

pedfile24_1

1

51244

32

1601.375

3996

2.495356

Datafile24

pedfile24_2

2

26428

32

825.875

2984

3.613138

Datafile24

pedfile24_3

3

10826

32

338.3125

1682

4.971735

Datafile24

pedfile24_4

4

12308

32

384.625

1802

4.685083

Datafile24

pedfile24_5

5

13289

32

415.2813

1009

2.429679

 

Text Box: Run time in milliseconds

 

Conclusions:

Ø      The more genotyped founders the faster the algorithms run (GH and STA).
This occurs due to restrictions imposed on the founderís alleles assignment, which increases the sparseness of the tree.

back to main

 

The pedigree considered in the following table correspond 10 people, 4 founders, 200 loci

 

Datafile name

 

 

 

 

Pedfile name

 

 

 

 

Percentages of genotyped people per generation.

 

 

 

 

runtime of sparse tree algorithm (STA)

in milliseconds

 

2^f

 

 

 

Runtime prediction of STA with founders reduction

In milliseconds

(STA / 2^f)

Runtime GH

In milliseconds

 

 

 

Runtime ratio

 

 

 

DatafileP0

pedfileP0

0

10

16

0.625

10

16

DatafileP10

pedfileP10

10

90

16

5.625

10

1.777778

DatafileP20

pedfileP20

20

20

16

1.25

10

8

DatafileP30

pedfileP30

30

510

16

31.875

40

1.254902

DatafileP40

pedfileP40

40

190

16

11.875

30

2.526316

DatafileP50

pedfileP50

50

550

16

34.375

80

2.327273

DatafileP60

pedfileP60

60

300

16

18.75

70

3.733333

DatafileP70

pedfileP70

70

811

16

50.6875

160

3.156597

DatafileP80

pedfileP80

80

470

16

29.375

60

2.042553

DatafileP90

pedfileP90

90

280

16

17.5

80

4.571429

DatafileP100

pedfileP100

100

270

16

16.875

70

4.148148

 

Text Box: Run time in milliseconds

 

Conclusions:

Ø      The runtime seems to be shorter at the high end and low end of the graph, i.e. when the percentage of typed people is high or low,
this can be explained by the following reasons:

1.      High percentages of genotyped people impose many restrictions on the founderís alleles assignments, i.e. many impossible gene flows and many symmetry nodes, these shorten the calculations periods. (As explained in section 7-logic-i,ii).

2.      Low percentages genotyped people means low number on calculations, when searching the tree non-genotyped person means no contribution to the marker probability calculations, i.e. skipping to the next node without calculations.


back to main

 

The pedigree considered in the following table correspond 10 people, 4 founders, 80% genotyped

 

Datafile name

 

 

 

 

Pedfile name

 

 

 

 

Number of loci

 

 

 

runtime of sparse tree algorithm (STA)

in milliseconds

 

2^f

 

 

 

Runtime prediction of STA with founders reduction

In milliseconds

(STA / 2^f)

Runtime GH

In milliseconds

 

 

 

Runtime ratio

 

 

 

DatafileL50

pedfileL50

50

490

16

30.625

50

1.632653

DatafileL100

pedfileL100

100

641

16

40.0625

100

2.4961

DatafileL150

pedfileL150

150

911

16

56.9375

140

2.458836

DatafileL200

pedfileL200

200

961

16

60.0625

180

2.996878

DatafileL250

pedfileL250

250

1302

16

81.375

230

2.826421

DatafileL300

pedfileL300

300

1422

16

88.875

290

3.26301

DatafileL350

pedfileL350

350

1762

16

110.125

320

2.905789

DatafileL400

pedfileL400

400

1872

16

117

370

3.162393

 

Text Box: Run time in milliseconds

 

Conclusions:

Ø      The run time of both algorithms is linear with the number of loci.

1.      For GH the ratio is about: 0.9

2.      For SPA (predicted with founders reduction) the ratio is about: 0.25.

Ø      The runtime ratio between GH and SPA (considering founders reduction) tend to grow with the number of loci growth.

back to main