LAB IN BIOINFORMATICS

Experiments on Eight Known Benchmarks

under the guidance of  Dan Geiger

by Nickolay Dovgolevsky


Purpose

Comparing four algorithms for minimizing of the sum of table sizes produced during elimination on eight known benchmarks produced for Bayesian network inference and made available by the Research Unit of Decision Support Systems Aalborg University.

Summary of Results

A conclusion that one can draw from this experiment is that running a few stochastic greedy algorithms seems superior over the algorithm encoded in Hugin 6.1 (Link to the document with Hugin's algorithm) producing better cost in 7 out of the 8 cases within minutes. A comparison by running Hugin's algorithm on the same computer still needs to be performed. Also applying some reduction rules, a practice which was found useful in other cases, has not yet been tried for these eight datasets. Another conclusion is that the greedy algorithm performs as well as simulated annealing. 
* The greedy algorithms have not been written in the most efficient way so far.

Detailed Results

The yellow entries show instances were the greedy algorithm performs as well as or better than simulated annealing. As can be seen, this is the case in three instances out of the eight.
Each entry consists of two numbers. The first number is the cost found, and the second number is either the time required (in the first table -- where the number of iterations is fixed) or the number of iterations performed (in the second table -- where the running time is fixed).

Hugin format Weighted graph Best cost known Hugin cost
100 iterations 1,000 iterations 10,000 iterations
SG WSG Min-Fill WMin-Fill SG WSG Min-Fill WMin-Fill SG WSG Min-Fill WMin-Fill
Barley graph1 17,140,796 17,338,949  17,296,689 0   19,480,525 1 17,287,968   2  18,223,019   2   17,295,866   3           -           5 17,140,941  19  18,025,379 22  17,277,506   29 18,780,501  41 17,140,796    4:17 17,338,949 3:27
Diabetes graph2 9,825,960 10,415,407 43,067,549  3 10,273,507 15 15,586,343  35 12,378,096  39   26,541,895  23 10,131,607 2:37 13,756,430 5:36 12,113,520 6:33 25,288,104  1:47       -        21:09 13,739,124 1:02:31 12,040,920 1:06:50
Link graph3 24,008,666 26,179,418

 29,744,834 4

43,069,002  47

27,607,466  2:13 37,835,466 2:26  27,186,466   46 41,495,506  7:55 26,560,002  23:25 31,361,266 27:37 26,922,738  9:33 39,735,842 1:35:47 26,556,426 3:51:32 29,395,138 4:07:07
Munin1 graph4 86,901,006 188,387,156 183,527,046  1 183,609,451  4 154,288,068  11 138,739,862 11 183,483,166  10        -           36 87,550,167 1:44         -           1:53 183,445,293  1:35 163,453,271  11:09       -        16:59 134,302,362 17:09
Munin2 graph5 2,049,942 2,762,938  4,196,908   7 4,053,914   1:36 3,527,940  3:06 4,355,326 3:00  4,143,567    1:04  3,787,029  16:13        -         30:29 4,354,486 30:06 4,119,658    10:43  3,747,873    2:56:91  3,005,394 4:56:37 4,354,470 4:51:09
Munin3 graph6 3,079,164 3,241,805  3,263,306    6 3,281,411 1:44 3,260,732 3:43 3,112,308 3:48  3,259,877    1:07  3,268,047 17:42 3,202,545 36:27 3,105,680 39:06         -          11:04     -       3:09:38 3,167,961 6:11:00    -    6:08:52
Munin4 graph7 9,837,096 16,416,346 19,858,510  7 17,188,621 1:44 20,005,393 3:43 13,719,391 3:46 15,057,970 1:07  15,840,033  20:57 20,005,393  36:37 13,434,815 36:42        -          11:31  14,406,211  3:14:13 18,525,118 6:02:33 13,156,642 6:02:15
Water graph8 3,028,305 8,035,356   8,035,356  0   3,662,688  0  3,028,305    1    3,465,948   2            -            2  3,617,535  2         -          11          -          16            -            21 3,465,948    25         -        2:45       -       2:32

 

 

 

 

 

 

 

 

 

 

 

 

Hugin format Weighted graph Best cost known Hugin cost
1 minute 10 minutes 100 minutes
SG WSG Min-Fill WMin-Fill SG WSG Min-Fill WMin-Fill SG WSG Min-Fill WMin-Fill
Barley graph1 17,140,796 17,338,949 17,247,924  21,348 25,948,259 16,531 17,248,033 3,040 17,338,949 2,857 17,247,266 234,285 24,960,059 166,017 17,140,796 29,988 17,247,779 28,761 17,207,604 2,257,980 18,433,619 1,654,778    -     300,482 17,141,309 288,013
Diabetes graph2 9,825,960 10,415,407 66,879,032 1,848 10,628,257  401 16,320,656 87  12,711,810 153 44,955,417 19,365 10,557,307 4,123 14,126,083 854 12,151,356 1,519 35,295,468 197,601       -       42,084 13,597,126 8,649 11,730,558 15,276
Link graph3 24,008,666 26,179,418 30,373,938 1,211 79,930,890 129 28,892,994 46 37,835,450 41 26,903,310 11,875 79,830,178 1,304          -           489 33,814,130 419 25,178,130 122,814 79,617,162 12,998 28,391,880 4,793 29,158,850 4,210
Munin1 graph4 86,901,006 188,387,156 112,604,069 6,004 195,370,429 1,755 87,564,087 957 138,739,862 551 87,725,307 59,638   -      18,039           -           9,716 134,302,362 5,603 83,924,630 601,469    -    179,903 87,549,937 97,940   -   55,980
Munin2 graph5 2,049,942 2,762,938 4,511,124 983 4,034,236  97 3,829,163 31 4,410,230 34 4,301,622 9,994 4,033,995 983 3,482,679 317 4,355,334 336 4,183,361 99,323 4,033,736 9,805 3,286,892 3,097 4,354,486 3,387
Munin3 graph6 3,079,164 3,241,805 3,276,379 1,011 3,292,618  98 3,377,594 29 3,113,340 27    -     10,054 3,292,436 1,001 3,221,289 297 3,112,140 281 3,259,559 100,701 3,292,354 9,872 3,175,751 2,969 3,110,772 2,829
Munin4 graph7 9,837,096 16,416,346 24,393,717 1,004 16,810,858 97 22,764,167 30 13,783,543 27 19,857,396 1,041 16,810,785 969 20,227,882 291 13,552,815 284 16,857,845 100,217 16,611,063 9,907 18,818,629 2,890 13,233,155 2,840
Water graph8 3,028,305 8,035,356 3,465,948 27,870 8,035,356 23,540 3,028,305 5,611 3,465,948 3,947 3,363,726 278, 691     -      235,571      -       56,098      -      40,050 3,362,268 2,786,939  -  2,355,723     -      559,491 3,323,217 401,819

 

 

 

 

 

 

 

 

 

 

'-' - the same as for less time / iterations. 

Acknowledgements:

We thank the Research Unit of Decision Support Systems Aalborg University for providing the above instances of probabilistic networks.