# Preprocessing Reduction of Weighted Graphs

##### under the guidance of  Dan Geiger

by Nickolay Dovgolevsky

Purpose

Reduction of a weighted graph without increasing its weighted treewidth by set of safe reduction rules defined in [1].

Description of the problem

We can preprocess a graph by applying reduction rules that reduce it. This reduction has a property that a tree decomposition for the reduced graph with minimum treewidth can easily be extended to a tree decomposition for the original graph that also has minimum treewidth. A rule that does not change the treewidth of the graph is called safe. To allow more reduction rules we define a variable low, that is a lower bound of the treewidth of the graph. A reduction rule is safe if and only if the maximum of low is not changed. Preprocessing applies a series of safe reduction rules, taken from a set, on a graph, until no more reduction rules can be applied. When the preprocessing results in an empty graph, we can trivially find a tree decomposition with minimum treewidth for the original graph. When the graph resulting after preprocessing is not empty, either the treewidth can be computed for the remaining graph by an exact, but possibly computational expensive algorithm (when the resulting graph is relatively small), or the treewidth can be approximated using heuristic algorithms.

Preprocessing a graph by applying safe reduction rules takes on additional significance when using stochastic version of an algorithm. Because stochastic algorithms are usually applied a large number of times (at least 100), applying them on the reduced graph for the same number of iterations saves time. Using the saved time reduces the elimination cost found.

Preprocessing a graph may be used not only for finding treewidth or its approximation. For example, in many applications the problem is  to find the sum of neighborhood weights of eliminated vertices, also called the total state size. Despite the fact that safe rules do not change the treewidth of a graph, they nevertheless may increase the total state size. However, in practice, also for this problem, reduction rules reduce the running time and improve the cost.

I implemented the following reduction rules, that were defined in [1]:

1. Simplicial rule.
2. Almost simplicial rule.
3. Buddies rule (generalized).
4. Cube rule.

The variable low is an updated lower bound on treewidth computed while applying the simplicial reduction rule. The other rules remove vertices, the elimination cost of which is less or equal to low.

Experimental Results

First Experiment

This experiment is conducted on 20 probabilistic networks. The Diabetes, Link, Munin networks are taken from medical applications, Barley is used for agricultural purposes, the Water network models a water purification, and Superlink networks were generated by Superlink - a genetic linkage analysis program. Table 1 shows some results of the preprocessing technique. The preprocessing consists of 3 phases; for each phase a set of reduction rules is employed. In the first phase the set Prepro-1 = {simplicial rule} is used, for the second we applied the set Prepro-2 = {simplicial rule}, and in the third - the set Prepro-3 = {simplicial rule, Almost simplicial rule, Buddies rule and Cube rule}. The Buddies rule is applied separately for "buddies set" size = 2, 3, and 4. For each instance the size of the graph is shown after moralization, and after first, second and third phases. After each phase we also provide the percentage of vertices deleted by each reduction rule in the set, the value of variable low, representing the lower bound currently known for the weighted treewidth, and the running time of the preprocessing. The Buddies rule  with "buddies set" size = 4, and the Cube rule did not result in elimination of vertices, so there is no need in generalization of the Cube rule, as well as is running Buddies rule with set size more then 4. A summary explaining all table entries is available here.

 Instance Undirected graph Prepro-1 Prepro-2 Prepro-3 |V| |E| %V #smp low t %t %V #smp #asm low t %t %V #smp #asm #bud #bud3 #bud4 #cub low t %t Barley 48 126 27.1 13 40320 0.00 0.00 39.6 14 5 40320 0.00 0.00 39.6 14 5 0 0 0 0 40320 0.00 0.00 Diabetes 413 819 18.8 78 7056 0.00 0.00 19.6 78 3 7056 0.00 0.00 19.6 78 3 0 0 0 0 7056 0.01 4.76 Link 724 1738 31.7 230 128 0.00 0.00 53.2 230 155 128 0.00 0.00 53.2 230 155 0 0 0 0 128 0.02 7.13 Munin1 189 366 42.9 81 600 0.00 0.00 52.2 82 17 600 0.00 0.00 52.2 82 17 0 0 0 0 600 0.00 0.00 Munin2 1003 1662 55.2 554 600 0.00 0.00 68.4 564 122 600 0.00 0.00 68.4 564 122 0 0 0 0 600 0.03 9.56 Munin3 1044 1745 59.9 625 600 0.00 0.00 82.4 652 208 2000 0.00 0.00 83.5 652 208 12 0 0 0 2000 0.03 9.18 Munin4 1041 1843 58.1 605 600 0.00 0.00 74.0 614 156 2000 0.00 0.00 74.0 614 156 0 0 0 0 2000 0.02 6.83 Water 32 123 25.0 8 3072 0.00 0.00 31.3 8 2 3072 0.00 0.00 31.3 8 2 0 0 0 0 3072 0.00 0.00 Superlink0_0 1676 2987 19.9 333 2000 0.02 1.40 58.0 407 578 2000 0.03 5.54 58.0 407 578 0 0 0 0 2000 0.10 18.7 Superlink1_1 2449 4320 19.3 473 2000 0.00 0.00 57.7 598 817 2000 0.06 6.74 57.7 596 817 2 0 0 0 2000 0.34 33.2 Superlink2_2 3224 5641 19.9 643 2000 0.02 1.19 54.5 779 979 2000 0.08 7.14 54.7 779 981 2 0 0 0 2000 0.30 16.7 Superlink0_19 3254 5754 23.9 779 2000 0.02 1.59 59.0 906 1013 2000 0.06 4.41 59.0 906 1013 0 0 0 0 2000 0.30 21.9 Superlink1_20 4720 8211 24.2 1142 2000 0.01 0.75 57.8 1339 1391 2000 0.15 7.58 57.7 1332 1391 2 0 0 0 2000 0.75 29.8 Superlink2_21 6242 10814 24.1 1503 2000 0.03 0.77 54.7 1722 1694 2000 0.16 4.98 55.2 1724 1717 2 3 0 0 2000 0.63 18.3 Superlink0_38 4859 8517 25.2 1226 2000 0.03 1.32 59.6 1428 1469 2000 0.12 6.85 59.6 1426 1469 2 0 0 0 2000 0.56 23.6 Superlink1_39 7040 12104 25.0 1762 2000 0.05 1.17 60.0 2106 2123 2000 0.28 9.14 60.0 2090 2123 16 0 0 0 2000 1.32 31.7 Superlink2_40 9276 15926 24.7 2293 2000 0.05 0.77 55.3 2654 2479 2000 0.30 7.47 55.6 2650 2502 8 3 0 0 2000 1.56 24.3 Superlink0_57 6372 11198 26.0 1659 2000 0.05 1.76 57.8 1883 1798 2000 0.17 5.87 57.8 1881 1798 2 3 0 0 2000 0.78 26.9 Superlink1_58 9292 16096 25.8 2401 2000 0.06 1.05 57.9 2784 2594 2000 0.37 7.73 57.9 2768 2594 16 3 0 0 2000 2.23 38.9 Superlink2_59 12278 21215 25.6 3140 2000 0.06 0.67 54.3 3560 3104 2000 0.54 7.78 54.5 3554 3127 10 6 0 0 2000 6.02 73.2 Average: 3759 6560 30.1 977 3849 0.02 0.54 55.4 1120 1035 3.989 0.12 4.06 55.5 1118 1039 3.7 0.9 0 0 3989 0.75 19.8

Second Experiment

The next experiment has been conducted on 100 graphs to check whether the time saved due to the reductions can be used by running more iterations to improve the cost. Each graph was processed in four different ways: using the minimum fill-in algorithm with and without reductions and using the weighted minimum fill-in algorithm with and without reductions where the reductions used are the simplicial vertex elimination, the almost simplicial vertex elimination, and the buddy rule (with buddy sets of 3 or 4). The first experiment clearly shows that this is an appropriate choice of reduction rules. Each of the four runs was given 1000 seconds.

Summary of Results

The results show that these reductions use a very small amount of time and increase dramatically the number of stochastic iterations that the two min-fill algorithms can run in this time framework of 1000 seconds. Furthermore, the resulting cost was also improved in most cases. In particular, Min-Fill with reductions (as preprocessing) is superior over Min-Fill without reductions in 68% of the graphs. In this case the average improvement in cost is ~4.2% which means an average reduction of ~60% in memory requirements. If reductions are applied on all graphs the average improvement is ~2.9% which means an average reduction of ~43% in memory requirements. In 11% of the graphs Min-Fill found the same cost with and without reduction.
The result for the weighted version are similar. The weighted Min-Fill algorithm with reductions is superior over the weighted Min-Fill algorithm without reductions in 67% of the graphs. In this case the average improvement in cost is ~4.4%. Assuming the reductions are applied on all graphs (namely, also when they do not help) the average improvement reduces to ~2.8%. In 12% of the graphs the algorithm weighted Min-Fill found the same cost with and without reduction.

Detailed Results

Each cell in the table below contains two numbers. The cost found, namely, log10 of the sum of table sizes produced during elimination, and the number of iterations that fit into 1000 seconds. In addition, for each algorithm the lowest cost, with versus without reductions, is underlined. If equal, both are underlined.

 Weighted graphs pedfile datafile Number of loci Number of people % of typed persons MIN_FILL WMIN_FILL No prepr With prepr No prepr prepr graph00 pedfile00 datafile00 12 25 80 3.95 6676 3.95 38385 3.94 3634 3.95 20616 graph01 pedfile01 datafile01 14 27 63 14.10 495 14.46 519 13.63 264 13.64 282 graph02 pedfile02 datafile02 16 22 89 5.28 4415 5.18 13066 5.24 2259 5.08 6494 graph03 pedfile03 datafile03 20 31 71 10.22 1167 10.12 1597 10.09 593 9.92 810 graph04 pedfile04 datafile04 17 29 56 11.35 1077 11.16 1263 10.11 597 10.15 712 graph05 pedfile05 datafile05 19 33 95 5.51 2561 5.46 8495 5.49 1410 5.37 4674 graph06 pedfile06 datafile06 18 31 47 8.84 1628 7.81 3202 8.43 905 7.53 1879 graph07 pedfile07 datafile07 13 30 30 2.80 22253 2.81 1301257 2.80 12772 2.81 733295 graph08 pedfile08 datafile08 22 38 86 6.56 1473 6.42 4127 6.32 817 6.35 2215 graph09 pedfile09 datafile09 20 48 91 8.12 820 7.80 1590 8.00 463 7.92 900 graph10 pedfile10 datafile10 25 38 84 13.45 395 12.39 608 11.89 232 11.67 370 graph11 pedfile11 datafile11 23 45 78 12.66 393 12.87 529 13.42 215 12.90 295 graph12 pedfile12 datafile12 19 47 74 6.25 1325 6.09 3114 6.23 718 6.07 1677 graph13 pedfile13 datafile13 21 44 80 10.95 583 11.01 873 11.00 314 10.94 469 graph14 pedfile14 datafile14 27 42 90 17.17 315 16.80 436 16.52 166 16.53 234 graph15 pedfile15 datafile15 25 42 70 13.49 381 13.58 489 13.35 203 13.57 268 graph16 pedfile16 datafile16 21 37 87 7.24 1521 7.05 5795 7.11 829 6.77 3189 graph17 pedfile17 datafile17 24 43 93 6.03 1240 5.81 4764 5.95 691 5.84 2605 graph18 pedfile18 datafile18 30 40 80 10.14 480 10.27 847 10.08 264 9.92 481 graph19 pedfile19 datafile19 34 38 100 6.58 791 6.55 2434 6.43 444 6.38 1355 graph20 pedfile20 datafile20 32 40 85 15.56 193 13.13 293 15.69 103 14.11 162 graph21 pedfile21 datafile21 31 56 77 13.58 255 13.51 372 13.56 139 13.45 199 graph22 pedfile22 datafile22 24 38 69 12.04 540 11.53 722 12.77 284 11.56 384 graph23 pedfile23 datafile23 29 41 74 6.94 784 6.84 3707 7.01 429 6.73 2014 graph24 pedfile24 datafile24 26 34 88 9.60 726 8.99 1745 9.64 378 8.77 911 graph25 pedfile25 datafile25 28 37 64 8.81 669 8.26 1764 8.93 344 8.18 879 graph26 pedfile26 datafile26 20 50 80 13.97 367 13.60 628 14.40 193 12.52 338 graph27 pedfile27 datafile27 18 51 97 8.72 1200 8.43 2980 8.22 661 7.66 1729 graph28 pedfile28 datafile28 37 32 83 9.68 473 9.56 766 9.63 241 9.57 396 graph29 pedfile29 datafile29 20 36 57 12.32 660 11.77 877 12.60 304 11.88 418 graph30 pedfile30 datafile30 14 49 67 15.42 603 15.16 784 14.46 269 13.67 334 graph31 pedfile31 datafile31 17 26 73 4.39 4016 4.29 17593 4.43 2082 4.26 9062 graph32 pedfile32 datafile32 15 24 36 3.09 41589 3.10 1339162 3.09 22362 3.10 680463 graph33 pedfile33 datafile33 18 25 19, 36, 62, 84 11.17 1165 10.26 1758 10.33 590 10.26 877 graph34 pedfile34 datafile34 19 33 14, 31, 48, 66, 85 18.89 165 19.31 172 16.54 94 16.86 100 graph35 pedfile35 datafile35 15 38 22, 44, 66, 88 17.04 344 16.90 366 16.48 153 15.91 162 graph36 pedfile36 datafile36 16 23 13, 29, 40, 55, 74, 87 10.64 1728 10.34 2463 9.17 940 8.95 1357 graph37 pedfile37 datafile37 24 30 15, 29, 49, 66, 79 22.52 93 21.84 100 17.84 68 17.03 75 graph38 pedfile38 datafile38 21 35 22, 37, 62, 83 23.20 100 22.46 103 17.37 62 17.27 68 graph39 pedfile39 datafile39 27 28 15, 30, 49, 61, 80 13.59 380 13.35 421 13.52 194 12.97 220 graph40 pedfile40 datafile40 17 22 12, 26, 44, 56, 69, 82 17.53 491 16.96 547 13.49 283 13.32 315 graph41 pedfile41 datafile41 25 32 18, 34, 45, 67, 83 18.30 191 17.20 215 17.97 106 16.97 118 graph42 pedfile42 datafile42 20 27 18, 39, 62, 81 19.07 147 18.56 156 16.33 92 16.37 99 graph43 pedfile43 datafile43 25 22 17, 32, 49, 60, 85 11.17 600 11.37 695 11.50 315 11.22 370 graph44 pedfile44 datafile44 23 19 21, 37, 61, 78 13.63 516 13.98 554 10.63 325 10.73 341 graph45 pedfile45 datafile45 30 18 15, 32, 48, 62, 79 12.89 380 12.52 439 12.10 220 11.92 264 graph46 pedfile46 datafile46 26 22 23, 54, 78 12.09 621 10.81 839 11.28 398 10.87 539 graph47 pedfile47 datafile47 21 33 10, 23, 36, 47, 63, 72, 88 5.15 2572 4.99 9475 5.06 1513 4.85 5736 graph48 pedfile48 datafile48 34 20 10, 29, 54, 69, 90 8.48 1130 7.72 2264 7.99 678 6.95 1421 graph49 pedfile49 datafile49 19 26 18, 35, 65, 87 20.42 170 19.69 183 16.18 111 16.12 119 graph50 pedfile50 datafile50 24 24 19, 25, 54, 70, 91 15.38 239 15.59 251 15.26 141 14.76 148 graph51 pedfile51 datafile51 25 28 15, 50, 86 18.57 146 17.64 167 16.67 95 15.59 107 graph52 pedfile52 datafile52 15 18 9, 20, 33, 44, 53, 78, 98 8.27 3829 8.29 4877 6.59 2540 6.52 3293 graph53 pedfile53 datafile53 23 33 10, 20, 40, 80 17.05 235 16.22 245 14.22 154 14.55 164 graph54 pedfile54 datafile54 19 40 89 6.60 2820 6.23 14032 6.79 1661 6.34 7260 graph55 pedfile55 datafile55 24 21 74 9.53 960 9.25 1425 9.38 560 9.32 833 graph56 pedfile56 datafile56 24 40 50 11.87 332 11.76 400 11.83 199 11.65 232 graph57 pedfile57 datafile57 25 20 27 4.83 3233 4.81 12811 4.78 1862 4.78 7254 graph58 pedfile58 datafile58 24 33 64 14.20 341 13.78 394 13.41 182 14.16 211 graph59 pedfile59 datafile59 22 22 14 3.36 6089 3.36 1311527 3.36 3554 3.36 744369 graph60 pedfile60 datafile60 14 20 6 3.19 9009 3.19 1294078 3.19 5297 3.19 743450 graph61 pedfile61 datafile61 27 16 3 3.69 18845 3.82 145720 3.69 10809 3.81 82540 graph62 pedfile62 datafile62 27 22 84 12.13 891 12.33 1316 11.39 504 11.39 748 graph63 pedfile63 datafile63 24 31 7 3.63 2418 3.63 1306060 3.63 1435 3.63 749090 graph64 pedfile64 datafile64 28 25 51 10.49 794 10.15 1045 9.89 466 9.05 597 graph65 pedfile65 datafile65 25 31 41 12.05 332 11.49 405 11.39 184 11.79 222 graph66 pedfile66 datafile66 26 45 75 8.67 672 8.45 1967 8.54 383 8.47 1101 graph67 pedfile67 datafile67 23 36 73 12.06 468 12.46 639 11.75 271 11.50 369 graph68 pedfile68 datafile68 15 26 77 4.11 5428 4.12 38726 4.09 3072 4.11 21133 graph69 pedfile69 datafile69 22 42 10 3.31 16105 3.61 348323 3.31 9594 3.61 207446 graph70 pedfile70 datafile70 20 21 96 3.96 6290 3.97 75238 3.94 3868 3.95 45068 graph71 pedfile71 datafile71 27 15 79 6.29 2567 6.33 3748 6.35 1510 6.26 2201 graph72 pedfile72 datafile72 29 41 44 10.37 732 9.26 1645 10.13 426 9.14 962 graph73 pedfile73 datafile73 21 24 10 3.24 7858 3.24 1288974 3.24 4850 3.24 797134 graph74 pedfile74 datafile74 21 19 52 3.35 5958 3.35 1306149 3.35 3665 3.35 788481 graph75 pedfile75 datafile75 26 38 58 7.10 912 6.96 2588 7.03 545 6.96 1526 graph76 pedfile76 datafile76 17 41 98 13.18 452 12.48 752 13.03 255 11.91 440 graph77 pedfile77 datafile77 24 35 50 16.07 348 15.07 545 14.67 202 13.86 333 graph78 pedfile78 datafile78 28 43 41 11.58 888 11.55 1977 10.85 534 10.61 1005 graph79 pedfile79 datafile79 28 26 37 8.24 1182 7.59 3425 8.11 727 7.39 2057 graph80 pedfile80 datafile80 24 27 52 3.66 2816 3.66 223139 3.66 1692 3.66 133853 graph81 pedfile81 datafile81 11 22 68 5.99 5348 5.75 23217 5.87 2926 5.63 12470 graph82 pedfile82 datafile82 21 27 95 5.90 1925 5.78 3656 5.73 1071 5.74 2500 graph83 pedfile83 datafile83 14 47 79 4.98 2095 4.95 4987 4.96 1129 4.94 2685 graph84 pedfile84 datafile84 15 22 93 6.48 3610 6.51 5640 6.39 2008 6.40 3097 graph85 pedfile85 datafile85 19 15 82 6.46 3977 6.24 7187 6.26 2317 6.22 4159 graph86 pedfile86 datafile86 28 38 90 4.27 1365 4.15 39776 4.30 785 4.15 22725 graph87 pedfile87 datafile87 10 20 45 5.54 5998 5.48 14932 5.69 3462 5.36 8506 graph88 pedfile88 datafile88 15 45 22 2.09 162067 2.09 1266975 2.09 98110 2.09 757484 graph89 pedfile89 datafile89 25 32 31 15.29 314 15.08 344 15.11 139 14.72 156 graph90 pedfile90 datafile90 12 34 76 4.68 2762 4.55 10122 4.66 1872 4.53 6737 graph91 pedfile91 datafile91 12 15 80 2.85 14328 2.85 679530 2.85 15322 2.85 755786 graph92 pedfile92 datafile92 25 49 3 3.89 524 3.89 670371 3.89 580 3.89 751241 graph93 pedfile93 datafile93 21 24 11 3.50 2310 3.50 703264 3.50 2444 3.50 752771 graph94 pedfile94 datafile94 21 42 12 7.47 982 6.60 2495 7.52 970 6.91 2424 graph95 pedfile95 datafile95 13 28 16 4.40 3070 4.25 16803 4.40 2967 4.25 16196 graph96 pedfile96 datafile96 13 42 33 8.13 1349 7.32 3705 8.48 1205 7.45 3116 graph97 pedfile97 datafile97 23 33 43 15.50 125 14.46 144 14.27 122 14.69 142 graph98 pedfile98 datafile98 27 22 44 6.29 1873 6.13 5823 6.34 1796 6.15 5412 graph99 pedfile99 datafile99 27 31 8 3.72 7480 3.83 88146 3.72 7259 3.83 85595

Conclusions

Experiments were conducted on graphs of a set of probabilistic networks taken from real-life applications. These experiments revealed that a subset of the identified reduction rules were able to reduce the graphs significantly. Therefore, preprocessing by applying reduction rules is a useful technique for the problem of constructing tree decompositions with minimum weighted treewidth.
In addition, one of the goals of this project was to find a good upper bound for the "buddies set" size and to check how often the Cube rule is applied. The Buddies rule with "buddies set" size = 4, and the Cube rule are not very useful, since they were not applied even once in all the experiments and hence the time to check their prerequisites is likely to be higher than the time saved on performing the iterations.

Acknowledgements
:

I thank Dan Geiger for his help and guidance. I also thank Ma'ayan Fishelson and the Research Unit of Decision Support Systems Aalborg University for providing instances of probabilistic networks.

Definitions, algorithms and some of the wordings were taken from the following references:

References

1. Hans L. Bodlaender, Arie M. C. A. Koster, Frank van den Eijkof, Linda C. van der Gaag: Pre-processing for Triangulation of Probabilistic Networks (ZIB Report 01-39, appeared in Proceedings of the 17th Conference on Uncertainty in Artificial Intelligence, J. Breese and D. Koller Eds., (2001), pp. 32-39, Published by Morgan Kaufmann Publishers, San Francisco)

2. Frank van den Eijkhof, Hans L. Bodlaender, Arie M.C.A. Koster: Safe reduction rules for weighted treewidth (Konrad-Zuse-Zentrum für Informationstechnik Berlin, ZIB PaperWeb Reports / Preprints 2002 )