Node absorption is a net transform which removes nodes from a Bayes net or decision net and makes any necessary adjustments to the resulting net, so that any inference done with it yields the same results as before the nodes were removed (except, of course, that you can no longer interact with the removed nodes). As with link reversal, the local representation changes but the global relationships do not. In probability theory this is sometimes loosely called “summing out a variable”. It leaves the full joint probability distribution of the remaining nodes unchanged.
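To make “summing out a variable” concrete, here is a minimal sketch in plain Python. It is not Netica code: the three-node chain A → B → C, the variable names, and all the probabilities are invented for illustration. It absorbs B by building a new CPT for C given A, then checks that the joint distribution of A and C is the same as before.

```python
from itertools import product

# Hypothetical chain A -> B -> C with binary states 0/1 (numbers are made up).
p_a = {0: 0.6, 1: 0.4}                                    # P(A)
p_b_given_a = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # P(B | A), indexed [a][b]
p_c_given_b = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}  # P(C | B), indexed [b][c]

def joint_ac_original():
    """P(A, C) computed from the full net by summing B out of the joint."""
    return {
        (a, c): sum(p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c] for b in (0, 1))
        for a, c in product((0, 1), repeat=2)
    }

def absorb_b():
    """New CPT P(C | A) for the reduced net A -> C, with B summed out."""
    return {
        a: {c: sum(p_b_given_a[a][b] * p_c_given_b[b][c] for b in (0, 1))
            for c in (0, 1)}
        for a in (0, 1)
    }

p_c_given_a = absorb_b()

# Joint of the remaining nodes, computed from the reduced net only.
joint_ac_reduced = {
    (a, c): p_a[a] * p_c_given_a[a][c] for a, c in product((0, 1), repeat=2)
}
```

Comparing `joint_ac_original()` with `joint_ac_reduced` entry by entry shows that the two agree, which is the sense in which absorption leaves the joint distribution of the remaining nodes unchanged.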
As an example, suppose you have a large net that has been constructed over time by a combination of expert assistance and probability learning. It shows the relationships between hundreds of variables, and contains much valuable information that could be used in a number of different applications.
Now you want to use it in an application where only 10 of the variables will be of interest. In every query of the new application, a particular 4 of these 10 will always have the same findings. For example, one of the nodes in the original net might be Gender, and in the restricted application the net will only be used for females, so you would like to enter a permanent finding of ‘female’ for the Gender node. Nodes with such permanent findings are called context nodes. In each of the queries, you will be receiving new findings for 4 other nodes, and then you want the resulting beliefs of the remaining 2 out of 10. The nodes that will always have new findings are called findings nodes, and those whose beliefs you may want are called query nodes. The hundreds of other nodes in the net might be involved in intermediate calculations, but you don’t care about their values explicitly.
You can simplify the large net down to one with just 6 nodes using node absorption. First enter the permanent findings for the context nodes. Then select all the nodes to be absorbed (i.e. all the nodes except the findings and query nodes), and choose Modify → Absorb Nodes or click the toolbar button. The selected nodes will be removed, and some links may be added and/or reversed.
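As a rough illustration of what absorbing a context node does to a child's CPT, consider the simplest case: a context node with no parents. The sketch below is plain Python, not Netica code, and it rests on assumptions: the node names (Gender, Age, Condition), the numbers, and the helper `absorb_root_context` are all hypothetical. With the finding ‘female’ entered, the child simply keeps the rows of its CPT that match the finding, and the Gender parent disappears.

```python
# Hypothetical CPT P(Condition | Gender, Age); all numbers are made up.
cpt = {
    ("female", "young"): {"present": 0.05, "absent": 0.95},
    ("female", "old"):   {"present": 0.20, "absent": 0.80},
    ("male", "young"):   {"present": 0.10, "absent": 0.90},
    ("male", "old"):     {"present": 0.30, "absent": 0.70},
}

def absorb_root_context(cpt, finding):
    """Keep only the rows consistent with the Gender finding, dropping that parent.

    Assumes Gender has no parents of its own; a context node with ancestors
    would need more work and may add links.
    """
    return {age: dist for (gender, age), dist in cpt.items() if gender == finding}

# Reduced CPT P(Condition | Age) for the female-only application.
reduced = absorb_root_context(cpt, "female")
```

The reduced table has one parent fewer and no new links, which matches the easy case of a context node with no ancestors.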
Order: If you want, you can absorb the nodes a few at a time, by selecting each group and clicking the toolbar button. The final result of absorbing a set of nodes is not dependent on the order in which they were absorbed, but the time and memory required may be greatly affected. If you have a set of nodes to absorb and you don’t know a good order to use, then it is best to absorb them all at once, so that Netica can pick a good order.
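The effect of order can be seen in a small sketch. This is not Netica's algorithm, only the standard graph view of variable elimination, written in plain Python with a made-up net: a hub node H that is the parent of ten leaf nodes L1 to L10. Removing a node joins all of its neighbours, and the intermediate table must cover the node and those neighbours. The function name `eliminate` and the example graph are assumptions for illustration.

```python
def eliminate(edges, order):
    """Eliminate nodes from an undirected graph in the given order.

    Returns (largest_scope, remaining_edges): the largest number of variables
    in any intermediate table, and the edges left among the surviving nodes.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    largest = 0
    for node in order:
        nbrs = adj.pop(node)
        largest = max(largest, len(nbrs) + 1)  # table covers node + neighbours
        for n in nbrs:
            adj[n].discard(node)
            adj[n] |= nbrs - {n}               # join the neighbours together
    remaining = {frozenset((u, v)) for u in adj for v in adj[u]}
    return largest, remaining

leaves = [f"L{i}" for i in range(1, 11)]
edges = [("H", leaf) for leaf in leaves]
absorbed_leaves = leaves[2:]                   # keep only L1 and L2

hub_first = eliminate(edges, ["H"] + absorbed_leaves)
hub_last = eliminate(edges, absorbed_leaves + ["H"])
```

Both orders leave the same reduced graph (a single link between L1 and L2), but absorbing the hub first creates an intermediate table over 11 variables, while absorbing the leaves first never needs more than 3. With binary nodes that is 2048 entries against 8.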
Returning to the example, the resulting 6-node net will give the same inference results as the original large one, for the restricted queries you will be making. If you are guaranteed that there will always be findings for every findings node, then you can further simplify things by removing any link that goes from a findings node P to a findings node C, provided C does not have a query node as a parent. This means that if you can reverse links to make all the findings nodes ancestors of all the query nodes, then you can remove all the links between the findings nodes. Any findings node that is left completely disconnected by this operation is irrelevant to the query, and can be deleted. Now you can examine the CPTs of the query nodes to see directly how they depend on the findings. You may just be able to look up the desired probabilities without doing belief updating at all!
Complexity Danger: Even though a reduced net has fewer nodes than the original, internally it may actually be more complex, sometimes much more complex, if many links were added during node absorbing or link reversing (remember that the size of a node’s CPT can be exponential in its number of parents). Generally speaking, absorbing out context nodes (i.e. nodes with findings entered) which have many ancestor nodes results in the worst increase in complexity. The next worst is absorbing out non-context nodes (i.e. nodes with no findings) which have many descendant nodes. Absorbing out context nodes with no ancestors, or non-context nodes with no descendants, will not add any links. Of course, if the number of query and findings nodes is very small and they have few states, the resulting net must be very simple, although the transformations to generate it might temporarily require a lot of memory.
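To get a feel for how quickly a CPT grows as links are added, the following sketch counts table entries for a discrete node. It is plain Python arithmetic, not a Netica call, and the function name `cpt_entries` is a hypothetical helper: the count is the node's number of states multiplied by the number of states of each parent.

```python
from math import prod

def cpt_entries(node_states, parent_states):
    """Number of probabilities in a discrete node's CPT.

    node_states:   number of states of the node itself
    parent_states: list with the number of states of each parent
    """
    return node_states * prod(parent_states)

few_parents = cpt_entries(2, [2, 2, 2])   # binary node, 3 binary parents
many_parents = cpt_entries(2, [2] * 20)   # binary node, 20 binary parents
```

Going from 3 to 20 binary parents takes the table from 16 entries to 2,097,152, which is why a reduced net with fewer nodes can still be far more expensive than the original.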