Class Learner

NETICA API

JAVA VERSION 5.04


norsys.netica
Class Learner

java.lang.Object
  |
  +--norsys.netica.Learner

public class Learner
extends java.lang.Object

An object for managing batch-mode learning, such as EM or Gradient Descent learning, of CPTs from case data.

Currently only batch-mode learning is supported, but it is intended that in future, all modes of learning will be managed by this class.

Since: 2.27
Version: 5.04 - January 21, 2012

Field Summary

static int COUNTING_LEARNING

Indicates the case counting learning algorithm.

static int EM_LEARNING

Indicates the EM (Expectation Maximization) learning algorithm.

static int GRADIENT_DESCENT_LEARNING

Indicates the Gradient Descent learning algorithm.

Constructor Summary

Learner(int method)

Creates and returns a new Learner object for use in learning of CPTs from case data, and associates it with the default Netica environment.

Learner(int method, java.lang.String info, Environ env)

Creates and returns a new Learner object for use in learning of CPTs from case data, and associates it with a given Netica environment.

Method Summary

void finalize()

Removes the Learner object and frees all its resources (e.g., memory).

Environ getEnviron()

Returns the Environ that this object belongs to.

int getMaxIterations()

Returns the maximum number of learning-step iterations for learnCPTs.

double getMaxTolerance()

Returns the tolerance for the minimum change in data log likelihood between consecutive passes through the data, as a termination condition for any learning to be done by this learner.

int getMethod()

Returns the algorithmic method used by this learner, one of COUNTING_LEARNING, EM_LEARNING, or GRADIENT_DESCENT_LEARNING.

void learnCPTs(NodeList nodeList, Caseset caseset, double degree)

Performs learning of CPT tables from data.

void setMaxIterations(int maxIterations)

Sets the maximum number of learning-step iterations (i.e., complete passes through the data) which will be done when this learner is used, after which learning will be automatically terminated.

void setMaxTolerance(double logLikelihoodTolerance)

Sets the tolerance for the minimum change in data log likelihood between consecutive passes through the data, as a termination condition for any learning to be done by this learner.

Methods inherited from class java.lang.Object

clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

public static int COUNTING_LEARNING

Indicates the case counting learning algorithm. Pass this to a Learner constructor.

public static int EM_LEARNING

Indicates the EM (Expectation Maximization) learning algorithm. Pass this to a Learner constructor.

public static int GRADIENT_DESCENT_LEARNING

Indicates the Gradient Descent learning algorithm. Pass this to a Learner constructor.

Constructor Detail

public Learner (

int	method

) throws NeticaException

Creates and returns a new Learner object for use in learning of CPTs from case data, and associates it with the default Netica environment.

After creating this object, you use it to set the learning parameters you want, and then you call one of its learning methods, such as learnCPTs, to actually perform the learning on some net using some case data.

method should be one of COUNTING_LEARNING, EM_LEARNING, or GRADIENT_DESCENT_LEARNING. See learnCPTs for a description of how each learning algorithm operates.

This constructor is equivalent to calling new Learner(method, null, Environ.getDefaultEnviron()).

Parameters:
int method The type of learning algorithm that this Learner should use: one of COUNTING_LEARNING, EM_LEARNING, or GRADIENT_DESCENT_LEARNING.

Version:

Versions 2.26 and later have this method.

See Also:

Learner(int,String,Environ)		Same, but for any environment
setMaxIterations		Set the maximum number of iterations (if applicable) it will do when learning
setMaxTolerance		Set the maximum tolerance (if applicable) it will allow before termination
learnCPTs		Performs the learning
finalize		Discard the Learner

public Learner (

int	method,
String	info,
Environ	env

) throws NeticaException

Creates and returns a new Learner object for use in learning of CPTs from case data, and associates it with a given Netica environment.

Pass null for info; it is reserved for future expansion.

method must be one of COUNTING_LEARNING, EM_LEARNING, or GRADIENT_DESCENT_LEARNING. See learnCPTs for a description of how each learning algorithm operates.

Parameters:
int      method      The type of learning method that this Learner should use: one of COUNTING_LEARNING, EM_LEARNING, or GRADIENT_DESCENT_LEARNING.
String      info      Reserved for future expansion. Pass null for now.
Environ      env      The Environ in which this new Learner will be placed.

Version:

Versions 2.26 and later have this method.
In the C Version of the API, this function is named NewLearner_bn.

See Also:

setMaxIterations		Set the maximum number of iterations (if applicable) it will do when learning
setMaxTolerance		Set the maximum tolerance (if applicable) it will allow before termination
learnCPTs		Performs the learning
finalize		Discard the Learner
RandomGenerator		May also want this to control randomization

Example:

See addCases

Method Detail

public void finalize ( ) throws NeticaException

Removes the Learner object and frees all its resources (e.g., memory).

Version:

Versions 2.26 and later have this method.
In the C Version of the API, this function is named DeleteLearner_bn.

See Also:

Learner		Create a new Learner

Overrides: finalize in class java.lang.Object

public Environ getEnviron ( )

Returns the Environ that this object belongs to.

Version:

Versions 2.26 and later have this method.

public int getMaxIterations ( ) throws NeticaException

Returns the maximum number of learning-step iterations for learnCPTs.

See setMaxIterations for additional documentation.

Version:

Versions 2.26 and later have this method.
In the C Version of the API, this function is named GetLearnerMaxIters_bn.

See Also:

Learner		Create a new Learner
setMaxIterations		Sets it
getMaxTolerance		Retrieves another termination parameter
learnCPTs		Performs the learning iterations

public double getMaxTolerance ( ) throws NeticaException

Returns the tolerance for the minimum change in data log likelihood between consecutive passes through the data, as a termination condition for any learning to be done by this learner. This applies to EM_LEARNING and GRADIENT_DESCENT_LEARNING only, since they are iterative by nature.

See setMaxTolerance for additional documentation.

Version:

Versions 2.26 and later have this method.
In the C Version of the API, this function is named GetLearnerMaxTol_bn.

See Also:

setMaxTolerance		Sets it
getMaxIterations		Retrieves another termination parameter
learnCPTs		Performs the learning

public int getMethod ( )

Returns the algorithmic method used by this learner, one of COUNTING_LEARNING, EM_LEARNING, or GRADIENT_DESCENT_LEARNING. This method is originally set in the Learner's constructor (see Learner).

Version:

Versions 2.26 and later have this method.

public void learnCPTs (

NodeList	nodeList,
Caseset	caseset,
double	degree

) throws NeticaException

Performs learning of CPT tables from data. For EM or gradient descent algorithms this is done until a termination condition is met.

nodeList is the list of nodes whose experience and conditional probability tables are to be updated by learning. They must all be from the same net. Other nodes in that net will not be modified.

caseset is the set of cases to be used for learning.

degree is the frequency factor to apply to each case in the case set. It must be greater than zero. It gets multiplied by the "NumCases" (multiplicity number) which appears for each case in the file (if the number doesn't appear in the file, it is taken as 1).

When you create the Learner (see Learner), you choose the algorithm you wish, which may be one of:

1. Counting Learning: This is traditional one-pass learning (see Net.reviseCPTsByFindings) ... . It is the preferred learning method if there are no hidden (also known as 'latent') nodes in the net and no missing values in the case data. If there are hidden variables (variables for which you have no observations, but which you suspect exist and can be useful for modeling your world), or if there are a substantial number of missing values in the case data, then the iterative learning algorithms may yield better results.
Because this learning method is not iterative, setMaxIterations and setMaxTolerance have no effect on it.

2. EM Learning: EM learning optimizes the net's CPTs using the well-known expectation maximization algorithm, in an attempt to maximize the probability of the data set given the net (i.e., minimize the negative log likelihood of the data). If the nodes have CPT and experience tables before the learning starts, they will be considered as part of the data (properly weighted using the experience table), so the knowledge from the data set is combined with the knowledge already in the net. If you do not want this effect, be sure to delete the tables first (see deleteTables). During EM learning, for each case in the case file, only the CPTs of nodes with findings and their ancestor nodes are modified, so only those nodes will have their experience tables incremented.

3. Gradient Descent Learning: Gradient descent learning works similarly to EM learning, but it uses a very different algorithm internally. It uses conjugate gradient descent to maximize the probability of the data, given the net, by adjusting the CPT table entries. Generally speaking, this algorithm converges faster than EM learning, but may be more susceptible to local maxima. It has similarities to the neural net back propagation algorithm.
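As a rough illustration of the one-pass counting idea, and of how degree scales each case's "NumCases" multiplicity, the following self-contained sketch updates the probabilities of a single node that has no parents. It does not call Netica and is not Netica's implementation: the class CountingSketch, its method learn, and the rule of treating an existing table as pseudo-counts (probability times experience) are assumptions made for this example only.

```java
import java.util.Arrays;

/** Illustrative only: one-pass counting for a single parentless node. Not Netica code. */
final class CountingSketch {

    /**
     * @param priorProbs existing probabilities of the node's states
     * @param priorExper existing experience (assumed: equivalent case count behind priorProbs)
     * @param observed   observed state index for each case
     * @param numCases   multiplicity of each case ("NumCases"); use 1 when it is absent
     * @param degree     frequency factor applied to every case; must be greater than zero
     * @return the updated probabilities
     */
    static double[] learn(double[] priorProbs, double priorExper,
                          int[] observed, double[] numCases, double degree) {
        if (degree <= 0.0) {
            throw new IllegalArgumentException("degree must be greater than zero");
        }
        // Assumption: the existing table contributes pseudo-counts of probability * experience.
        double[] counts = new double[priorProbs.length];
        for (int s = 0; s < counts.length; s++) {
            counts[s] = priorProbs[s] * priorExper;
        }
        double total = priorExper;

        // Single pass through the data: each case adds degree * NumCases to its observed state.
        for (int i = 0; i < observed.length; i++) {
            double weight = degree * numCases[i];
            counts[observed[i]] += weight;
            total += weight;
        }
        if (total <= 0.0) {
            throw new IllegalArgumentException("no prior experience and no cases");
        }

        double[] probs = new double[counts.length];
        for (int s = 0; s < probs.length; s++) {
            probs[s] = counts[s] / total;
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] prior = {0.5, 0.5};       // uniform starting table
        int[] observed = {0, 0, 1};        // three cases
        double[] numCases = {1, 3, 1};     // the second case stands for three identical cases
        System.out.println(Arrays.toString(learn(prior, 2.0, observed, numCases, 2.0)));
        // prints [0.75, 0.25]
    }
}
```

With a degree of 2.0, every case counts twice as much relative to the existing table as it would with a degree of 1.0, which is the effect the degree parameter is described as having.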

After the Learner is created, you can set the termination conditions for it. For both EM learning and gradient descent learning, the two possible termination conditions are the maximum number of iterations of the whole batch of cases (see setMaxIterations), and the minimum change in log likelihood from one pass through the batch to the next (see setMaxTolerance). Termination will occur when either of the two conditions is met. For Counting learning, there currently are no termination conditions to set.
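The interplay of the two termination conditions can be sketched with a toy EM problem that is independent of Netica. In this example each case is a single coin flip produced by one of two coins whose heads probabilities are known, the identity of the coin is hidden, and EM estimates the probability that coin A was used. The class EmTerminationSketch, its Result type, and the coin model are all hypothetical; only the stopping rule (a maximum number of passes, or a small change in the per-case average negative log likelihood) mirrors the description above.

```java
/** Illustrative EM loop with two termination conditions. Not Netica code. */
final class EmTerminationSketch {

    /** Learned mixing weight and the number of passes that were performed. */
    static final class Result {
        final double mixingWeight;
        final int iterations;

        Result(double mixingWeight, int iterations) {
            this.mixingWeight = mixingWeight;
            this.iterations = iterations;
        }
    }

    static Result learn(boolean[] heads, double headsProbA, double headsProbB,
                        int maxIterations, double tolerance) {
        if (maxIterations <= 0) {
            throw new IllegalArgumentException("maxIterations must be greater than 0");
        }
        if (tolerance <= 0.0) {
            throw new IllegalArgumentException("tolerance must be greater than 0.0");
        }
        double weightA = 0.5;                          // initial guess for P(coin = A)
        double previous = Double.POSITIVE_INFINITY;    // score of the previous pass
        int iterations = 0;

        while (iterations < maxIterations) {           // condition 1: iteration limit
            double responsibilitySum = 0.0;
            double negLogLikelihood = 0.0;
            for (boolean h : heads) {
                double likeA = weightA * (h ? headsProbA : 1.0 - headsProbA);
                double likeB = (1.0 - weightA) * (h ? headsProbB : 1.0 - headsProbB);
                responsibilitySum += likeA / (likeA + likeB);   // E-step
                negLogLikelihood -= Math.log(likeA + likeB);
            }
            negLogLikelihood /= heads.length;          // per-case average
            weightA = responsibilitySum / heads.length;         // M-step
            iterations++;

            // condition 2: change between consecutive passes is below the tolerance
            if (Math.abs(previous - negLogLikelihood) < tolerance) {
                break;
            }
            previous = negLogLikelihood;
        }
        return new Result(weightA, iterations);
    }

    public static void main(String[] args) {
        boolean[] flips = {true, true, true, true, true, true, false, false, false, false};
        Result r = learn(flips, 0.8, 0.3, 1000, 1e-9);
        System.out.println(r.mixingWeight + " after " + r.iterations + " passes");
    }
}
```

With six heads in ten flips, the maximum-likelihood mixing weight is 0.6, and the loop stops on the tolerance condition well before the limit of 1000 passes. Lowering the limit to 3 makes the iteration cap the condition that fires first.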

Parameters:
NodeList      nodeList      The list of nodes whose experience and conditional probability tables are to be updated by learning. They must all be from the same net.
Caseset      caseset      The case set whose cases will be used for learning.
double      degree      The frequency factor to apply to each case in the case set.

Version:

Versions 2.26 and later have this method.
In the C Version of the API, this function is named LearnCPTs_bn.

See Also:

Learner		Creates the learner
setMaxIterations		Sets a learning termination parameter: the maximum number of batch iterations
setMaxTolerance		Sets a learning termination parameter: the minimum log likelihood increase
Caseset		Creates the Caseset
reviseCPTsByCaseFile		Uses a different learning algorithm (better suited if there is little missing data)
deleteTables		May want to do this before learning

Example:

See addCases

public void setMaxIterations (

int	maxIterations

) throws NeticaException

Sets the maximum number of learning-step iterations (i.e., complete passes through the data) which will be done when this learner is used, after which learning will be automatically terminated. This applies to EM_LEARNING and GRADIENT_DESCENT_LEARNING only, since they are iterative by nature. Learning by the COUNTING_LEARNING method is not affected by this method.

Learning may be terminated earlier, if it first reaches another limit, such as this learner's maximum tolerance limit (see setMaxTolerance).

maxIterations must be greater than 0. The default is 1000.

Parameters:
int maxIterations The maximum number of learning-step iterations that learnCPTs is allowed to perform.

Version:

Versions 2.26 and later have this method.
In the C Version of the API, this function is named SetLearnerMaxIters_bn.

See Also:

Learner		Creates the Learner
setMaxTolerance		Sets another termination parameter
getMaxIterations		Retrieves value
learnCPTs		Performs the learning using this parameter

public void setMaxTolerance (

double

logLikelihoodTolerance

) throws NeticaException

Sets the tolerance for the minimum change in data log likelihood between consecutive passes through the data, as a termination condition for any learning to be done by this learner. This applies to EM_LEARNING and GRADIENT_DESCENT_LEARNING only, since they are iterative by nature. Learning by the COUNTING_LEARNING method is not affected by this method.

When learning is performed, with each iteration (i.e., pass through the complete data set), the "log likelihood" of the data given the net is computed. The log likelihood is the per-case average of the negative of the logarithm of the probability of the case given the current Bayes net (structure + CPTs). When the difference between the computed log-likelihoods for two consecutive passes falls below this tolerance, the algorithm is halted. So, the closer this tolerance is to zero, the longer the algorithm may take.
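The quantity described above, and the way the tolerance affects how long learning runs, can be illustrated with a small standalone sketch. It does not use Netica: the class LogLikelihoodSketch and both of its methods are hypothetical, the per-pass scores in main are invented for illustration, and the halting rule is a plain reading of the paragraph above rather than Netica's internal logic.

```java
/** Illustrative only: per-case average negative log likelihood and a tolerance check. */
final class LogLikelihoodSketch {

    /** Averages -ln(p) over the probabilities the current net assigns to each case. */
    static double averageNegLogLikelihood(double[] caseProbabilities) {
        double sum = 0.0;
        for (double p : caseProbabilities) {
            sum -= Math.log(p);
        }
        return sum / caseProbabilities.length;
    }

    /**
     * Returns how many passes are performed before the change between two
     * consecutive per-pass scores falls below the tolerance. If that never
     * happens, every pass in the array is performed.
     */
    static int passesUntilHalt(double[] scorePerPass, double tolerance) {
        for (int pass = 1; pass < scorePerPass.length; pass++) {
            double change = Math.abs(scorePerPass[pass] - scorePerPass[pass - 1]);
            if (change < tolerance) {
                return pass + 1;   // passes are counted from 1
            }
        }
        return scorePerPass.length;
    }

    public static void main(String[] args) {
        // Two cases to which the net assigns probabilities 0.5 and 0.25.
        System.out.println(averageNegLogLikelihood(new double[] {0.5, 0.25}));

        // Invented scores for six consecutive passes through the data.
        double[] scores = {2.0, 1.5, 1.2, 1.1, 1.09, 1.0899};
        System.out.println(passesUntilHalt(scores, 0.05));   // looser tolerance halts sooner
        System.out.println(passesUntilHalt(scores, 1e-3));   // tighter tolerance runs longer
    }
}
```

For the invented scores, a tolerance of 0.05 halts after the fifth pass, while a tolerance of 1e-3 needs the sixth, which is the trade-off the paragraph above describes.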

The algorithm may terminate earlier if another termination condition is met, such as the maximum number of iterations (see setMaxIterations).

logLikelihoodTolerance must be greater than 0.0. The default is 1.0e-5.

Parameters:
double logLikelihoodTolerance The value to make this tolerance.

Version:

Versions 2.26 and later have this method.
In the C Version of the API, this function is named SetLearnerMaxTol_bn.

See Also:

setMaxIterations		Sets another termination parameter
getMaxTolerance		Retrieves value
learnCPTs		Performs the learning using this parameter
