net.sourceforge.jabm.learning
Class SoftMaxActionSelector

java.lang.Object
  extended by net.sourceforge.jabm.learning.SoftMaxActionSelector
All Implemented Interfaces:
ActionSelector

public class SoftMaxActionSelector
extends java.lang.Object
implements ActionSelector

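An implementation of the softmax action-selection policy, which chooses actions at random with probabilities that increase with their current value estimates.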

See:
Sutton, R. S., Barto, A. G., 1998. Reinforcement Learning: An Introduction. MIT Press.
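
As a rough illustration of the policy named above, the sketch below turns a set of value estimates into softmax (Boltzmann) selection probabilities. It is not taken from the JABM source: the class SoftMaxSketch and its method probabilities are hypothetical names, and whether this class computes its distribution in exactly this way is an assumption.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of JABM: softmax (Boltzmann) probabilities. */
final class SoftMaxSketch {

    /**
     * Converts value estimates into selection probabilities
     * P(a) = exp(values[a] / tau) / sum over b of exp(values[b] / tau).
     */
    static double[] probabilities(double[] values, double tau) {
        if (values.length == 0) {
            throw new IllegalArgumentException("at least one action is required");
        }
        if (!(tau > 0.0)) {
            throw new IllegalArgumentException("tau must be positive");
        }
        // Subtract the maximum before exponentiating to avoid overflow;
        // the common factor cancels, so the distribution is unchanged.
        double max = Arrays.stream(values).max().getAsDouble();
        double[] p = new double[values.length];
        double sum = 0.0;
        for (int a = 0; a < values.length; a++) {
            p[a] = Math.exp((values[a] - max) / tau);
            sum += p[a];
        }
        // Normalise so that the probabilities sum to one.
        for (int a = 0; a < p.length; a++) {
            p[a] /= sum;
        }
        return p;
    }

    public static void main(String[] args) {
        double[] q = {1.0, 2.0, 3.0};   // illustrative value estimates
        System.out.println(Arrays.toString(probabilities(q, 1.0)));
        System.out.println(Arrays.toString(probabilities(q, 100.0)));
    }
}
```

With a large tau the printed probabilities approach a uniform distribution; with a small tau nearly all of the probability mass falls on the highest-valued action.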

 

Field Summary
protected  cern.jet.random.engine.RandomEngine prng
           
protected  double tau
          The "temperature" used to modulate the propensity distribution.
 
Constructor Summary
SoftMaxActionSelector()
           
SoftMaxActionSelector(cern.jet.random.engine.RandomEngine prng, double tau)
           
 
Method Summary
 int act(int state, MDPLearner learner)
          Choose an action according to the current state and the current value estimates for each action.
 cern.jet.random.engine.RandomEngine getPrng()
           
 double getTau()
           
 void setPrng(cern.jet.random.engine.RandomEngine prng)
           
 void setTau(double tau)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

prng

protected cern.jet.random.engine.RandomEngine prng

tau

protected double tau
The "temperature" used to modulate the propensity distribution.
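
For orientation, the softmax rule in the Sutton and Barto reference cited for this class is usually written as a Gibbs (Boltzmann) distribution over the value estimates. The notation Q(s, a) for the estimate of action a in state s is an assumption made here for illustration, and the exact form used by this class should be checked against its source.

```latex
P(a \mid s) = \frac{\exp\bigl(Q(s,a)/\tau\bigr)}{\sum_{b} \exp\bigl(Q(s,b)/\tau\bigr)},
\qquad \tau > 0

% Worked example with two actions, Q(s,a_1) = 1 and Q(s,a_2) = 2:
P(a_2 \mid s) = \frac{1}{1 + e^{-1/\tau}} \approx
\begin{cases}
0.99995 & \tau = 0.1 \\
0.731   & \tau = 1 \\
0.525   & \tau = 10
\end{cases}
```

Under this form, a low temperature makes selection close to greedy, while a high temperature makes all actions nearly equally likely.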

Constructor Detail

SoftMaxActionSelector

public SoftMaxActionSelector()

SoftMaxActionSelector

public SoftMaxActionSelector(cern.jet.random.engine.RandomEngine prng,
                             double tau)

Method Detail

act

public int act(int state,
               MDPLearner learner)
Description copied from interface: ActionSelector
Choose an action according to the current state and the current value estimates for each action.

Specified by:
act in interface ActionSelector
Parameters:
state - The current state of the MDP.
learner - The algorithm used to update the value estimates.
Returns:
An integer representing the action chosen (indexed from 0).
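
One common way to turn a probability distribution over actions into a zero-based action index is inverse-CDF sampling, sketched below. This is an illustrative assumption and not the JABM implementation: DiscreteSampleSketch and sample are hypothetical names, and java.util.Random stands in for the cern.jet.random.engine.RandomEngine held by this class.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical helper, not part of JABM: draw an index from a discrete distribution. */
final class DiscreteSampleSketch {

    /** Returns index a with probability p[a]; p is assumed to sum to one. */
    static int sample(double[] p, Random rng) {
        double u = rng.nextDouble();   // uniform in [0, 1)
        double cumulative = 0.0;
        for (int a = 0; a < p.length; a++) {
            cumulative += p[a];
            if (u < cumulative) {
                return a;
            }
        }
        // Rounding can leave the cumulative sum slightly below one,
        // so fall back to the last index.
        return p.length - 1;
    }

    public static void main(String[] args) {
        Random rng = new Random(42L);       // fixed seed for repeatability
        double[] p = {0.1, 0.2, 0.7};       // illustrative probabilities
        int[] counts = new int[p.length];
        for (int i = 0; i < 10_000; i++) {
            counts[sample(p, rng)]++;
        }
        // Counts should be roughly proportional to p.
        System.out.println(Arrays.toString(counts));
    }
}
```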

getPrng

public cern.jet.random.engine.RandomEngine getPrng()

setPrng

public void setPrng(cern.jet.random.engine.RandomEngine prng)

getTau

public double getTau()
Returns:
The "temperature" used to modulate the propensities.

setTau

public void setTau(double tau)
Parameters:
tau - The "temperature" used to modulate the propensities.