net.sourceforge.jabm.learning
Class SoftMaxActionSelector
java.lang.Object
net.sourceforge.jabm.learning.SoftMaxActionSelector
- All Implemented Interfaces:
- ActionSelector
public class SoftMaxActionSelector
- extends java.lang.Object
- implements ActionSelector
An implementation of the softmax action selection policy.
See:
Sutton, R. S., Barto, A. G., 1998. Reinforcement Learning: An Introduction.
MIT Press.
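Under the softmax policy, action i is selected with probability proportional to exp(Q(i)/tau), where Q(i) is the current value estimate for action i and tau is the temperature. A minimal self-contained sketch of the technique in plain Java (independent of the JABM and COLT APIs; all names here are illustrative, not part of this class):

```java
import java.util.Random;

public class SoftmaxDemo {

    /** Pick an action index with probability proportional to exp(q[i] / tau). */
    static int softmaxSelect(double[] q, double tau, Random rng) {
        // Subtracting the maximum before exponentiating avoids overflow
        // and does not change the resulting probabilities.
        double max = Double.NEGATIVE_INFINITY;
        for (double v : q) max = Math.max(max, v);

        double[] weights = new double[q.length];
        double total = 0.0;
        for (int i = 0; i < q.length; i++) {
            weights[i] = Math.exp((q[i] - max) / tau);
            total += weights[i];
        }

        // Roulette-wheel sampling over the unnormalised weights.
        double r = rng.nextDouble() * total;
        for (int i = 0; i < q.length; i++) {
            r -= weights[i];
            if (r <= 0) return i;
        }
        return q.length - 1; // fallback for floating-point rounding
    }

    public static void main(String[] args) {
        double[] q = {1.0, 2.0, 0.5};
        int[] counts = new int[q.length];
        Random rng = new Random(42);
        for (int i = 0; i < 10000; i++) {
            counts[softmaxSelect(q, 0.5, rng)]++;
        }
        // The highest-valued action (index 1) is chosen most often,
        // but the others are still explored.
        System.out.println(counts[0] + " " + counts[1] + " " + counts[2]);
    }
}
```

Higher-valued actions dominate the sample, while lower-valued actions retain a nonzero chance of being explored.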
Field Summary
protected cern.jet.random.engine.RandomEngine prng
protected double tau
          The "temperature" used to modulate the propensity distribution.
Method Summary
int act(int state, MDPLearner learner)
          Choose an action according to the current state and the current value estimates for each action.
cern.jet.random.engine.RandomEngine getPrng()
double getTau()
void setPrng(cern.jet.random.engine.RandomEngine prng)
void setTau(double tau)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
prng
protected cern.jet.random.engine.RandomEngine prng
tau
protected double tau
- The "temperature" used to modulate the propensity distribution.
SoftMaxActionSelector
public SoftMaxActionSelector()
SoftMaxActionSelector
public SoftMaxActionSelector(cern.jet.random.engine.RandomEngine prng,
double tau)
act
public int act(int state,
MDPLearner learner)
- Description copied from interface:
ActionSelector
- Choose an action according to the current state and the
current value estimates for each action.
- Specified by:
act
in interface ActionSelector
- Parameters:
  state - The current state of the MDP.
  learner - The algorithm used to update the value estimates.
- Returns:
- An integer representing the action chosen (indexed from 0).
getPrng
public cern.jet.random.engine.RandomEngine getPrng()
setPrng
public void setPrng(cern.jet.random.engine.RandomEngine prng)
getTau
public double getTau()
setTau
public void setTau(double tau)
- Parameters:
tau
- The "temperature" used to modulate the propensities.
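The temperature tau controls the exploration/exploitation trade-off: a large tau flattens the distribution toward uniform random choice, while tau near zero approaches greedy selection of the highest-valued action. A hypothetical demonstration in plain Java (not part of the JABM API) that computes the softmax probabilities at two temperatures:

```java
public class TauEffect {

    /** Return the softmax probability distribution over q at temperature tau. */
    static double[] softmax(double[] q, double tau) {
        double[] p = new double[q.length];
        double sum = 0.0;
        for (int i = 0; i < q.length; i++) {
            p[i] = Math.exp(q[i] / tau);
            sum += p[i];
        }
        for (int i = 0; i < q.length; i++) p[i] /= sum;
        return p;
    }

    public static void main(String[] args) {
        double[] q = {1.0, 2.0};
        // High temperature: probabilities close to uniform (each near 0.5).
        System.out.printf("tau=100: p(a1)=%.3f%n", softmax(q, 100.0)[1]);
        // Low temperature: nearly greedy (p(a1) close to 1.0).
        System.out.printf("tau=0.1: p(a1)=%.3f%n", softmax(q, 0.1)[1]);
    }
}
```

This is why tau is described as "modulating" the propensity distribution: it interpolates between pure exploration and pure exploitation without changing the underlying value estimates.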