net.sourceforge.jabm.learning
Class SoftMaxActionSelector
java.lang.Object
net.sourceforge.jabm.learning.SoftMaxActionSelector
- All Implemented Interfaces:
- ActionSelector
public class SoftMaxActionSelector
- extends java.lang.Object
- implements ActionSelector
An implementation of the softmax action selection policy.
See:
Sutton, R. S., Barto, A. G., 1998. Reinforcement Learning: An Introduction.
MIT Press.
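Under the softmax policy, action i is selected with probability proportional to exp(Q(i)/tau), where Q(i) is the current value estimate for action i and tau is the temperature. A minimal self-contained sketch of the technique in plain Java (independent of the JABM and COLT APIs; all names here are illustrative, not part of this class):

```java
import java.util.Random;

public class SoftmaxDemo {

    /** Pick an action index with probability proportional to exp(q[i] / tau). */
    static int softmaxSelect(double[] q, double tau, Random rng) {
        // Subtracting the maximum before exponentiating avoids overflow
        // and does not change the resulting probabilities.
        double max = Double.NEGATIVE_INFINITY;
        for (double v : q) max = Math.max(max, v);

        double[] weights = new double[q.length];
        double total = 0.0;
        for (int i = 0; i < q.length; i++) {
            weights[i] = Math.exp((q[i] - max) / tau);
            total += weights[i];
        }

        // Roulette-wheel sampling over the unnormalised weights.
        double r = rng.nextDouble() * total;
        for (int i = 0; i < q.length; i++) {
            r -= weights[i];
            if (r <= 0) return i;
        }
        return q.length - 1; // fallback for floating-point rounding
    }

    public static void main(String[] args) {
        double[] q = {1.0, 2.0, 0.5};
        int[] counts = new int[q.length];
        Random rng = new Random(42);
        for (int i = 0; i < 10000; i++) {
            counts[softmaxSelect(q, 0.5, rng)]++;
        }
        // The highest-valued action (index 1) is chosen most often,
        // but the others are still explored.
        System.out.println(counts[0] + " " + counts[1] + " " + counts[2]);
    }
}
```

Higher-valued actions dominate the sample, while lower-valued actions retain a nonzero chance of being explored.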
Field Summary
protected cern.jet.random.engine.RandomEngine prng
protected double tau
          The "temperature" used to modulate the propensity distribution.
Method Summary
int act(int state, MDPLearner learner)
          Choose an action according to the current state and the current value estimates for each action.
cern.jet.random.engine.RandomEngine getPrng()
double getTau()
void setPrng(cern.jet.random.engine.RandomEngine prng)
void setTau(double tau)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
prng
protected cern.jet.random.engine.RandomEngine prng
tau
protected double tau
- The "temperature" used to modulate the propensity distribution.
SoftMaxActionSelector
public SoftMaxActionSelector()
SoftMaxActionSelector
public SoftMaxActionSelector(cern.jet.random.engine.RandomEngine prng,
double tau)
act
public int act(int state,
MDPLearner learner)
- Description copied from interface:
ActionSelector
- Choose an action according to the current state and the
current value estimates for each action.
- Specified by:
act
in interface ActionSelector
- Parameters:
  state - The current state of the MDP.
  learner - The algorithm used to update the value estimates.
- Returns:
- An integer representing the action chosen (indexed from 0).
getPrng
public cern.jet.random.engine.RandomEngine getPrng()
setPrng
public void setPrng(cern.jet.random.engine.RandomEngine prng)
getTau
public double getTau()
setTau
public void setTau(double tau)
- Parameters:
tau
- The "temperature" used to modulate the propensities.
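The temperature tau controls the exploration/exploitation trade-off: a large tau flattens the distribution toward uniform random choice, while tau near zero approaches greedy selection of the highest-valued action. A hypothetical demonstration in plain Java (not part of the JABM API) that computes the softmax probabilities at two temperatures:

```java
public class TauEffect {

    /** Return the softmax probability distribution over q at temperature tau. */
    static double[] softmax(double[] q, double tau) {
        double[] p = new double[q.length];
        double sum = 0.0;
        for (int i = 0; i < q.length; i++) {
            p[i] = Math.exp(q[i] / tau);
            sum += p[i];
        }
        for (int i = 0; i < q.length; i++) p[i] /= sum;
        return p;
    }

    public static void main(String[] args) {
        double[] q = {1.0, 2.0};
        // High temperature: probabilities close to uniform (each near 0.5).
        System.out.printf("tau=100: p(a1)=%.3f%n", softmax(q, 100.0)[1]);
        // Low temperature: nearly greedy (p(a1) close to 1.0).
        System.out.printf("tau=0.1: p(a1)=%.3f%n", softmax(q, 0.1)[1]);
    }
}
```

This is why tau is described as "modulating" the propensity distribution: it interpolates between pure exploration and pure exploitation without changing the underlying value estimates.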