net.sourceforge.jabm.learning
Class QLearner

java.lang.Object
  extended by net.sourceforge.jabm.learning.AbstractLearner
      extended by net.sourceforge.jabm.learning.QLearner
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, DiscreteLearner, Learner, MDPLearner, Prototypeable, Resetable, org.springframework.beans.factory.InitializingBean

public class QLearner
extends AbstractLearner
implements MDPLearner, Resetable, org.springframework.beans.factory.InitializingBean, java.io.Serializable, Prototypeable

An implementation of the Q-learning algorithm. This algorithm is described in Watkins, C. J. C. H., Dayan, P., 1992. Q-learning. Machine Learning 8, 279-292.
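The rule applied after each observed transition (s, a, r, s') is the standard Watkins update from the cited paper, where α corresponds to the learningRate field and γ to the discountRate field:

```latex
Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
```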

See Also:
Serialized Form
 

Field Summary
protected  ActionSelector actionSelector
          The component used to select an action on the basis of the current Q values.
protected  int bestAction
          The best action for the current state.
protected  int currentState
          The current state.
protected  double discountRate
          The discount rate for future payoffs.
protected  double initialQValue
          The initial value assigned to every entry of the Q matrix.
protected  int lastActionChosen
          The last action that was chosen.
protected  double learningRate
          The learning rate.
protected  int numActions
          The number of possible actions.
protected  int numStates
          The number of possible states.
protected  int previousState
          The previous state.
protected  cern.jet.random.engine.RandomEngine prng
          The pseudo-random number generator used for stochastic action selection.
protected  double[][] q
          The matrix representing the estimated payoff of each possible action in each possible state.
 
Fields inherited from class net.sourceforge.jabm.learning.AbstractLearner
monitor
 
Constructor Summary
QLearner()
           
QLearner(int numStates, int numActions, double learningRate, double discountRate, cern.jet.random.engine.RandomEngine prng)
           
QLearner(cern.jet.random.engine.RandomEngine prng)
           
 
Method Summary
 int act()
          Request that the learner perform an action.
 void afterPropertiesSet()
           
 int bestAction(int state)
           
 void dumpState(DataWriter out)
          Write out our state data to the specified data writer.
 ActionSelector getActionSelector()
           
 double getDiscountRate()
           
 double getInitialQValue()
           
 int getLastActionChosen()
           
 double getLearningDelta()
          Return a value indicative of the amount of learning that occurred during the last iteration.
 double getLearningRate()
           
 int getNumberOfActions()
          Get the number of different possible actions this learner can choose from when it performs an action.
 int getNumberOfStates()
           
 int getPreviousState()
           
 cern.jet.random.engine.RandomEngine getPrng()
           
 int getState()
           
 double getValueEstimate(int action)
           
 double[] getValueEstimates(int state)
           
 void initialise()
           
 double maxQ(int newState)
           
 void newState(double reward, int newState)
          The call-back after performing an action.
 java.lang.Object protoClone()
           
 void reset()
          Reinitialise our state to the original settings.
 void setActionSelector(ActionSelector actionSelector)
           
 void setDiscountRate(double discountRate)
           
 void setInitialQValue(double initialQValue)
           
 void setLearningRate(double learningRate)
           
 void setNumberOfActions(int numActions)
           
 void setNumberOfStates(int numStates)
           
 void setPrng(cern.jet.random.engine.RandomEngine prng)
           
 void setState(int newState)
           
 void setStatesAndActions(int numStates, int numActions)
           
 java.lang.String toString()
           
protected  void updateQ(double reward, int newState)
           
 int worstAction(int state)
           
 
Methods inherited from class net.sourceforge.jabm.learning.AbstractLearner
monitor
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface net.sourceforge.jabm.learning.Learner
monitor
 

Field Detail

numStates

protected int numStates
The number of possible states.


numActions

protected int numActions
The number of possible actions.


q

protected double[][] q
The matrix representing the estimated payoff of each possible action in each possible state.


learningRate

protected double learningRate
The learning rate.


discountRate

protected double discountRate
The discount rate for future payoffs.


previousState

protected int previousState
The previous state.


currentState

protected int currentState
The current state.


lastActionChosen

protected int lastActionChosen
The last action that was chosen.


bestAction

protected int bestAction
The best action for the current state.


prng

protected cern.jet.random.engine.RandomEngine prng
The pseudo-random number generator used for stochastic action selection.


actionSelector

protected ActionSelector actionSelector
The component used to select an action on the basis of the current Q values.


initialQValue

protected double initialQValue
The initial value assigned to every entry of the Q matrix.
Constructor Detail

QLearner

public QLearner(int numStates,
                int numActions,
                double learningRate,
                double discountRate,
                cern.jet.random.engine.RandomEngine prng)

QLearner

public QLearner(cern.jet.random.engine.RandomEngine prng)

QLearner

public QLearner()
Method Detail

protoClone

public java.lang.Object protoClone()
Specified by:
protoClone in interface Prototypeable

initialise

public void initialise()

setStatesAndActions

public void setStatesAndActions(int numStates,
                                int numActions)

setState

public void setState(int newState)

getState

public int getState()

act

public int act()
Description copied from interface: DiscreteLearner
Request that the learner perform an action. Users of the learning algorithm should invoke this method on the learner when they wish to find out which action the learner is currently recommending.

Specified by:
act in interface DiscreteLearner
Returns:
An integer representing the action to be taken.

newState

public void newState(double reward,
                     int newState)
Description copied from interface: MDPLearner
The call-back after performing an action.

Specified by:
newState in interface MDPLearner
Parameters:
reward - The reward received from taking the most recently-selected action.
newState - The new state encountered after taking the most recently-selected action.

updateQ

protected void updateQ(double reward,
                       int newState)
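The act()/newState() protocol above, and the Watkins update that updateQ applies, can be sketched as a minimal, self-contained Java illustration. Method and field names mirror the documented API, but this is not the library's implementation: the epsilon-greedy selection here is an assumption standing in for the library's pluggable ActionSelector.

```java
import java.util.Random;

// Minimal sketch of the act()/newState() call-back protocol and the
// Watkins Q-learning update performed by updateQ(). Names mirror the
// documented QLearner API; the epsilon-greedy choice in act() is an
// assumption for this sketch (the library delegates to an ActionSelector).
class QLearnerSketch {

    protected double[][] q;          // q[state][action]: estimated payoffs
    protected double learningRate;   // alpha in the Watkins update
    protected double discountRate;   // gamma in the Watkins update
    protected int currentState;
    protected int lastActionChosen;
    protected Random prng;
    protected double epsilon = 0.1;  // exploration probability (assumption)

    public QLearnerSketch(int numStates, int numActions,
                          double learningRate, double discountRate,
                          Random prng) {
        this.q = new double[numStates][numActions];
        this.learningRate = learningRate;
        this.discountRate = discountRate;
        this.prng = prng;
    }

    public void setState(int newState) {
        this.currentState = newState;
    }

    // Greedy action for the given state (ties broken by lowest index).
    public int bestAction(int state) {
        int best = 0;
        for (int a = 1; a < q[state].length; a++) {
            if (q[state][a] > q[state][best]) {
                best = a;
            }
        }
        return best;
    }

    // Highest estimated payoff available in the given state.
    public double maxQ(int newState) {
        return q[newState][bestAction(newState)];
    }

    // Choose (and remember) an action for the current state.
    public int act() {
        lastActionChosen = prng.nextDouble() < epsilon
                ? prng.nextInt(q[currentState].length)
                : bestAction(currentState);
        return lastActionChosen;
    }

    // Watkins update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    protected void updateQ(double reward, int newState) {
        double target = reward + discountRate * maxQ(newState);
        q[currentState][lastActionChosen] +=
                learningRate * (target - q[currentState][lastActionChosen]);
    }

    // Call-back after acting: learn from the reward, then enter the new state.
    public void newState(double reward, int newState) {
        updateQ(reward, newState);
        currentState = newState;
    }
}
```

A caller alternates act() and newState(reward, s') each time step; on a two-armed bandit where only action 1 pays off, bestAction converges to 1.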

maxQ

public double maxQ(int newState)

worstAction

public int worstAction(int state)

bestAction

public int bestAction(int state)
Specified by:
bestAction in interface MDPLearner

reset

public void reset()
Description copied from interface: Resetable
Reinitialise our state to the original settings.

Specified by:
reset in interface Resetable

setDiscountRate

public void setDiscountRate(double discountRate)

getDiscountRate

public double getDiscountRate()

getLastActionChosen

public int getLastActionChosen()

getLearningDelta

public double getLearningDelta()
Description copied from interface: Learner
Return a value indicative of the amount of learning that occurred during the last iteration. Values close to 0.0 indicate that the learner has converged to an equilibrium state.

Specified by:
getLearningDelta in interface Learner
Specified by:
getLearningDelta in class AbstractLearner
Returns:
A double representing the amount of learning that occurred.

dumpState

public void dumpState(DataWriter out)
Description copied from interface: Learner
Write out our state data to the specified data writer.

Specified by:
dumpState in interface Learner
Specified by:
dumpState in class AbstractLearner

getNumberOfActions

public int getNumberOfActions()
Description copied from interface: DiscreteLearner
Get the number of different possible actions this learner can choose from when it performs an action.

Specified by:
getNumberOfActions in interface DiscreteLearner
Specified by:
getNumberOfActions in interface MDPLearner
Returns:
An integer value representing the number of actions available.

getLearningRate

public double getLearningRate()

setLearningRate

public void setLearningRate(double learningRate)

getNumberOfStates

public int getNumberOfStates()
Specified by:
getNumberOfStates in interface MDPLearner

setNumberOfStates

public void setNumberOfStates(int numStates)

setNumberOfActions

public void setNumberOfActions(int numActions)

getPreviousState

public int getPreviousState()

getPrng

public cern.jet.random.engine.RandomEngine getPrng()

setPrng

public void setPrng(cern.jet.random.engine.RandomEngine prng)

getActionSelector

public ActionSelector getActionSelector()

setActionSelector

public void setActionSelector(ActionSelector actionSelector)

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object

getValueEstimate

public double getValueEstimate(int action)

setInitialQValue

public void setInitialQValue(double initialQValue)

getInitialQValue

public double getInitialQValue()

getValueEstimates

public double[] getValueEstimates(int state)
Specified by:
getValueEstimates in interface MDPLearner
Parameters:
state - The current state of the MDP.
Returns:
An array representing the Q values indexed by action.

afterPropertiesSet

public void afterPropertiesSet()
                        throws java.lang.Exception
Specified by:
afterPropertiesSet in interface org.springframework.beans.factory.InitializingBean
Throws:
java.lang.Exception