rlgt.agents.random_action_mechanisms.ExponentialRandomActionMechanism

class documentation

class ExponentialRandomActionMechanism(RandomActionMechanism):

Constructor: ExponentialRandomActionMechanism(initial_random_action_probability, waiting_period, multiplicative_factor, maximum_random_action_probability)

This class inherits from the RandomActionMechanism class and represents a random action mechanism with an exponential-style adaptation rule. An initial random action probability is specified at construction time. If the best score does not improve for a prescribed number of consecutive iterations, then the random action probability is increased multiplicatively, up to a fixed maximum threshold. Whenever a strict improvement in the best score is observed, the random action probability is reset to its initial value and the adaptation process restarts.
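The adaptation rule described above can be illustrated with a minimal, self-contained sketch. `SketchExponentialMechanism` is a hypothetical stand-in written for this example, not the rlgt implementation. It assumes that larger scores are better, that an increase is capped with `min`, and that the counter restarts after each increase.

```python
class SketchExponentialMechanism:
    """Hypothetical stand-in illustrating the documented rule (not rlgt code)."""

    def __init__(self, initial, waiting_period, factor, maximum):
        self.initial = initial
        self.waiting_period = waiting_period
        self.factor = factor
        self.maximum = maximum
        self.reset()

    def reset(self):
        # Return to the initial probability and restart the stagnation counter.
        self.probability = self.initial
        self.counter = 0

    def step(self, previous_best_score, current_best_score):
        # Assumption: a strict improvement means a strictly larger score.
        if current_best_score > previous_best_score:
            self.reset()
            return
        self.counter += 1
        if self.counter >= self.waiting_period:
            # Multiplicative increase, capped at the maximum (assumed via min).
            self.probability = min(self.probability * self.factor, self.maximum)
            self.counter = 0


mechanism = SketchExponentialMechanism(0.05, 3, 2.0, 0.5)
trace = []
for _ in range(6):
    mechanism.step(10.0, 10.0)  # six iterations without improvement
    trace.append(mechanism.probability)
mechanism.step(10.0, 11.0)      # strict improvement resets the probability
trace.append(mechanism.probability)
# trace == [0.05, 0.05, 0.1, 0.1, 0.1, 0.2, 0.05]
```

In this sketch the probability doubles after every third stagnant iteration and falls back to 0.05 as soon as the best score improves.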

Method	`__init__`	This constructor initializes the random action mechanism with the parameters governing its exponential adaptation behavior.
Method	`reset`	Implements the abstract `reset` method of `RandomActionMechanism`. It is invoked by an RL agent during the initialization process and initializes or resets all internal state maintained by the random action mechanism.
Method	`step`	Implements the abstract `step` method of `RandomActionMechanism`. It is invoked by an RL agent at the end of each iteration of the learning process and updates the internal state of the random action mechanism based on the previous best score and the current best score.
Property	`random_action_probability`	Implements the abstract `random_action_probability` property of `RandomActionMechanism`. It returns the current random action probability as a `float` value from the interval [0, 1].
Instance Variable	`__counter`	A nonnegative `int` that counts the number of iterations since the last improvement in the best score or the last update of the random action probability.
Instance Variable	`__initial_random_action_probability`	A `float` from the interval [0, 1] that represents the initial random action probability.
Instance Variable	`__maximum_random_action_probability`	A `float` from the interval [0, 1] that represents the maximum allowable value of the random action probability.
Instance Variable	`__multiplicative_factor`	A `float` greater than 1 that specifies the factor by which the random action probability is multiplied when an increase is triggered.
Instance Variable	`__random_action_probability`	A `float` from the interval [0, 1] that represents the current random action probability.
Instance Variable	`__waiting_period`	A positive `int` that specifies how many consecutive iterations without an improvement in the best score are required before the random action probability is increased.

def __init__(self, initial_random_action_probability: float, waiting_period: int, multiplicative_factor: float, maximum_random_action_probability: float):

This constructor initializes the random action mechanism with the parameters governing its exponential adaptation behavior.

Parameters
initial_random_action_probability:`float`	The initial random action probability, given as a `float` from the interval [0, 1].
waiting_period:`int`	The number of consecutive iterations without an improvement in the best score that are required before the random action probability is increased, given as a positive `int`.
multiplicative_factor:`float`	The multiplicative factor applied to the random action probability when an increase is triggered, given as a `float` greater than 1.
maximum_random_action_probability:`float`	The maximum allowable value of the random action probability, given as a `float` from the interval [0, 1].
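The documented parameter constraints can be checked before constructing the mechanism. The following is only a sketch: `validate_parameters` and `is_valid` are hypothetical helpers, and whether the library performs any such validation itself is not stated here.

```python
def validate_parameters(initial_random_action_probability, waiting_period,
                        multiplicative_factor, maximum_random_action_probability):
    """Hypothetical helper: raise ValueError if a documented constraint fails."""
    if not 0.0 <= initial_random_action_probability <= 1.0:
        raise ValueError("initial_random_action_probability must lie in [0, 1]")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if (not isinstance(waiting_period, int) or isinstance(waiting_period, bool)
            or waiting_period < 1):
        raise ValueError("waiting_period must be a positive int")
    if not multiplicative_factor > 1.0:
        raise ValueError("multiplicative_factor must be greater than 1")
    if not 0.0 <= maximum_random_action_probability <= 1.0:
        raise ValueError("maximum_random_action_probability must lie in [0, 1]")


def is_valid(*parameters):
    """Return True if the four parameters satisfy the documented constraints."""
    try:
        validate_parameters(*parameters)
    except ValueError:
        return False
    return True
```

For instance, `is_valid(0.05, 3, 2.0, 0.5)` holds, while a `multiplicative_factor` of exactly 1.0 is rejected because the factor must be strictly greater than 1.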

def reset(self):

overrides rlgt.agents.random_action_mechanisms.RandomActionMechanism.reset

This method implements the abstract `reset` method of `RandomActionMechanism`. It is invoked by an RL agent during the initialization process and initializes or resets all internal state maintained by the random action mechanism.

def step(self, previous_best_score: float, current_best_score: float):

overrides rlgt.agents.random_action_mechanisms.RandomActionMechanism.step

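This method implements the abstract `step` method of `RandomActionMechanism`. It is invoked by an RL agent at the end of each iteration of the learning process and updates the internal state of the random action mechanism based on the previous best score and the current best score. If the current best score is a strict improvement, the random action probability is reset to its initial value. Otherwise, once the best score has failed to improve for the prescribed number of consecutive iterations, the random action probability is multiplied by the multiplicative factor, up to the maximum threshold.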

Parameters
previous_best_score:`float`	The value of the best score before the current iteration, given as a `float`.
current_best_score:`float`	The value of the best score after the current iteration, given as a `float`.
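As a rough worked example of the adaptation rule, one can count how many stagnant iterations pass before the probability reaches its maximum. Both functions below are hypothetical helpers for illustration. They assume that each increase is capped with `min` and that the counter restarts after every increase, so each increase costs one full waiting period.

```python
def increases_until_cap(initial, factor, maximum):
    """Count multiplicative increases needed to reach the maximum.

    Returns None when the initial probability is 0, because multiplying 0
    by any factor leaves it at 0 and the maximum is never reached.
    """
    if initial <= 0.0:
        return None
    count = 0
    probability = initial
    while probability < maximum:
        probability = min(probability * factor, maximum)
        count += 1
    return count


def stagnant_iterations_until_cap(initial, waiting_period, factor, maximum):
    """Stagnant iterations before the cap is hit (assumes counter restarts)."""
    increases = increases_until_cap(initial, factor, maximum)
    return None if increases is None else increases * waiting_period
```

With an initial probability of 0.05, a factor of 2 and a maximum of 0.5, the probability moves through 0.1, 0.2, 0.4 and 0.5, which is four increases. With a waiting period of 3, that corresponds to 12 consecutive iterations without improvement under the stated assumptions.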

@property
random_action_probability: float

overrides rlgt.agents.random_action_mechanisms.RandomActionMechanism.random_action_probability

This property implements the abstract `random_action_probability` property of `RandomActionMechanism`. It returns the current random action probability as a `float` value from the interval [0, 1].
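How an agent consumes this value is not specified on this page. One plausible pattern, shown purely as an assumption, is to take a uniformly random action with the given probability and the agent's preferred action otherwise. `choose_action` and its arguments are hypothetical names.

```python
import random


def choose_action(random_action_probability, preferred_action, actions, rng):
    """Hypothetical helper: pick a random action with the given probability."""
    # rng.random() lies in [0, 1), so probability 0 never explores
    # and probability 1 always explores.
    if rng.random() < random_action_probability:
        return rng.choice(actions)
    return preferred_action


rng = random.Random(0)
never_random = [choose_action(0.0, "keep", ["a", "b"], rng) for _ in range(20)]
always_random = [choose_action(1.0, "keep", ["a", "b"], rng) for _ in range(20)]
```

Here `never_random` contains only the preferred action, while `always_random` contains only actions drawn from the list.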

__counter: int

A nonnegative `int` that counts the number of iterations since the last improvement in the best score or the last update of the random action probability.

__initial_random_action_probability: float

A `float` from the interval [0, 1] that represents the initial random action probability.

__maximum_random_action_probability: float

A `float` from the interval [0, 1] that represents the maximum allowable value of the random action probability.

__multiplicative_factor: float

A `float` greater than 1 that specifies the factor by which the random action probability is multiplied when an increase is triggered.

__random_action_probability: float

A `float` from the interval [0, 1] that represents the current random action probability.

__waiting_period: int

A positive `int` that specifies how many consecutive iterations without an improvement in the best score are required before the random action probability is increased.