Class documentation

This abstract class encapsulates the concept of a reinforcement learning agent for graph theory applications. Such an agent is intended to solve extremal problems in which a given graph invariant is to be maximized over a finite family of fully colored, k-edge-colored looped complete graphs, that is, complete graphs with loops in which every edge, loops included, is assigned one of k colors. The agent interacts iteratively with a reinforcement learning environment. The environment encodes the extremal problem, including the state representation, the available actions, and the transition dynamics, while the agent is responsible for steering the learning process through successive decisions.

Any concrete subclass must implement the following abstract methods:

  1. reset, which initializes or re-initializes the agent and prepares it to start the learning process; and
  2. step, which performs a single iteration of the learning process.

In addition, any concrete subclass must implement the following abstract properties:

  1. step_count, which returns the number of executed learning iterations;
  2. best_score, which returns the best value of the target graph invariant achieved so far; and
  3. best_graph, which returns a graph attaining that best value.
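
Taken together, these requirements can be summarized by the following skeleton. This is a minimal sketch, not a reference implementation: the class name Agent is illustrative, and the Graph type mentioned in the docstrings is assumed to be provided elsewhere by the library.

from abc import ABC, abstractmethod

class Agent(ABC):
    """Abstract RL agent for extremal problems on k-edge-colored graphs.

    Minimal sketch; the name Agent is illustrative, not taken from the
    documented library.
    """

    @abstractmethod
    def reset(self):
        """Initialize or re-initialize the agent for a fresh learning run."""

    @abstractmethod
    def step(self):
        """Perform a single iteration of the learning process."""

    @property
    @abstractmethod
    def step_count(self):
        """Number of learning iterations executed so far, or None."""

    @property
    @abstractmethod
    def best_score(self):
        """Best invariant value achieved so far; -inf or None as applicable."""

    @property
    @abstractmethod
    def best_graph(self):
        """A graph attaining best_score, or None before the first iteration."""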
def reset(self):

This abstract method must be implemented by any concrete subclass. It must initialize the agent and prepare it to begin the learning process. If the agent has been used previously, invoking this method must reset all internal state so that the learning restarts from scratch.
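
The reset contract can be made concrete with a small check. The helper below is hypothetical, not part of the documented class; it relies on the post-conditions stated by the step_count, best_score, and best_graph properties further down to verify that a reset truly discards all previous progress.

def check_reset_contract(agent):
    """Hypothetical helper: verify that reset() restarts learning from scratch.

    Uses the property contracts documented below: after reset(),
    step_count is 0, best_score is -inf, and best_graph is None.
    """
    agent.reset()
    agent.step()                    # make some progress
    agent.reset()                   # must discard that progress entirely
    assert agent.step_count == 0
    assert agent.best_score == float("-inf")
    assert agent.best_graph is None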

def step(self):

This abstract method must be implemented by any concrete subclass. It must perform a single iteration of the learning process, which may involve one or more interactions between the agent and the environment. This iteration should update the agent's internal state and improve its policy or decision-making strategy based on the observed outcomes.
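
As an illustration of how reset and step cooperate, here is one possible concrete subclass based on random search, continuing the Agent sketch above. It is a sketch under assumptions: the environment object and its reset, actions, step, score, and graph methods are hypothetical stand-ins for whatever interface the actual environment class exposes.

import random

class RandomSearchAgent(Agent):
    """Hypothetical concrete agent that explores episodes at random."""

    def __init__(self, env):
        self._env = env            # assumed environment; its API is hypothetical
        self._steps = None         # None until reset() initializes the agent
        self._best_score = None
        self._best_graph = None

    def reset(self):
        # Restart the learning process from scratch, per the reset contract.
        self._steps = 0
        self._best_score = float("-inf")
        self._best_graph = None

    def step(self):
        # One learning iteration: play one episode with uniformly random actions.
        state = self._env.reset()                             # assumed API
        done = False
        while not done:
            action = random.choice(self._env.actions(state))  # assumed API
            state, done = self._env.step(action)              # assumed API
        score = self._env.score(state)                        # assumed API
        if score > self._best_score:
            self._best_score = score
            self._best_graph = self._env.graph(state)         # assumed API
        self._steps += 1

    # Property contracts are documented below; trivial getters suffice here.
    @property
    def step_count(self):
        return self._steps

    @property
    def best_score(self):
        return self._best_score

    @property
    def best_graph(self):
        return self._best_graph

Note that pure random search performs no policy improvement; an actual subclass would also update its decision-making strategy inside step, as the contract above requires.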

@property
def best_graph(self):

This abstract property must be implemented by any concrete subclass. It must return a graph attaining the best value of the target graph invariant achieved so far. If at least one learning iteration has been executed, the result must be returned as a Graph object. Otherwise, if no iterations have been executed or the agent has not been initialized, the value None must be returned.

@property
def best_score(self):

This abstract property must be implemented by any concrete subclass. It must return the best value of the target graph invariant achieved so far. If at least one learning iteration has been executed, the value must be returned as a float. If the agent has been initialized but no iterations have yet been executed, the value −∞ (that is, float("-inf")) must be returned. If the agent has not been initialized, the value None must be returned.

@property
def step_count(self):

This abstract property must be implemented by any concrete subclass. It must return the number of learning iterations executed so far. If the agent has been initialized, the returned value must be a nonnegative int. If the agent has not yet been initialized, the value None must be returned.
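
The three property contracts can be seen together in the lifecycle of a concrete agent. The following usage sketch assumes the hypothetical RandomSearchAgent above and an environment instance env; it is illustrative, not output from the actual library.

# Before initialization: all three properties must return None.
agent = RandomSearchAgent(env)       # env is an assumed environment object
assert agent.step_count is None
assert agent.best_score is None
assert agent.best_graph is None

# After initialization but before any iteration.
agent.reset()
assert agent.step_count == 0
assert agent.best_score == float("-inf")
assert agent.best_graph is None

# After some iterations: a nonnegative int, a float, and a Graph object.
for _ in range(100):
    agent.step()
assert agent.step_count == 100
best = agent.best_graph              # a graph attaining agent.best_score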