class documentation

This class inherits from the GraphEnvironment class and models a graph-building game in which the edges (resp. arcs) are initially fully colored in some manner, and an agent moves between vertices, traversing existing edges (resp. arcs) and properly recoloring them. More precisely, at each step the agent is located at a vertex, selects an edge incident to this vertex (resp. an arc starting at this vertex), traverses it, and moves to the other endpoint. During traversal, the selected edge (resp. arc) is properly recolored with a chosen color. The user can select the graph order, the number of proper edge colors, whether the graphs are directed or undirected, and whether loops are allowed. The mechanism that generates the initial fully colored graphs is also configurable and may be deterministic or nondeterministic, as is the vertex at which the agent starts the recoloring procedure.

The RL tasks in this environment are continuing, and the total number of actions to be performed, i.e., the episode length, is configurable.

Each state is represented by a binary numpy.ndarray of type numpy.uint8 and length (edge_colors - 1) * flattened_length + graph_order, where edge_colors is the configured number of proper edge colors, graph_order is the configured graph order, and flattened_length is the flattened length of the graphs to be constructed. In the state vectors, the first flattened_length bits indicate which edges (resp. arcs) are currently of color 1, the next flattened_length bits indicate which edges (resp. arcs) are currently of color 2, and this pattern continues up to color edge_colors - 1. The ordering of edges (resp. arcs) within these blocks is determined by the selected flattened ordering, which can be either row-major or clockwise as specified by the FlattenedOrdering enumeration. The final graph_order bits form a one-hot encoding that specifies the vertex at which the agent is currently located.
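
The layout described above can be illustrated with a small sketch. The helper encode_state and its input edge_color_of (one color per edge, already listed in the chosen flattened ordering) are hypothetical names introduced only for this illustration; they are not part of the class.

```python
import numpy as np


def encode_state(edge_color_of: np.ndarray, edge_colors: int,
                 graph_order: int, current_vertex: int) -> np.ndarray:
    """Build one state vector from per-edge colors given in flattened order.

    Hypothetical helper: it mirrors the documented layout, not the library code.
    """
    flattened_length = edge_color_of.shape[0]
    offset = (edge_colors - 1) * flattened_length
    state = np.zeros(offset + graph_order, dtype=np.uint8)
    # One block of flattened_length bits per color 1, ..., edge_colors - 1.
    # Color 0 has no block: an edge of color 0 is zero in every block.
    for color in range(1, edge_colors):
        start = (color - 1) * flattened_length
        state[start:start + flattened_length] = edge_color_of == color
    # The final graph_order bits are a one-hot encoding of the agent's vertex.
    state[offset + current_vertex] = 1
    return state


# Assumed setting: 3 flattened positions, 3 colors, agent at vertex 1.
colors = np.array([0, 2, 1])
state = encode_state(colors, edge_colors=3, graph_order=3, current_vertex=1)
print(state)  # [0 0 1 0 1 0 0 1 0]
```

Here the first three bits mark the edges of color 1, the next three mark the edges of color 2, and the last three locate the agent.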

Each action is represented by a numpy.int32 integer between 0 and edge_colors * graph_order - 1. If the action value is a, then a % graph_order determines the vertex to which the agent should move from its current location, while a // graph_order determines the color with which the traversed edge (resp. arc) is properly recolored. If loops are not allowed, then actions that would keep the agent at the current vertex are invalid.
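
As a minimal sketch of this encoding, the following helpers convert between an action value and the pair (target vertex, color). The function names decode_action and encode_action are hypothetical and serve only to illustrate the arithmetic.

```python
def decode_action(action: int, graph_order: int, edge_colors: int) -> tuple[int, int]:
    """Split an action value into (target_vertex, color)."""
    if not 0 <= action < edge_colors * graph_order:
        raise ValueError("action is outside the valid range")
    color, target_vertex = divmod(action, graph_order)
    return target_vertex, color


def encode_action(target_vertex: int, color: int, graph_order: int) -> int:
    """Inverse of decode_action."""
    return color * graph_order + target_vertex


# With graph_order = 4 and edge_colors = 3, action 9 means:
# move to vertex 9 % 4 = 1 and recolor with color 9 // 4 = 2.
print(decode_action(9, graph_order=4, edge_colors=3))  # (1, 2)
print(encode_action(1, 2, graph_order=4))  # 9
```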

Method __init__ This constructor initializes an instance of the LocalSetEnvironment class.
Method episode_length.setter This setter allows the user to reconfigure the episode length between two independent batches of episodes. It should not be used while a batch of episodes is in progress.
Method state_batch_to_graph_batch This abstract method must be implemented by any concrete subclass. It extracts the batch of underlying graphs corresponding to a provided batch of states. Implementations must return a Graph object containing the graphs corresponding to each row in ...
Instance Variable initial_graph_generator A GraphGenerator function that defines how the underlying fully colored graphs are generated for the initial states. This attribute may be reconfigured between independent batches of episodes.
Instance Variable starting_vertex A nonnegative int below the configured graph order that determines the vertex at which the agent should start the recoloring procedure. This attribute may be reconfigured between independent batches of episodes.
Property action_mask This abstract property must be implemented by any concrete subclass. It must return None if no episodes are currently being run in parallel, or if every action is available in every current state. Otherwise, it must return a two-dimensional ...
Property action_number This abstract property must be implemented by any concrete subclass. It must return the total number of distinct actions that can be executed in the environment, as a positive int.
Property episode_length This abstract property must be implemented by any concrete subclass. It must return the predetermined common length of all episodes run in parallel, i.e., the total number of actions executed in each episode, as a positive ...
Property is_continuing This abstract property must be implemented by any concrete subclass. It must return a bool indicating whether the environment is continuing (True) or episodic (False).
Property state_dtype This abstract property must be implemented by any concrete subclass. It must return the data type of the one-dimensional numpy.ndarray vectors that represent states, as a numpy.dtype.
Property state_length This abstract property must be implemented by any concrete subclass. It must return the number of entries in each state vector, i.e., the length of the one-dimensional numpy.ndarray vectors that represent states, as a positive ...
Method _initialize_batch This abstract method must be implemented by any concrete subclass. It must initialize a batch of episodes of the specified size and update the _state_batch and _status attributes so that they represent the newly initialized batch.
Method _transition_batch This method applies a batch of actions to the current batch of states and updates the _state_batch and _status attributes to reflect the resulting states and the updated batch status.
Instance Variable _allow_loops A bool indicating whether loops are allowed in the graphs to be constructed.
Instance Variable _current_vertices Either None or a numpy.ndarray of type numpy.int32 specifying the vertex where the agent is currently located in each episode run in parallel.
Instance Variable _edge_colors The number of proper edge colors in the graphs to be constructed, given as an int that is at least 2.
Instance Variable _episode_length A positive int specifying the episode length, i.e., the total number of actions in each episode.
Instance Variable _flattened_length A positive int equal to the flattened length of the graphs to be constructed.
Instance Variable _flattened_ordering An item of the FlattenedOrdering enumeration specifying the edge (resp. arc) ordering (row-major or clockwise).
Instance Variable _graph_order A positive int that describes the order of the graphs to be constructed.
Instance Variable _is_directed A bool indicating whether the graphs to be constructed are directed or undirected.
Instance Variable _state_batch See the description of the GraphEnvironment._state_batch attribute.
Instance Variable _state_length A positive int that determines the length of each of the state vectors, i.e., the number (_edge_colors - 1) * _flattened_length + _graph_order.
Instance Variable _status See the description of the GraphEnvironment._status attribute.
Instance Variable _step_count Either None or a nonnegative int counting how many actions have been executed in the current batch of episodes. When _step_count equals _episode_length, the episode has reached a final state. This attribute is updated after each call to ...

Inherited from GraphEnvironment:

Method reset_batch This method initializes a batch of episodes of a specified size and returns the resulting batch of states, the corresponding values of the selected graph invariant (if computed), and the status of the batch of episodes...
Method state_to_graph This method extracts the underlying graph corresponding to a single state.
Method step_batch This method applies a batch of actions to the current batch of episodes and returns the resulting batch of states, the corresponding values of the selected graph invariant (if computed), and the updated status of the batch...
Instance Variable sparse_setting A bool indicating whether the graph invariant values should be computed only for the final batch of actions.
Instance Variable __graph_batch Either None or a Graph object representing the current batch of underlying graphs. This attribute is updated only when required by the sparse setting.
Instance Variable __graph_invariant A GraphInvariant function specifying the graph invariant to be maximized.
Instance Variable __graph_invariant_batch Either None or a one-dimensional numpy.ndarray of type numpy.float32 containing the current batch of graph invariant values. As with __graph_batch, this attribute is updated only when required by the sparse setting.
Instance Variable __graph_invariant_diff Either None, indicating that graph invariant values are always computed directly using __graph_invariant, or a GraphInvariantDiff function used to incrementally update invariant values after state transitions.
def __init__(self, graph_invariant: GraphInvariant, graph_order: int, episode_length: int | None = None, flattened_ordering: FlattenedOrdering = FlattenedOrdering.ROW_MAJOR, edge_colors: int = 2, is_directed: bool = False, allow_loops: bool = False, initial_graph_generator: GraphGenerator | None = None, starting_vertex: int = 0, graph_invariant_diff: GraphInvariantDiff | None = None, sparse_setting: bool = False):

This constructor initializes an instance of the LocalSetEnvironment class.

Parameters
graph_invariant: GraphInvariant - A GraphInvariant function that computes the graph invariant values associated with a batch of underlying graphs. These values are the quantities to be maximized by the environment.
graph_order: int - An int that is at least 2, representing the order of the graphs to be constructed.
episode_length: int | None - Either None or a positive int specifying the number of actions in each episode. If None, the episode length defaults to the flattened length of the graphs to be constructed. The default value is None.
flattened_ordering: FlattenedOrdering - An item of the FlattenedOrdering enumeration specifying whether the edges (resp. arcs) are ordered row-major or clockwise. The default value is FlattenedOrdering.ROW_MAJOR.
edge_colors: int - An int that is at least 2, specifying the number of proper edge colors in the graphs to be constructed. The default value is 2.
is_directed: bool - A bool indicating whether the graphs to be constructed are directed. The default value is False.
allow_loops: bool - A bool indicating whether loops are allowed in the graphs to be constructed. The default value is False.
initial_graph_generator: GraphGenerator | None - Either None or a GraphGenerator function that determines how the initial fully colored graphs are generated for the batch of initial states. If None, all edges (resp. arcs) in all graphs are initially colored with color 0. The default value is None.
starting_vertex: int - A nonnegative int strictly less than graph_order specifying the vertex at which the agent starts the recoloring procedure. The default value is 0.
graph_invariant_diff: GraphInvariantDiff | None - Either None, indicating that graph invariant values are always computed directly using graph_invariant, or a GraphInvariantDiff function that computes element-wise differences of the graph invariant values when the environment transitions from one batch of underlying graphs to another. The default value is None.
sparse_setting: bool - A bool indicating whether the sparse setting is enabled. If True, the graph invariant values are computed only for the final batch of actions. Otherwise, they are computed after every batch of actions. The default value is False.
def episode_length(self, episode_length: int):

This setter allows the user to reconfigure the episode length between two independent batches of episodes. It should not be used while a batch of episodes is in progress.

Parameters
episode_length: int - A positive int specifying the new episode length.
def state_batch_to_graph_batch(self, state_batch: np.ndarray) -> Graph:

This abstract method must be implemented by any concrete subclass. It extracts the batch of underlying graphs corresponding to a provided batch of states. Implementations must return a Graph object containing the graphs corresponding to each row in state_batch, preserving the row order. This method must be pure and must not modify any attributes of the class instance.

Parameters
state_batch: np.ndarray - A two-dimensional numpy.ndarray whose rows represent individual states from which the underlying graphs are to be extracted.
Returns
Graph - A Graph object representing the extracted batch of graphs.
initial_graph_generator: GraphGenerator

A GraphGenerator function that defines how the underlying fully colored graphs are generated for the initial states. This attribute may be reconfigured between independent batches of episodes.

starting_vertex: int

A nonnegative int below the configured graph order that determines the vertex at which the agent should start the recoloring procedure. This attribute may be reconfigured between independent batches of episodes.

action_mask: np.ndarray | None

This abstract property must be implemented by any concrete subclass. It must return None if no episodes are currently being run in parallel, or if every action is available in every current state. Otherwise, it must return a two-dimensional numpy.ndarray matrix a of type bool whose entry a[i, j] is True if and only if action j is available in the current state of the i-th episode.
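
For this environment, the mask only matters when loops are not allowed, since the class description states that actions keeping the agent at its current vertex are then invalid. The following is a minimal sketch of such a mask, assuming the action encoding a % graph_order for the target vertex; the function no_loop_action_mask is a hypothetical illustration, not the library implementation. When loops are allowed, every action is available and the property would return None instead.

```python
import numpy as np


def no_loop_action_mask(current_vertices: np.ndarray, graph_order: int,
                        edge_colors: int) -> np.ndarray:
    """Return a bool matrix a with a[i, j] True iff action j avoids a loop."""
    action_number = edge_colors * graph_order
    # Target vertex encoded by each action value.
    targets = np.arange(action_number) % graph_order
    # Broadcasting compares every action target with every episode's vertex.
    return targets[None, :] != current_vertices[:, None]


# Two parallel episodes with the agent at vertices 0 and 2.
mask = no_loop_action_mask(np.array([0, 2], dtype=np.int32),
                           graph_order=3, edge_colors=2)
print(mask.astype(int))
# [[0 1 1 0 1 1]
#  [1 1 0 1 1 0]]
```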

action_number: int

This abstract property must be implemented by any concrete subclass. It must return the total number of distinct actions that can be executed in the environment, as a positive int.

episode_length: int

This abstract property must be implemented by any concrete subclass. It must return the predetermined common length of all episodes run in parallel, i.e., the total number of actions executed in each episode, as a positive int.

is_continuing: bool

This abstract property must be implemented by any concrete subclass. It must return a bool indicating whether the environment is continuing (True) or episodic (False).

state_dtype: np.dtype

This abstract property must be implemented by any concrete subclass. It must return the data type of the one-dimensional numpy.ndarray vectors that represent states, as a numpy.dtype.

state_length: int

This abstract property must be implemented by any concrete subclass. It must return the number of entries in each state vector, i.e., the length of the one-dimensional numpy.ndarray vectors that represent states, as a positive int.

def _initialize_batch(self, batch_size: int):

This abstract method must be implemented by any concrete subclass. It must initialize a batch of episodes of the specified size and update the _state_batch and _status attributes so that they represent the newly initialized batch.

Parameters
batch_size: int - The number of episodes to initialize in the batch, given as a positive int.
def _transition_batch(self, action_batch: np.ndarray):

This method applies a batch of actions to the current batch of states and updates the _state_batch and _status attributes to reflect the resulting states and the updated batch status.

Parameters
action_batch: np.ndarray - A one-dimensional numpy.ndarray of type numpy.int32 containing the actions to be applied. The length of action_batch must match the number of states in _state_batch.
Note
If loops are not allowed and an action attempts to traverse a loop, a RuntimeError is raised.
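
The transition can be sketched for one special case: undirected graphs without loops. This is an illustration under stated assumptions, not the library's implementation. In particular, the helper edge_position and its row-major upper-triangle indexing are assumptions about the flattened ordering, and the standalone function transition_batch returns new arrays instead of updating _state_batch and _status.

```python
import numpy as np


def edge_position(u: np.ndarray, v: np.ndarray, n: int) -> np.ndarray:
    """Assumed row-major index of edge {u, v}, u != v, in the upper triangle."""
    lo, hi = np.minimum(u, v), np.maximum(u, v)
    return lo * n - lo * (lo + 1) // 2 + (hi - lo - 1)


def transition_batch(state_batch, current_vertices, action_batch, n, edge_colors):
    """Return (next_states, next_vertices) for undirected loopless graphs."""
    flattened_length = n * (n - 1) // 2
    targets = action_batch % n   # vertex to move to
    colors = action_batch // n   # color for the traversed edge
    if np.any(targets == current_vertices):
        raise RuntimeError("An action attempts to traverse a loop.")
    rows = np.arange(state_batch.shape[0])
    positions = edge_position(current_vertices, targets, n)
    next_states = state_batch.copy()
    # Clear the traversed edge in every color block (this yields color 0).
    for color in range(1, edge_colors):
        next_states[rows, (color - 1) * flattened_length + positions] = 0
    # Set the bit in the block of the chosen color, if it is not color 0.
    keep = colors > 0
    next_states[rows[keep], (colors[keep] - 1) * flattened_length + positions[keep]] = 1
    # Move the one-hot agent marker to the target vertex.
    offset = (edge_colors - 1) * flattened_length
    next_states[rows, offset + current_vertices] = 0
    next_states[rows, offset + targets] = 1
    return next_states, targets


# Two episodes on 3 vertices with 2 colors; all edges start with color 0.
states = np.array([[0, 0, 0, 1, 0, 0],
                   [0, 0, 0, 0, 1, 0]], dtype=np.uint8)
vertices = np.array([0, 1], dtype=np.int32)
actions = np.array([5, 0], dtype=np.int32)  # (to 2, color 1) and (to 0, color 0)
next_states, next_vertices = transition_batch(states, vertices, actions, 3, 2)
print(next_states)
# [[0 1 0 0 0 1]
#  [0 0 0 1 0 0]]
print(next_vertices)  # [2 0]
```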
_allow_loops: bool

A bool indicating whether loops are allowed in the graphs to be constructed.

_current_vertices: np.ndarray | None

Either None or a numpy.ndarray of type numpy.int32 specifying the vertex where the agent is currently located in each episode run in parallel.

_edge_colors: int

The number of proper edge colors in the graphs to be constructed, given as an int that is at least 2.

_episode_length: int

A positive int specifying the episode length, i.e., the total number of actions in each episode.

_flattened_length: int

A positive int equal to the flattened length of the graphs to be constructed.

_flattened_ordering: FlattenedOrdering

An item of the FlattenedOrdering enumeration specifying the edge (resp. arc) ordering (row-major or clockwise).

_graph_order: int

A positive int that describes the order of the graphs to be constructed.

_is_directed: bool

A bool indicating whether the graphs to be constructed are directed or undirected.

_state_length: int

A positive int that determines the length of each of the state vectors, i.e., the number (_edge_colors - 1) * _flattened_length + _graph_order.
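
The formula can be worked through numerically. The sketch below assumes that the flattened length equals the number of vertex pairs that may carry an edge (resp. arc); this page does not define the flattened length, so that count is an assumption, and both function names are hypothetical.

```python
def flattened_length(graph_order: int, is_directed: bool, allow_loops: bool) -> int:
    """Assumed flattened length: number of admissible edges (resp. arcs)."""
    n = graph_order
    pairs = n * (n - 1) if is_directed else n * (n - 1) // 2
    return pairs + (n if allow_loops else 0)


def state_length(edge_colors: int, graph_order: int,
                 is_directed: bool = False, allow_loops: bool = False) -> int:
    """Documented formula: (edge_colors - 1) * flattened_length + graph_order."""
    length = flattened_length(graph_order, is_directed, allow_loops)
    return (edge_colors - 1) * length + graph_order


# Undirected, no loops, 5 vertices, 3 colors: 2 * 10 + 5.
print(state_length(edge_colors=3, graph_order=5))  # 25
# Directed with loops, 5 vertices, 3 colors: 2 * 25 + 5.
print(state_length(3, 5, is_directed=True, allow_loops=True))  # 55
```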

_step_count: int | None

Either None or a nonnegative int counting how many actions have been executed in the current batch of episodes. When _step_count equals _episode_length, every episode in the batch has reached its final state. This attribute is updated after each call to GraphEnvironment.reset_batch or GraphEnvironment.step_batch.