Model Components

This page documents the implemented neural network components. These are intended as building blocks for the agent's model, but not for use as standalone models. (The term "model" is somewhat overloaded here; these components are distinct from the complete agent models described below.)

Complete models, which actually function as the agent's model, have additional functionality in their forward() methods for handling the leading dimensions of inputs/outputs. See infer_leading_dims() and restore_leading_dims() under utilities, and see the documentation for each algorithm for its associated complete models.
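
For orientation, a minimal sketch of the usual forward() pattern in a complete model (the class name here is hypothetical, and the import path for the utilities is assumed to be rlpyt.utils.tensor):

    import torch
    from rlpyt.models.mlp import MlpModel
    from rlpyt.utils.tensor import infer_leading_dims, restore_leading_dims

    class ExampleAgentModel(torch.nn.Module):
        """Hypothetical complete model wrapping an MlpModel component."""

        def __init__(self, input_size, hidden_sizes, output_size):
            super().__init__()
            self.mlp = MlpModel(input_size, hidden_sizes, output_size)

        def forward(self, observation, prev_action, prev_reward):
            # Detect whether observation has [T,B], [B], or no leading dims
            # (dim=1 because the trailing data shape is a flat vector).
            lead_dim, T, B, _shape = infer_leading_dims(observation, 1)
            out = self.mlp(observation.view(T * B, -1))
            # Put the leading dims back onto the output.
            return restore_leading_dims(out, lead_dim, T, B)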

class rlpyt.models.mlp.MlpModel(input_size, hidden_sizes, output_size=None, nonlinearity=torch.nn.ReLU)

Bases: torch.nn.Module

Multilayer Perceptron with last layer linear.

Parameters:
  • input_size (int) – number of inputs
  • hidden_sizes (list) – can be an empty list for none (i.e. a purely linear model).
  • output_size (int) – size of the final linear layer, or if None, the last hidden size is the output size, with nonlinearity applied.
  • nonlinearity – torch nonlinearity Module (not Functional).
forward(input)

Compute the model on the input, assuming input shape [B,input_size].

output_size

Returns the output size of the model.
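
A minimal usage sketch (sizes are arbitrary):

    import torch
    from rlpyt.models.mlp import MlpModel

    # Two hidden layers of 64 units, with a linear output layer of size 4.
    mlp = MlpModel(input_size=8, hidden_sizes=[64, 64], output_size=4)
    x = torch.randn(32, 8)  # [B, input_size]
    y = mlp(x)              # [B, output_size]
    assert y.shape == (32, 4)
    assert mlp.output_size == 4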

class rlpyt.models.conv2d.Conv2dModel(in_channels, channels, kernel_sizes, strides, paddings=None, nonlinearity=torch.nn.ReLU, use_maxpool=False, head_sizes=None)

Bases: torch.nn.Module

2-D convolutional model component, with the option of max-pooling versus downsampling for strides > 1. Requires the number of input channels, but not the input shape. Uses torch.nn.Conv2d.

forward(input)

Computes the convolution stack on the input; assumes correct shape already: [B,C,H,W].

conv_out_size(h, w, c=None)

Helper function to return the output size for a given input shape, without actually performing a forward pass through the model.
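
A usage sketch (an Atari-style stack; channel/kernel/stride values are illustrative):

    import torch
    from rlpyt.models.conv2d import Conv2dModel

    conv = Conv2dModel(
        in_channels=4,
        channels=[32, 64],
        kernel_sizes=[8, 4],
        strides=[4, 2],
    )
    x = torch.randn(16, 4, 84, 84)  # [B, C, H, W]
    y = conv(x)
    # conv_out_size gives the flattened per-item output size, e.g. for
    # sizing a downstream linear layer, without a forward pass.
    assert y.view(16, -1).shape[1] == conv.conv_out_size(84, 84)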

class rlpyt.models.conv2d.Conv2dHeadModel(image_shape, channels, kernel_sizes, strides, hidden_sizes, output_size=None, paddings=None, nonlinearity=torch.nn.ReLU, use_maxpool=False)

Bases: torch.nn.Module

Model component composed of a Conv2dModel component followed by a fully-connected MlpModel head. Requires full input image shape to instantiate the MLP head.

forward(input)

Compute the convolution and fully connected head on the input; assumes correct input shape: [B,C,H,W].

output_size

Returns the final output size after MLP head.
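
A usage sketch (shapes and sizes are illustrative):

    import torch
    from rlpyt.models.conv2d import Conv2dHeadModel

    model = Conv2dHeadModel(
        image_shape=(4, 84, 84),  # full input shape, needed to size the head
        channels=[32, 64],
        kernel_sizes=[8, 4],
        strides=[4, 2],
        hidden_sizes=[512],
        output_size=6,            # e.g. number of discrete actions
    )
    x = torch.randn(16, 4, 84, 84)  # [B, C, H, W]
    y = model(x)                    # [B, output_size]
    assert y.shape == (16, 6)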

Utilities

rlpyt.models.utils.conv2d_output_shape(h, w, kernel_size=1, stride=1, padding=0, dilation=1)

Returns output H, W after convolution/pooling on input H, W.
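
This follows the standard PyTorch Conv2d/MaxPool2d shape arithmetic; a sketch of the equivalent computation (square kernels assumed here for brevity):

    import math

    def conv2d_output_shape_sketch(h, w, kernel_size=1, stride=1,
                                   padding=0, dilation=1):
        h_out = math.floor((h + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1)
        w_out = math.floor((w + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1)
        return h_out, w_out

    print(conv2d_output_shape_sketch(84, 84, kernel_size=8, stride=4))  # (20, 20)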

class rlpyt.models.utils.ScaleGrad(*args, **kwargs)

Model component to scale gradients flowing back from a layer, without affecting the forward pass. Used, e.g., in dueling-head DQN models.

static forward(ctx, tensor, scale)

Stores the scale input to ctx for application in backward(); simply returns the input tensor.

static backward(ctx, grad_output)

Returns the grad_output multiplied by ctx.scale. Also returns a None as a placeholder corresponding to the (non-existent) gradient of the scale input to forward().
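
A minimal sketch of applying it (the scale value here is arbitrary):

    import torch
    from rlpyt.models.utils import ScaleGrad

    x = torch.ones(3, requires_grad=True)
    # The forward pass is unchanged; gradients flowing back through y
    # are multiplied by the scale.
    y = ScaleGrad.apply(x, 0.5)
    y.sum().backward()
    print(x.grad)  # tensor([0.5000, 0.5000, 0.5000])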

rlpyt.models.utils.update_state_dict(model, state_dict, tau=1, strip_ddp=True)

Update the state dict of model using the input state_dict, which must match in format. tau=1 applies a hard update (copies the values); 0 < tau < 1 applies a soft update: tau * new + (1 - tau) * old.
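
For example, hard and soft target-network updates (the model construction is illustrative):

    import copy
    from rlpyt.models.mlp import MlpModel
    from rlpyt.models.utils import update_state_dict

    model = MlpModel(input_size=8, hidden_sizes=[64], output_size=4)
    target = copy.deepcopy(model)

    # Hard update: copy parameter values exactly.
    update_state_dict(target, model.state_dict(), tau=1)

    # Soft (Polyak) update: target <- 0.005 * new + 0.995 * old.
    update_state_dict(target, model.state_dict(), tau=0.005)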

rlpyt.models.utils.strip_ddp_state_dict(state_dict)

Works around the fact that DistributedDataParallel prepends ‘module.’ to every key, while the sampler models are not wrapped in DistributedDataParallel. (Solution from PyTorch forums.)
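
A toy example of the expected behavior (keys and values are illustrative):

    from rlpyt.models.utils import strip_ddp_state_dict

    # A DDP-wrapped model saves keys like "module.fc1.weight"; a plain,
    # unwrapped sampler model expects "fc1.weight".
    ddp_state = {"module.fc1.weight": 0, "module.fc1.bias": 1}
    clean = strip_ddp_state_dict(ddp_state)
    print(list(clean))  # ['fc1.weight', 'fc1.bias']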