Creating and Launching Experiments

Some utilities are included for creating and launching experiments composed of multiple individual learning runs, e.g. for hyperparameter sweeps. To date, these include functions for launching locally on a machine; launching into the cloud may require different tooling. Many experiments can be queued on a given hardware resource, and they will be cycled through to run in sequence (e.g. on a desktop with 4 GPUs, with each run getting exclusive use of 2 GPUs).

Launching

rlpyt.utils.launching.exp_launcher.run_experiments(script, affinity_code, experiment_title, runs_per_setting, variants, log_dirs, common_args=None, runs_args=None, set_egl_device=False)

Call in a script to run a set of experiments locally on a machine. Uses the launch_experiment() function for each individual run, which is a call to the script file. The number of experiments to run at the same time is determined from the affinity_code, which expresses the hardware resources of the machine and how much of those resources each run gets (e.g. a 4-GPU machine with 2 GPUs per run). Experiments are queued and run in sequence, so that concurrent runs do not overlap on hardware. Inputs variants and log_dirs should be lists of the same length, containing each experiment configuration and the directory in which to save its log files (the log files share the same names, so each run needs its own folder).

Hint

To monitor progress, view the num_launched.txt file and experiments_tree.txt file in the experiment root directory, and also check the length of each progress.csv file, e.g. wc -l experiment-directory/.../run_*/progress.csv.
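
For illustration, a minimal launch-script sketch, assuming a hypothetical train script path and a config key ("algo", "learning_rate") that exists in that script's default config; variants and log_dirs come from make_variants() (see Variants below):

    from rlpyt.utils.launching.affinity import encode_affinity
    from rlpyt.utils.launching.exp_launcher import run_experiments
    from rlpyt.utils.launching.variant import VariantLevel, make_variants

    # Assumed resources: 8 CPU cores and 2 GPUs total, 1 GPU per run,
    # so two experiments run concurrently and the rest queue up.
    affinity_code = encode_affinity(n_cpu_core=8, n_gpu=2, gpu_per_run=1)

    # One variant level sweeping a (hypothetical) learning-rate key.
    variant_levels = [VariantLevel(
        keys=[("algo", "learning_rate")],
        values=[[1e-3], [1e-4]],
        dir_names=["lr_1e-3", "lr_1e-4"],
    )]
    variants, log_dirs = make_variants(*variant_levels)

    run_experiments(
        script="example_train_script.py",  # hypothetical path to the train script
        affinity_code=affinity_code,
        experiment_title="lr_sweep",
        runs_per_setting=2,  # e.g. two random seeds per variant
        variants=variants,
        log_dirs=log_dirs,
    )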

rlpyt.utils.launching.exp_launcher.launch_experiment(script, run_slot, affinity_code, log_dir, variant, run_ID, args, python_executable=None, set_egl_device=False)

Launches one learning run using subprocess.Popen() to call the python script. Calls the script as: python {script} {slot_affinity_code} {log_dir} {run_ID} {*args}

If affinity_code["all_cpus"] is provided, then the call is prepended with taskset -c and the listed CPUs (this is the most reliable way to keep the run limited to these CPU cores). Also saves the variant file. Returns the process handle, which can be monitored.

Use set_egl_device=True to set an environment variable EGL_DEVICE_ID equal to the CUDA index used by the algorithm. For example, this can be used with a DMControl environment modified to look for this environment variable when selecting a GPU for headless rendering.

Variants

Some simple tools are provided for creating hyperparameter value variants.

class rlpyt.utils.launching.variant.VariantLevel(keys, values, dir_names)

A namedtuple which describes a set of hyperparameter settings.

Input keys should be a list of tuples, where each tuple is the sequence of keys to navigate down the configuration dictionary to the value.

Input values should be a list of lists, where each element of the outer list is a complete set of values, and position in the inner list corresponds to the key at that position in the keys list, i.e. each combination must be explicitly written.

Input dir_names should have the same length as values, and include unique paths for logging results from each value combination.

rlpyt.utils.launching.variant.make_variants(*variant_levels)

Takes in any number of VariantLevel objects and crosses them in order. Returns the resulting lists of full variants and log directories. Every set of values in one level is paired with every set of values in the next level, e.g. if two combinations are specified in one level and three combinations in the next, then six total configurations will result.

Use in the script to create and run a set of learning runs.
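
As a sketch, crossing two levels (assuming the keys ("algo", "learning_rate") and ("algo", "batch_size") exist in the target script's default config), two learning rates crossed with three batch sizes give six variants:

    from rlpyt.utils.launching.variant import VariantLevel, make_variants

    # Level 1: two learning-rate settings; each key is a path into the config dict.
    lr_level = VariantLevel(
        keys=[("algo", "learning_rate")],
        values=[[1e-3], [1e-4]],
        dir_names=["lr_1e-3", "lr_1e-4"],
    )
    # Level 2: three batch-size settings.
    bs_level = VariantLevel(
        keys=[("algo", "batch_size")],
        values=[[32], [64], [128]],
        dir_names=["bs_32", "bs_64", "bs_128"],
    )

    variants, log_dirs = make_variants(lr_level, bs_level)
    print(len(variants))  # 6 variant dicts, one per combination
    # Each log dir combines one dir_name from each level, e.g. "lr_1e-3/bs_32".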

rlpyt.utils.launching.variant._cross_variants(prev_variants, prev_log_dirs, variant_level)

For every previous variant, make all combinations with new values.

rlpyt.utils.launching.variant.load_variant(log_dir)

Loads the variant.json file from the directory.

rlpyt.utils.launching.variant.save_variant(variant, log_dir)

Saves a variant.json file to the directory.

rlpyt.utils.launching.variant.update_config(default, variant)

Performs deep update on all dict structures from variant, updating only individual fields. Any field in variant must be present in default, else raises KeyError (helps prevent mistakes). Operates recursively to return a new dictionary.
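
A small sketch of the deep-update semantics, using a made-up default config:

    from rlpyt.utils.launching.variant import update_config

    default = dict(algo=dict(learning_rate=1e-3, batch_size=32), seed=0)
    variant = dict(algo=dict(learning_rate=1e-4))  # override a single nested field

    config = update_config(default, variant)  # returns a new dict
    assert config["algo"]["learning_rate"] == 1e-4
    assert config["algo"]["batch_size"] == 32  # untouched fields keep their defaults

    # A key absent from default raises KeyError, catching typos in variants:
    # update_config(default, dict(algo=dict(learning_rte=1e-4)))  # KeyError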

Affinity

The hardware affinity is used for several purposes: 1) the experiment launcher uses it to determine how many concurrent experiments to run, 2) runners use it to determine GPU device selection, 3) parallel samplers use it to determine the number of worker processes, and 4) multi-GPU and asynchronous runners use it to determine the number of parallel processes. The main intent of the implemented utilities is to take as input the total amount of hardware resources in the computer (CPU & GPU) and the amount of resources to be dedicated to each job, and then to divide resources evenly.

Example

An 8-GPU, 40-CPU machine would have 5 CPUs assigned to each GPU. With 1 GPU per run, 8 concurrent experiments would be set up, each sampler using 5 CPUs; with 2 GPUs per run and the synchronous runner, 4 concurrent experiments would be set up.

rlpyt.utils.launching.affinity.encode_affinity(n_cpu_core=1, n_gpu=0, contexts_per_gpu=1, gpu_per_run=1, cpu_per_run=1, cpu_per_worker=1, cpu_reserved=0, hyperthread_offset=None, n_socket=None, run_slot=None, async_sample=False, sample_gpu_per_run=0, optim_sample_share_gpu=False, alternating=False, set_affinity=True)

Encodes the hardware configuration into a string (with meanings defined in this file) which can be passed as a command line argument when calling the training script. Use in the overall experiment setup script to specify the computer's and each run's resources to run_experiments().

Here, an “experiment” refers to an individual learning run, i.e. one set of hyperparameters, which does not interact with other runs.

Parameters:
  • n_cpu_core (int) – Total number of physical cores to use on the machine (not virtual)
  • n_gpu (int) – Total number of GPUs to use on the machine
  • contexts_per_gpu (int) – How many experiments share each GPU
  • gpu_per_run (int) – How many GPUs to use per experiment (for multi-GPU optimization)
  • cpu_per_run (int) – If not using GPU, how many cores per experiment
  • cpu_per_worker (int) – CPU cores per sampler worker; 1 unless environment is multi-threaded
  • cpu_reserved (int) – Number of CPUs to reserve per GPU, and not allow the sampler to use them
  • hyperthread_offset (int) – Typically the number of physical cores, since physical cores are labeled 0 through (offset - 1) and hyperthreads (offset) through (2 * offset - 1); use 0 to disable hyperthreads, None to auto-detect
  • n_socket (int) – Number of CPU sockets in the machine; tries to keep CPUs grouped on the same socket, and to match socket-to-GPU affinity
  • run_slot (int) – Which hardware slot to use; leave as None for run_experiments(), but specify it for an individual train script
  • async_sample (bool) – True if asynchronous sampling/optimization mode; a different affinity structure is needed
  • sample_gpu_per_run (int) – In asynchronous mode only, number of action-server GPUs per experiment
  • optim_sample_share_gpu (bool) – In asynchronous mode only, whether to use the same GPU(s) for both training and sampling
  • alternating (bool) – True if using the alternating sampler (will make more worker assignments)
  • set_affinity (bool) – False to disable the runner and sampler from setting CPU affinity via psutil, which may be inappropriate on cloud machines.
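
A sketch encoding the 8-GPU, 40-CPU example from above; the string contents are internal to rlpyt, but this is the value to pass as affinity_code to run_experiments():

    from rlpyt.utils.launching.affinity import encode_affinity

    # 40 physical cores, 8 GPUs, 1 GPU per run -> 8 concurrent runs, 5 CPUs each.
    affinity_code = encode_affinity(
        n_cpu_core=40,
        n_gpu=8,
        gpu_per_run=1,
        n_socket=2,  # assumed two-socket machine; keeps each run's CPUs on one socket
    )
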
rlpyt.utils.launching.affinity.make_affinity(run_slot=0, **kwargs)

Input same kwargs as encode_affinity(), returns the AttrDict form.
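
When running a single training script by hand (without the launcher), a sketch using make_affinity to build the structured affinity directly:

    from rlpyt.utils.launching.affinity import make_affinity

    # Same kwargs as encode_affinity(); returns the AttrDict form that the
    # runner and sampler consume.
    affinity = make_affinity(run_slot=0, n_cpu_core=8, n_gpu=1)
    # Pass `affinity` to the Runner when constructing it directly.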

rlpyt.utils.launching.affinity.affinity_from_code(run_slot_affinity_code)

Use in the individual experiment script; pass the output to the Runner.
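
A sketch of the train-script side, matching the call format used by launch_experiment() (python {script} {slot_affinity_code} {log_dir} {run_ID} {*args}); the body of main() is a placeholder outline:

    import sys

    from rlpyt.utils.launching.affinity import affinity_from_code
    from rlpyt.utils.launching.variant import load_variant

    def main(slot_affinity_code, log_dir, run_ID):
        affinity = affinity_from_code(slot_affinity_code)
        variant = load_variant(log_dir)  # the variant.json saved by the launcher
        # ... build the sampler, algorithm, and agent from `variant`, then the
        # Runner, passing `affinity` to it, and call its train() method.

    if __name__ == "__main__":
        main(*sys.argv[1:4])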