Releases: biological-alignment-benchmarks/biological-alignment-gridagents-benchmarks

v0.9.4.1

16 Feb 16:20
1e7ac32

Adding version specifications for the packages that gridworlds imports, since gridworlds itself does not specify package versions, in order to avoid version conflicts in dependent code.

v0.9.4.0

14 Feb 18:03
620dd21

  • Save "pip list" to log folder.
  • When archiving ai_safety_gridworlds and zoo_to_gym_multiagent_adapter code, check different locations since the location depends on setup.
  • Add more file types to the archive.
  • Save the benchmarking code version, gridworlds code version, and GPU model in the output jsonl.
  • Adjustments to package versions.
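
The environment-capture items above could be sketched roughly as follows. This is an illustration only, not the repository's code: the function names, the file name `pip_list.txt`, and the record keys are hypothetical, and `importlib.metadata` is used here as a stand-in for invoking `pip list`.

```python
import importlib.metadata
import json
from pathlib import Path


def save_package_list(log_dir):
    """Write 'name==version' lines for installed packages into log_dir.

    The output file name is an assumption made for this sketch.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    lines = sorted(
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in importlib.metadata.distributions()
        if dist.metadata["Name"]  # skip entries with broken metadata
    )
    out_path = log_dir / "pip_list.txt"
    out_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out_path


def append_run_record(jsonl_path, record, benchmark_version, gridworlds_version, gpu_model):
    """Append one JSON line with version and hardware fields added.

    The three key names below are assumptions, not the benchmark's actual schema.
    """
    enriched = dict(record)
    enriched.update(
        benchmark_code_version=benchmark_version,
        gridworlds_code_version=gridworlds_version,
        gpu_model=gpu_model,
    )
    with open(jsonl_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(enriched) + "\n")
    return enriched
```

Storing these fields in every output row means a result can later be traced back to the code and hardware that produced it without consulting a separate log.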

v0.9.3.8

17 Jan 08:46
032b7a7

Various smaller adjustments and improvements
Adding timestamps to console output
Adding Zenodo association
Adding support for imitation learning / expert override with PPO multi-agent weight sharing setup.
Adding configuration files for imitation learning. Also adding configs for 2-layout trials.
Adding support for use_expln PPO and A2C policy argument to mitigate NaNs in SB3 tensors.
Adding support for SB3 AdamW optimizer, needed to mitigate occasional PPO NaNs during imitation learning.
Adding support for target_kl PPO config argument to mitigate NaNs in SB3 tensors.
Adding support for early_detect_nans config argument to detect NaNs in SB3 tensors early.
Adding support for soft_stop_training_on_nan_errors to handle NaNs in SB3 tensors smoothly.
Adding retry_training_on_nans_for_n_times and skip_test_on_training_stop_on_nan_errors config settings.
Adding new fields to the jsonl log: num_actual_train_episodes, training_run_was_terminated_early_due_to_nans, num_training_retries_used, test_checkpoint_filenames.
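
How the NaN-handling settings and the new jsonl fields listed above might interact can be sketched as below. The release notes do not spell out the exact semantics, so everything here is an assumption: `train_fn` is a hypothetical callable, a NaN return value stands in for "NaNs detected in SB3 tensors", and the `run_test` key is an illustrative helper rather than a real log field.

```python
import math


def train_with_nan_retries(train_fn, config):
    """Run train_fn, retrying while it reports NaNs and retries remain.

    train_fn(attempt) is a hypothetical callable returning
    (num_episodes, final_loss); a NaN final_loss marks a failed run.
    """
    max_retries = config.get("retry_training_on_nans_for_n_times", 0)
    soft_stop = config.get("soft_stop_training_on_nan_errors", False)
    skip_test = config.get("skip_test_on_training_stop_on_nan_errors", False)

    retries_used = 0
    while True:
        num_episodes, final_loss = train_fn(retries_used)
        hit_nan = math.isnan(final_loss)
        if not hit_nan or retries_used >= max_retries:
            break
        retries_used += 1

    # Without soft stop, a run that still ends in NaNs is treated as an error.
    if hit_nan and not soft_stop:
        raise FloatingPointError("NaNs persisted after all retries")

    return {
        "num_actual_train_episodes": num_episodes,
        "training_run_was_terminated_early_due_to_nans": hit_nan,
        "num_training_retries_used": retries_used,
        "run_test": not (hit_nan and skip_test),  # illustrative, not a real log field
    }
```

For example, with `retry_training_on_nans_for_n_times` set to 3 and a training function that fails twice before succeeding, the returned record reports two retries used and no early termination.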

v0.9.3.6

09 Jan 16:44

Various smaller adjustments and improvements
Adding Zenodo association
Adding support for imitation learning / expert override with PPO multi-agent weight sharing setup.
Adding configuration files for imitation learning. Also adding configs for 2-layout trials.
Adding support for use_expln PPO and A2C policy argument to mitigate NaNs in SB3 tensors.
Adding support for SB3 AdamW optimizer, needed to mitigate occasional PPO NaNs during imitation learning.
Adding support for target_kl PPO config argument to mitigate NaNs in SB3 tensors.
Adding support for early_detect_nans config argument to detect NaNs in SB3 tensors early.
Adding support for soft_stop_training_on_nan_errors to handle NaNs in SB3 tensors smoothly.
Adding retry_training_on_nans_for_n_times and skip_test_on_training_stop_on_nan_errors config settings.
Adding new fields to the jsonl log: num_actual_train_episodes, training_run_was_terminated_early_due_to_nans, num_training_retries_used, test_checkpoint_filenames.

v0.9.3.5

01 Jan 22:59

Adding Zenodo association
Adding support for imitation learning / expert override with PPO multi-agent weight sharing setup.
Adding configuration files for imitation learning. Also adding configs for 2-layout trials.
Adding support for use_expln PPO and A2C policy argument to mitigate NaNs in SB3 tensors.
Adding support for SB3 AdamW optimizer, needed to mitigate occasional PPO NaNs during imitation learning.
Adding support for target_kl PPO config argument to mitigate NaNs in SB3 tensors.
Adding support for early_detect_nans config argument to detect NaNs in SB3 tensors early.
Adding support for soft_stop_training_on_nan_errors to handle NaNs in SB3 tensors smoothly.
Adding retry_training_on_nans_for_n_times and skip_test_on_training_stop_on_nan_errors config settings.
Adding new fields to the jsonl log: num_actual_train_episodes, training_run_was_terminated_early_due_to_nans, num_training_retries_used, test_checkpoint_filenames.

v0.9.3.3

17 Dec 00:06

Adding support for imitation learning / expert override with PPO multi-agent weight sharing setup.
Adding support for use_expln PPO and A2C policy argument to mitigate NaNs in SB3 tensors.
Adding support for SB3 AdamW optimizer, needed to mitigate occasional PPO NaNs during imitation learning.
Adding support for target_kl PPO config argument to mitigate NaNs in SB3 tensors.
Adding support for early_detect_nans config argument to detect NaNs in SB3 tensors early.
Adding support for soft_stop_training_on_nan_errors to handle NaNs in SB3 tensors smoothly.
Adding configuration files for imitation learning. Also adding configs for 2-layout trials.
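
As a rough illustration of how the use_expln, AdamW, and target_kl options above could reach Stable-Baselines3, the sketch below maps config keys onto PPO constructor arguments. The helper name `build_ppo_kwargs` and the shape of the config dictionary are assumptions; only `policy_kwargs`, `use_expln`, `optimizer_class`, and `target_kl` are SB3 parameter names. The optimizer class is passed in so the helper itself runs without torch installed.

```python
def build_ppo_kwargs(config, optimizer_class):
    """Translate hypothetical config keys into keyword arguments for SB3's PPO."""
    policy_kwargs = {
        "use_expln": config.get("use_expln", False),
        "optimizer_class": optimizer_class,  # e.g. torch.optim.AdamW
    }
    return {
        "policy_kwargs": policy_kwargs,
        "target_kl": config.get("target_kl"),  # None leaves the KL limit disabled
    }


# Usage, assuming torch, gymnasium and stable_baselines3 are installed
# (the 0.03 value is purely illustrative):
#   import torch
#   from stable_baselines3 import PPO
#   kwargs = build_ppo_kwargs({"use_expln": True, "target_kl": 0.03}, torch.optim.AdamW)
#   model = PPO("MlpPolicy", "CartPole-v1", **kwargs)
```

In SB3, `target_kl` stops the policy updates for the current rollout early once the approximate KL divergence grows too large, which limits how far a single update can move the policy.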

v0.9.3.2

13 Dec 04:57

  • Adding support for imitation learning / expert override with PPO multi-agent weight sharing setup.
  • Adding support for use_expln PPO and A2C policy argument to mitigate NaNs in SB3 tensors.
  • Adding support for SB3 AdamW optimizer, needed to mitigate occasional PPO NaNs during imitation learning.
  • Adding configuration files for imitation learning. Also adding configs for 2-layout trials.

v0.9.2

10 Dec 00:53

Updating publication title and authorship