Maintenance: use pre-commit checks instead of Azure for formatting (#279)
* update pre-commit

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix pre-commit

* fix pre-commit 2

* fix pre-commit 3

* fix pre-commit 4

* fix pre-commit 4.5

* fix format

* trigger ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
TimotheeMathieu and pre-commit-ci[bot] authored Feb 28, 2023
1 parent 3132350 commit 98e19e9
Showing 69 changed files with 397 additions and 438 deletions.
11 changes: 6 additions & 5 deletions .pre-commit-config.yaml
@@ -10,20 +10,21 @@ repos:
 - id: check-docstring-first

 - repo: https://github.com/psf/black
-rev: 22.3.0
+rev: 23.1.0
 hooks:
 - id: black

 - repo: https://github.com/asottile/blacken-docs
-rev: v1.12.0
+rev: 1.13.0
 hooks:
 - id: blacken-docs
-additional_dependencies: [black==21.12b0]
+additional_dependencies: [black==23.1.0]

-- repo: https://gitlab.com/pycqa/flake8
-rev: 4.0.1
+- repo: https://github.com/pycqa/flake8
+rev: 6.0.0
 hooks:
 - id: flake8
 additional_dependencies: [flake8-docstrings]
 types: [file, python]
 exclude: (.*/__init__.py|rlberry/check_packages.py)
+args: ['--select=F401,F405,D410,D411,D412']
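With the formatting check moved from Azure into pre-commit, the same hooks can be run locally before pushing. A minimal sketch, assuming the pre-commit package is installed in the development environment:

    # install the git hooks defined in .pre-commit-config.yaml
    pre-commit install

    # run every configured hook (black, blacken-docs, flake8) against the whole repository
    pre-commit run --all-files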
2 changes: 1 addition & 1 deletion CITATION.cff
@@ -19,4 +19,4 @@ abbreviation: rlberry
 version: 0.2.2-dev
 doi: 10.5281/zenodo.5223307
 date-released: 2021-10-01
-url: "https://github.com/rlberry-py/rlberry"
+url: "https://github.com/rlberry-py/rlberry"
1 change: 1 addition & 0 deletions README.md
@@ -9,6 +9,7 @@
 A Reinforcement Learning Library for Research and Education
 </p>

+
 <!-- The badges -->
 <p align="center">
 <a href="https://github.com/rlberry-py/rlberry/workflows/test/badge.svg">
2 changes: 1 addition & 1 deletion assets/logo_square.svg
2 changes: 1 addition & 1 deletion assets/logo_wide.svg
25 changes: 0 additions & 25 deletions azure-pipelines.yml
@@ -10,8 +10,6 @@ pr:
 - rlberry/_version.py
 - docs

-
-
 jobs:

 - job: 'checkPrLabel'
@@ -74,29 +72,6 @@ jobs:
 ./codecov
 displayName: 'Upload to codecov.io'
-- job: 'Formatting'
-dependsOn: checkPrLabel
-condition: or(in(variables['Build.SourceBranch'], 'refs/heads/main'), eq(dependencies.checkPrLabel.outputs['checkPrLabel.prHasCILabel'], true))
-
-pool:
-vmImage: ubuntu-latest
-strategy:
-matrix:
-Python39:
-python.version: '3.9'
-
-steps:
-- script: |
-python -m pip install --upgrade pip
-pip install black flake8 flake8-docstrings
-black --check examples rlberry *py
-displayName: "black"
-- script: |
-# ensure there is no unused imports with
-flake8 --select F401,F405,D410,D411,D412 --exclude=rlberry/check_packages.py --per-file-ignores="__init__.py:F401"
-displayName: 'flake8'
 - job: 'macOS'
 dependsOn: checkPrLabel
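The deleted Azure 'Formatting' job ran black and flake8 directly, as shown above; under this commit those checks are expected to be covered by the black and flake8 pre-commit hooks instead, with the flake8 error selection moved into the hook's args. A sketch of the equivalent manual invocation, taken from the removed job:

    # the commands the removed 'Formatting' job executed
    pip install black flake8 flake8-docstrings
    black --check examples rlberry *py
    flake8 --select F401,F405,D410,D411,D412 --exclude=rlberry/check_packages.py --per-file-ignores="__init__.py:F401"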
2 changes: 1 addition & 1 deletion docs/basics/DeepRLTutorial/TutorialDeepRL.rst
@@ -2,7 +2,7 @@ Quickstart for Deep Reinforcement Learning in rlberry
=====================================================

.. highlight:: none

..
Authors: Riccardo Della Vecchia, Hector Kohler, Alena Shilova.
38 changes: 20 additions & 18 deletions docs/basics/compare_agents.rst
@@ -7,7 +7,7 @@ Compare different agents
========================


Two or more agents can be compared using the classes
Two or more agents can be compared using the classes
:class:`~rlberry.manager.agent_manager.AgentManager` and
:class:`~rlberry.manager.multiple_managers.MultipleManagers`, as in the example below.

@@ -26,44 +26,46 @@ Two or more agents can be compared using the classes
# Parameters
params = {}
params['reinforce'] = dict(
gamma=0.99,
horizon=160,
params["reinforce"] = dict(
gamma=0.99,
horizon=160,
)
params['kernel'] = dict(
gamma=0.99,
horizon=160,
params["kernel"] = dict(
gamma=0.99,
horizon=160,
)
eval_kwargs = dict(eval_horizon=200)
# Create AgentManager for REINFORCE and RSKernelUCBVI
multimanagers = MultipleManagers()
multimanagers.append(
AgentManager(
AgentManager(
REINFORCEAgent,
env,
init_kwargs=params['reinforce'],
init_kwargs=params["reinforce"],
fit_budget=100,
n_fit=4,
parallelization='thread')
parallelization="thread",
)
)
multimanagers.append(
AgentManager(
AgentManager(
RSKernelUCBVIAgent,
env,
init_kwargs=params['kernel'],
init_kwargs=params["kernel"],
fit_budget=100,
n_fit=4,
parallelization='thread')
parallelization="thread",
)
)
# Fit and plot
multimanagers.run()
plot_writer_data(
multimanagers.managers,
tag='episode_rewards',
preprocess_func=np.cumsum,
title="Cumulative Rewards")
multimanagers.managers,
tag="episode_rewards",
preprocess_func=np.cumsum,
title="Cumulative Rewards",
)
40 changes: 19 additions & 21 deletions docs/basics/create_agent.rst
@@ -9,35 +9,33 @@ Create an agent
rlberry_ requires you to use a **very simple interface** to write agents, with basically
two methods to implement: :code:`fit()` and :code:`eval()`.

The example below shows how to create an agent.
The example below shows how to create an agent.


.. code-block:: python
import numpy as np
from rlberry.agents import Agent
class MyAgent(Agent):
class MyAgent(Agent):
name = "MyAgent"
def __init__(self,
env,
param1=0.99,
param2=1e-5,
**kwargs): # it's important to put **kwargs to ensure compatibility with the base class
# self.env is initialized in the base class
# An evaluation environment is also initialized: self.eval_env
Agent.__init__(self, env, **kwargs)
def __init__(
self, env, param1=0.99, param2=1e-5, **kwargs
): # it's important to put **kwargs to ensure compatibility with the base class
# self.env is initialized in the base class
# An evaluation environment is also initialized: self.eval_env
Agent.__init__(self, env, **kwargs)
self.param1 = param1
self.param2 = param2
self.param1 = param1
self.param2 = param2
def fit(self, budget, **kwargs):
def fit(self, budget, **kwargs):
"""
The parameter budget can represent the number of steps, the number of episodes etc,
depending on the agent.
* Interact with the environment (self.env);
* Interact with the environment (self.env);
* Train the agent
* Return useful information
"""
@@ -48,26 +46,26 @@ The example below shows how to create an agent.
state = self.env.reset()
done = False
while not done:
action = ...
next_state, reward, done, _ = self.env.step(action)
rewards[ep] += reward
action = ...
next_state, reward, done, _ = self.env.step(action)
rewards[ep] += reward
info = {'episode_rewards': rewards}
info = {"episode_rewards": rewards}
return info
def eval(self, **kwargs):
"""
Returns a value corresponding to the evaluation of the agent on the
Returns a value corresponding to the evaluation of the agent on the
evaluation environment.
For instance, it can be a Monte-Carlo evaluation of the policy learned in fit().
"""
return 0.0
.. note:: It's important that your agent accepts optional `**kwargs` and pass it to the base class as :code:`Agent.__init__(self, env, **kwargs)`.
.. note:: It's important that your agent accepts optional `**kwargs` and pass it to the base class as :code:`Agent.__init__(self, env, **kwargs)`.


.. seealso::
Documentation of the classes :class:`~rlberry.agents.agent.Agent`
Documentation of the classes :class:`~rlberry.agents.agent.Agent`
and :class:`~rlberry.agents.agent.AgentWithSimplePolicy`.
31 changes: 16 additions & 15 deletions docs/basics/evaluate_agent.rst
@@ -20,7 +20,7 @@ as shown in the examples below.
# Environment (constructor, kwargs)
env = (gym_make, dict(id='CartPole-v1'))
env = (gym_make, dict(id="CartPole-v1"))
# Initial set of parameters
params = dict(
@@ -41,14 +41,15 @@ as shown in the examples below.
eval_kwargs=eval_kwargs,
fit_budget=fit_budget,
n_fit=4,
parallelization='thread')
parallelization="thread",
)
# Fit the 4 instances
stats.fit()
# The fit() method of REINFORCEAgent logs data to a :class:`~rlberry.utils.writers.DefaultWriter`
# object. The method below can be used to plot those data!
plot_writer_data(stats, tag='episode_rewards')
plot_writer_data(stats, tag="episode_rewards")
@@ -72,17 +73,17 @@ For :class:`~rlberry.agents.reinforce.reinforce.REINFORCEAgent`, this method loo
----------
trial: optuna.trial
"""
batch_size = trial.suggest_categorical('batch_size', [1, 4, 8, 16, 32])
gamma = trial.suggest_categorical('gamma', [0.9, 0.95, 0.99])
learning_rate = trial.suggest_loguniform('learning_rate', 1e-5, 1)
entr_coef = trial.suggest_loguniform('entr_coef', 1e-8, 0.1)
batch_size = trial.suggest_categorical("batch_size", [1, 4, 8, 16, 32])
gamma = trial.suggest_categorical("gamma", [0.9, 0.95, 0.99])
learning_rate = trial.suggest_loguniform("learning_rate", 1e-5, 1)
entr_coef = trial.suggest_loguniform("entr_coef", 1e-8, 0.1)
return {
'batch_size': batch_size,
'gamma': gamma,
'learning_rate': learning_rate,
'entr_coef': entr_coef,
}
"batch_size": batch_size,
"gamma": gamma,
"learning_rate": learning_rate,
"entr_coef": entr_coef,
}
Now we can use the :meth:`optimize_hyperparams` method
@@ -93,13 +94,13 @@ of :class:`~rlberry.manager.agent_manager.AgentManager` to find good parameters
# Run optimization and print results
stats.optimize_hyperparams(
n_trials=100,
timeout=10, # stop after 10 seconds
timeout=10, # stop after 10 seconds
n_fit=2,
sampler_method='optuna_default'
sampler_method="optuna_default",
)
print(stats.best_hyperparams)
# Calling fit() again will train the agent with the optimized parameters
stats.fit()
plot_writer_data(stats, tag='episode_rewards')
plot_writer_data(stats, tag="episode_rewards")
4 changes: 2 additions & 2 deletions docs/basics/quick_start_rl/quickstart.rst
@@ -1,7 +1,7 @@
.. _quick_start:

.. highlight:: none

Quick Start for Reinforcement Learning in rlberry
=================================================

@@ -60,7 +60,7 @@ Let us see a graphical representation
.. parsed-literal::
ffmpeg version n5.0 Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 11.2.0 (GCC)
configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-amf --enable-avisynth --enable-cuda-llvm --enable-lto --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libdrm --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libiec61883 --enable-libjack --enable-libmfx --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librav1e --enable-librsvg --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-libzimg --enable-nvdec --enable-nvenc --enable-shared --enable-version3
(Diffs for the remaining changed files are not shown.)