Maintenance: use pre-commit checks instead of Azure for formatting (#279)
* update pre-commit

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix pre-commit

* fix pre-commit 2

* fix pre-commit 3

* fix pre-commit 4

* fix pre-commit 4.5

* fix format

* trigger ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
TimotheeMathieu and pre-commit-ci[bot] authored Feb 28, 2023
1 parent 3132350 commit 98e19e9
Showing 69 changed files with 397 additions and 438 deletions.
11 changes: 6 additions & 5 deletions .pre-commit-config.yaml
@@ -10,20 +10,21 @@ repos:
 - id: check-docstring-first

 - repo: https://github.com/psf/black
-rev: 22.3.0
+rev: 23.1.0
 hooks:
 - id: black

 - repo: https://github.com/asottile/blacken-docs
-rev: v1.12.0
+rev: 1.13.0
 hooks:
 - id: blacken-docs
-additional_dependencies: [black==21.12b0]
+additional_dependencies: [black==23.1.0]

-- repo: https://gitlab.com/pycqa/flake8
-rev: 4.0.1
+- repo: https://github.com/pycqa/flake8
+rev: 6.0.0
 hooks:
 - id: flake8
 additional_dependencies: [flake8-docstrings]
 types: [file, python]
 exclude: (.*/__init__.py|rlberry/check_packages.py)
+args: ['--select=F401,F405,D410,D411,D412']
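With the formatting check moved from Azure into pre-commit, the same hooks can be run locally before pushing. A minimal sketch, assuming the pre-commit package is installed in the development environment:

    # install the git hooks defined in .pre-commit-config.yaml
    pre-commit install

    # run every configured hook (black, blacken-docs, flake8) against the whole repository
    pre-commit run --all-files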
2 changes: 1 addition & 1 deletion CITATION.cff
@@ -19,4 +19,4 @@ abbreviation: rlberry
 version: 0.2.2-dev
 doi: 10.5281/zenodo.5223307
 date-released: 2021-10-01
-url: "https://github.com/rlberry-py/rlberry"
+url: "https://github.com/rlberry-py/rlberry"
1 change: 1 addition & 0 deletions README.md
@@ -9,6 +9,7 @@
 A Reinforcement Learning Library for Research and Education
 </p>

+
 <!-- The badges -->
 <p align="center">
 <a href="https://github.com/rlberry-py/rlberry/workflows/test/badge.svg">
2 changes: 1 addition & 1 deletion assets/logo_square.svg
2 changes: 1 addition & 1 deletion assets/logo_wide.svg
25 changes: 0 additions & 25 deletions azure-pipelines.yml
@@ -10,8 +10,6 @@ pr:
 - rlberry/_version.py
 - docs

-
-
 jobs:

 - job: 'checkPrLabel'
@@ -74,29 +72,6 @@ jobs:
 ./codecov
 displayName: 'Upload to codecov.io'
-- job: 'Formatting'
-dependsOn: checkPrLabel
-condition: or(in(variables['Build.SourceBranch'], 'refs/heads/main'), eq(dependencies.checkPrLabel.outputs['checkPrLabel.prHasCILabel'], true))
-
-pool:
-vmImage: ubuntu-latest
-strategy:
-matrix:
-Python39:
-python.version: '3.9'
-
-steps:
-- script: |
-python -m pip install --upgrade pip
-pip install black flake8 flake8-docstrings
-black --check examples rlberry *py
-displayName: "black"
-- script: |
-# ensure there is no unused imports with
-flake8 --select F401,F405,D410,D411,D412 --exclude=rlberry/check_packages.py --per-file-ignores="__init__.py:F401"
-displayName: 'flake8'
 - job: 'macOS'
 dependsOn: checkPrLabel
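The deleted Azure 'Formatting' job ran black and flake8 directly, as shown above; under this commit those checks are expected to be covered by the black and flake8 pre-commit hooks instead, with the flake8 error selection moved into the hook's args. A sketch of the equivalent manual invocation, taken from the removed job:

    # the commands the removed 'Formatting' job executed
    pip install black flake8 flake8-docstrings
    black --check examples rlberry *py
    flake8 --select F401,F405,D410,D411,D412 --exclude=rlberry/check_packages.py --per-file-ignores="__init__.py:F401"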
2 changes: 1 addition & 1 deletion docs/basics/DeepRLTutorial/TutorialDeepRL.rst
@@ -2,7 +2,7 @@ Quickstart for Deep Reinforcement Learning in rlberry
=====================================================

.. highlight:: none

..
Authors: Riccardo Della Vecchia, Hector Kohler, Alena Shilova.
38 changes: 20 additions & 18 deletions docs/basics/compare_agents.rst
@@ -7,7 +7,7 @@ Compare different agents
========================


Two or more agents can be compared using the classes
Two or more agents can be compared using the classes
:class:`~rlberry.manager.agent_manager.AgentManager` and
:class:`~rlberry.manager.multiple_managers.MultipleManagers`, as in the example below.

@@ -26,44 +26,46 @@ Two or more agents can be compared using the classes
# Parameters
params = {}
params['reinforce'] = dict(
gamma=0.99,
horizon=160,
params["reinforce"] = dict(
gamma=0.99,
horizon=160,
)
params['kernel'] = dict(
gamma=0.99,
horizon=160,
params["kernel"] = dict(
gamma=0.99,
horizon=160,
)
eval_kwargs = dict(eval_horizon=200)
# Create AgentManager for REINFORCE and RSKernelUCBVI
multimanagers = MultipleManagers()
multimanagers.append(
AgentManager(
AgentManager(
REINFORCEAgent,
env,
init_kwargs=params['reinforce'],
init_kwargs=params["reinforce"],
fit_budget=100,
n_fit=4,
parallelization='thread')
parallelization="thread",
)
)
multimanagers.append(
AgentManager(
AgentManager(
RSKernelUCBVIAgent,
env,
init_kwargs=params['kernel'],
init_kwargs=params["kernel"],
fit_budget=100,
n_fit=4,
parallelization='thread')
parallelization="thread",
)
)
# Fit and plot
multimanagers.run()
plot_writer_data(
multimanagers.managers,
tag='episode_rewards',
preprocess_func=np.cumsum,
title="Cumulative Rewards")
multimanagers.managers,
tag="episode_rewards",
preprocess_func=np.cumsum,
title="Cumulative Rewards",
)
40 changes: 19 additions & 21 deletions docs/basics/create_agent.rst
@@ -9,35 +9,33 @@ Create an agent
rlberry_ requires you to use a **very simple interface** to write agents, with basically
two methods to implement: :code:`fit()` and :code:`eval()`.

The example below shows how to create an agent.
The example below shows how to create an agent.


.. code-block:: python
import numpy as np
from rlberry.agents import Agent
class MyAgent(Agent):
class MyAgent(Agent):
name = "MyAgent"
def __init__(self,
env,
param1=0.99,
param2=1e-5,
**kwargs): # it's important to put **kwargs to ensure compatibility with the base class
# self.env is initialized in the base class
# An evaluation environment is also initialized: self.eval_env
Agent.__init__(self, env, **kwargs)
def __init__(
self, env, param1=0.99, param2=1e-5, **kwargs
): # it's important to put **kwargs to ensure compatibility with the base class
# self.env is initialized in the base class
# An evaluation environment is also initialized: self.eval_env
Agent.__init__(self, env, **kwargs)
self.param1 = param1
self.param2 = param2
self.param1 = param1
self.param2 = param2
def fit(self, budget, **kwargs):
def fit(self, budget, **kwargs):
"""
The parameter budget can represent the number of steps, the number of episodes etc,
depending on the agent.
* Interact with the environment (self.env);
* Interact with the environment (self.env);
* Train the agent
* Return useful information
"""
@@ -48,26 +46,26 @@ The example below shows how to create an agent.
state = self.env.reset()
done = False
while not done:
action = ...
next_state, reward, done, _ = self.env.step(action)
rewards[ep] += reward
action = ...
next_state, reward, done, _ = self.env.step(action)
rewards[ep] += reward
info = {'episode_rewards': rewards}
info = {"episode_rewards": rewards}
return info
def eval(self, **kwargs):
"""
Returns a value corresponding to the evaluation of the agent on the
Returns a value corresponding to the evaluation of the agent on the
evaluation environment.
For instance, it can be a Monte-Carlo evaluation of the policy learned in fit().
"""
return 0.0
.. note:: It's important that your agent accepts optional `**kwargs` and pass it to the base class as :code:`Agent.__init__(self, env, **kwargs)`.
.. note:: It's important that your agent accepts optional `**kwargs` and pass it to the base class as :code:`Agent.__init__(self, env, **kwargs)`.


.. seealso::
Documentation of the classes :class:`~rlberry.agents.agent.Agent`
Documentation of the classes :class:`~rlberry.agents.agent.Agent`
and :class:`~rlberry.agents.agent.AgentWithSimplePolicy`.
31 changes: 16 additions & 15 deletions docs/basics/evaluate_agent.rst
@@ -20,7 +20,7 @@ as shown in the examples below.
# Environment (constructor, kwargs)
env = (gym_make, dict(id='CartPole-v1'))
env = (gym_make, dict(id="CartPole-v1"))
# Initial set of parameters
params = dict(
@@ -41,14 +41,15 @@ as shown in the examples below.
eval_kwargs=eval_kwargs,
fit_budget=fit_budget,
n_fit=4,
parallelization='thread')
parallelization="thread",
)
# Fit the 4 instances
stats.fit()
# The fit() method of REINFORCEAgent logs data to a :class:`~rlberry.utils.writers.DefaultWriter`
# object. The method below can be used to plot those data!
plot_writer_data(stats, tag='episode_rewards')
plot_writer_data(stats, tag="episode_rewards")
@@ -72,17 +73,17 @@ For :class:`~rlberry.agents.reinforce.reinforce.REINFORCEAgent`, this method loo
----------
trial: optuna.trial
"""
batch_size = trial.suggest_categorical('batch_size', [1, 4, 8, 16, 32])
gamma = trial.suggest_categorical('gamma', [0.9, 0.95, 0.99])
learning_rate = trial.suggest_loguniform('learning_rate', 1e-5, 1)
entr_coef = trial.suggest_loguniform('entr_coef', 1e-8, 0.1)
batch_size = trial.suggest_categorical("batch_size", [1, 4, 8, 16, 32])
gamma = trial.suggest_categorical("gamma", [0.9, 0.95, 0.99])
learning_rate = trial.suggest_loguniform("learning_rate", 1e-5, 1)
entr_coef = trial.suggest_loguniform("entr_coef", 1e-8, 0.1)
return {
'batch_size': batch_size,
'gamma': gamma,
'learning_rate': learning_rate,
'entr_coef': entr_coef,
}
"batch_size": batch_size,
"gamma": gamma,
"learning_rate": learning_rate,
"entr_coef": entr_coef,
}
Now we can use the :meth:`optimize_hyperparams` method
@@ -93,13 +94,13 @@ of :class:`~rlberry.manager.agent_manager.AgentManager` to find good parameters
# Run optimization and print results
stats.optimize_hyperparams(
n_trials=100,
timeout=10, # stop after 10 seconds
timeout=10, # stop after 10 seconds
n_fit=2,
sampler_method='optuna_default'
sampler_method="optuna_default",
)
print(stats.best_hyperparams)
# Calling fit() again will train the agent with the optimized parameters
stats.fit()
plot_writer_data(stats, tag='episode_rewards')
plot_writer_data(stats, tag="episode_rewards")
4 changes: 2 additions & 2 deletions docs/basics/quick_start_rl/quickstart.rst
@@ -1,7 +1,7 @@
.. _quick_start:

.. highlight:: none

Quick Start for Reinforcement Learning in rlberry
=================================================

@@ -60,7 +60,7 @@ Let us see a graphical representation
.. parsed-literal::
ffmpeg version n5.0 Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 11.2.0 (GCC)
configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-amf --enable-avisynth --enable-cuda-llvm --enable-lto --enable-fontconfig --enable-gmp --enable-gnutls --enable-gpl --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libdav1d --enable-libdrm --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libiec61883 --enable-libjack --enable-libmfx --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-librav1e --enable-librsvg --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxcb --enable-libxml2 --enable-libxvid --enable-libzimg --enable-nvdec --enable-nvenc --enable-shared --enable-version3
(Diffs for the remaining changed files are not shown.)