zaza0209/GEERL

Generalized Fitted Q-iteration with Clustered Data

This repository contains the Python implementation for the paper "Generalized Fitted Q-iteration with Clustered Data". The paper focuses on reinforcement learning (RL) in clustered environments with limited data, a common scenario in healthcare applications. We propose an optimal policy learning algorithm that integrates Generalized Estimating Equations (GEE) into the Bellman equation framework to account for intra-cluster correlations. Our approach not only minimizes the variance of the Q-function estimator but also ensures that the derived policy achieves minimal regret.
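
The idea of weighting the Bellman regression by a per-cluster working correlation matrix can be sketched as follows. This is a minimal illustration, not the code in `GEE_Q`: the linear feature map, the exchangeable working correlation with a fixed `rho`, the names `features` and `gee_fqi`, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def exchangeable_corr(m, rho):
    # Working correlation: 1 on the diagonal, rho everywhere else.
    return (1.0 - rho) * np.eye(m) + rho * np.ones((m, m))

def features(states, actions, n_actions):
    # Hypothetical linear basis: one [1, s] block per action.
    phi = np.zeros((len(states), 2 * n_actions))
    for a in range(n_actions):
        mask = actions == a
        phi[mask, 2 * a] = 1.0
        phi[mask, 2 * a + 1] = states[mask]
    return phi

def gee_fqi(clusters, n_actions=2, gamma=0.9, rho=0.5, n_iter=50):
    """Fitted Q-iteration where each regression step solves the GEE
    sum_i Phi_i^T V_i^{-1} (y_i - Phi_i beta) = 0 over clusters i."""
    beta = np.zeros(2 * n_actions)
    for _ in range(n_iter):
        lhs = np.zeros((beta.size, beta.size))
        rhs = np.zeros(beta.size)
        for s, a, r, s_next in clusters:
            phi = features(s, a, n_actions)
            # Bellman target: r + gamma * max_a' Q(s', a'; beta_old)
            q_next = np.column_stack([
                features(s_next, np.full(len(s_next), b), n_actions) @ beta
                for b in range(n_actions)
            ])
            y = r + gamma * q_next.max(axis=1)
            v_inv = np.linalg.inv(exchangeable_corr(len(s), rho))
            lhs += phi.T @ v_inv @ phi
            rhs += phi.T @ v_inv @ y
        # Tiny ridge term only for numerical stability.
        beta = np.linalg.solve(lhs + 1e-8 * np.eye(beta.size), rhs)
    return beta

# Synthetic clustered data: a shared cluster effect induces
# intra-cluster correlation in the rewards.
rng = np.random.default_rng(0)
clusters = []
for _ in range(20):
    m = 10
    s = rng.normal(size=m)
    a = rng.integers(0, 2, size=m)
    shared = rng.normal(scale=0.5)
    r = s * a + shared + rng.normal(scale=0.5, size=m)
    s_next = 0.5 * s + rng.normal(scale=0.5, size=m)
    clusters.append((s, a, r, s_next))

beta = gee_fqi(clusters, gamma=0.5, rho=0.5, n_iter=20)
```

Setting `rho=0.0` makes every working correlation the identity matrix, so each step reduces to an ordinary least-squares FQI update; a nonzero `rho` downweights redundant information from correlated observations in the same cluster.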

We illustrate the motivation for the proposed approach with a simple tabular example in which the optimal Q-function is known analytically (see Section 3.1 of the paper for details). Increasing the variance of the Q-function estimator leads to higher regret in the derived policies.
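
The link between estimator variance and regret can be seen in a toy calculation. The sketch below is not the example from Section 3.1: it assumes a single state with two actions, hypothetical optimal values `q_star = [1.0, 0.8]`, and independent Gaussian noise added to each Q-value before the greedy action is chosen.

```python
import numpy as np

def expected_regret(q_star, sigma, n_draws=100_000, seed=0):
    """Monte Carlo regret of the greedy policy under a noisy Q estimate."""
    rng = np.random.default_rng(seed)
    q_star = np.asarray(q_star, dtype=float)
    # Noisy estimate: true Q-values plus N(0, sigma^2) noise per action.
    q_hat = q_star + rng.normal(scale=sigma, size=(n_draws, q_star.size))
    chosen = q_hat.argmax(axis=1)
    # Regret: value lost by acting greedily on the noisy estimate.
    return float(np.mean(q_star.max() - q_star[chosen]))

q_star = [1.0, 0.8]  # hypothetical optimal Q-values for one state
regrets = {sigma: expected_regret(q_star, sigma) for sigma in (0.0, 0.1, 0.5, 1.0)}
for sigma, regret in regrets.items():
    print(f"sigma={sigma:.1f}  regret={regret:.4f}")
```

With no noise the greedy action is always optimal and the regret is zero; as `sigma` grows, the worse action is selected more often and the average regret rises.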

File Overview

  • Folder functions/:

    • generate_joint_data: Generates data for simulation.
    • GEE_Q: Implements the generalized fitted Q-iteration (FQI) and the optimal FQI with GEE.
    • cov_struct: Contains several correlation structures for GEE.
    • utilities: Contains helper functions.
  • Folder simulation/:

    • R_autoex.py: Runs the generalized FQI and the proposed optimal FQI with different within-cluster correlation structures.
    • create_r_autoex.sh: Creates SLURM jobs to run R_autoex.py.
    • Qonline_single.py: Runs online DQN to approximate the optimal Q-function.
    • create_online_Q.sh: Creates SLURM jobs to run Qonline_single.py.
  • Folder simulation/regret/:

    • value_comparison.py: Estimates the regret of the optimal Q-function with different noise variances.
    • create_value_comparison.sh: Creates SLURM jobs to run value_comparison.py.
  • Folder semi/codes/:

    • run_individual_rl_ihs.py: Runs the semi-synthetic simulation based on the IHS dataset.
    • run_online_learning.py: Runs online policy learning on the semi-synthetic dataset.
  • Folder semi/models/: Contains the transition and reward functions learned from the IHS dataset.
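
For readers unfamiliar with GEE working correlations, the sketch below builds three standard structures (independence, exchangeable, AR(1)) with NumPy. The function names are hypothetical, and which structures the `cov_struct` module actually provides is not stated in this overview.

```python
import numpy as np

def independence_corr(m):
    # No within-cluster correlation: identity matrix.
    return np.eye(m)

def exchangeable_corr(m, rho):
    # Every pair of observations in a cluster shares correlation rho.
    return (1.0 - rho) * np.eye(m) + rho * np.ones((m, m))

def ar1_corr(m, rho):
    # Correlation decays geometrically with the lag: rho ** |i - j|.
    idx = np.arange(m)
    return rho ** np.abs(idx[:, None] - idx[None, :])

print(exchangeable_corr(3, 0.5))
print(ar1_corr(3, 0.5))
```

Each function returns an `m x m` matrix with ones on the diagonal; in a GEE its inverse weights the residuals of a cluster with `m` observations.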
