In episodic environments (say, CartPole), episodes terminate after different numbers of steps in different random runs, because of both learning progress and randomness.
What would be a good way to plot uncertainty over different runs when we want total timesteps (interactions with the environment) as the x-values? That is the standard curve for assessing sample complexity in the community.
e.g. A curve like this
A naive solution is to fit a polynomial to each curve and generate new data points at consistent x-values. However, this might be very misleading, because some of the curves might have very sparse data points.
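To make the "consistent x-values" idea concrete, here is a minimal sketch of resampling every run onto a shared timestep grid. Note that it swaps the polynomial fit for piecewise-linear interpolation with `np.interp`, and it only covers the timestep range that all runs share, so nothing is extrapolated. The function name `resample_runs`, the `(timesteps, returns)` data layout, and the synthetic data are assumptions made for illustration, not something from an existing codebase. The sparsity concern still applies: sparse curves get smoothed over by the interpolation.

```python
import numpy as np


def resample_runs(runs, num_points=100):
    """Resample runs onto a common timestep grid (hypothetical helper).

    runs: list of (timesteps, returns) pairs, where timesteps holds the
    cumulative environment steps at each episode end (increasing) and
    returns holds the corresponding episode returns.
    """
    # Use only the timestep range covered by every run, to avoid extrapolating.
    start = max(t[0] for t, _ in runs)
    end = min(t[-1] for t, _ in runs)
    grid = np.linspace(start, end, num_points)

    # Piecewise-linear interpolation between the logged points of each run.
    resampled = np.stack([np.interp(grid, t, r) for t, r in runs])

    # Mean and (population) standard deviation across runs at each grid point.
    return grid, resampled.mean(axis=0), resampled.std(axis=0)


# Synthetic example data (assumed): 5 runs of 50 episodes with random lengths.
rng = np.random.default_rng(0)
runs = []
for _ in range(5):
    lengths = rng.integers(10, 200, size=50)
    timesteps = np.cumsum(lengths)   # total environment steps so far
    returns = lengths.astype(float)  # CartPole-like: return equals episode length
    runs.append((timesteps, returns))

grid, mean, std = resample_runs(runs)
```

The resulting `grid`, `mean`, and `std` arrays can then be drawn as a line with a shaded band, for example with matplotlib's `plot` and `fill_between` using `mean - std` and `mean + std` as the band edges.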
What do you think, @jachiam?