tianshou.core.random
Adapted from keras-rl.
class tianshou.core.random.GaussianWhiteNoiseProcess(mu=0.0, sigma=1.0, sigma_min=None, n_steps_annealing=1000, size=1)
Bases: tianshou.core.random.AnnealedGaussianProcess
Class for Gaussian white noise. At each timestep, a sample is drawn from an exact Gaussian distribution. The std of the Gaussian can be annealed over time, but samples at different timesteps are independent of each other.
Parameters:
- mu – A float defaulting to 0. The mean of the Gaussian distribution.
- sigma – A float defaulting to 1. The std of the Gaussian distribution.
- sigma_min – Optional. A float. The minimum std, at which annealing stops. It defaults to None, in which case no annealing takes place.
- n_steps_annealing – Optional. An int. The total number of steps over which annealing happens. Only effective when sigma_min is not None.
- size – An int or tuple of ints. It corresponds to the shape of the action of the environment.
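The interaction between sigma, sigma_min and n_steps_annealing can be illustrated with a minimal NumPy sketch. It is not tianshou's code: the helper names annealed_sigma and sample_white_noise are hypothetical, and the linear annealing schedule is an assumption based on the keras-rl implementation this module says it is adapted from.

```python
import numpy as np


def annealed_sigma(step, sigma=1.0, sigma_min=None, n_steps_annealing=1000):
    """Std at a given step, ASSUMING a linear schedule from sigma to sigma_min."""
    if sigma_min is None:
        # No annealing: the std stays constant.
        return sigma
    slope = -(sigma - sigma_min) / n_steps_annealing
    # Decrease linearly, then hold at sigma_min once it is reached.
    return max(sigma_min, sigma + slope * step)


def sample_white_noise(rng, step, mu=0.0, size=1, **anneal_kwargs):
    """Draw one independent Gaussian sample with the std for this step."""
    return rng.normal(mu, annealed_sigma(step, **anneal_kwargs), size=size)


rng = np.random.default_rng(0)
noise = [
    sample_white_noise(rng, t, sigma=1.0, sigma_min=0.1,
                       n_steps_annealing=100, size=2)
    for t in range(200)
]
```

Each draw depends only on the current step count, never on earlier samples, which is what makes the noise "white".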
class tianshou.core.random.OrnsteinUhlenbeckProcess(theta, mu=0.0, sigma=1.0, dt=0.01, x0=None, size=1, sigma_min=None, n_steps_annealing=1000)
Bases: tianshou.core.random.AnnealedGaussianProcess
Class for the Ornstein-Uhlenbeck process, as used for exploration in DDPG. Implemented based on http://math.stackexchange.com/questions/1287634/implementing-ornstein-uhlenbeck-in-matlab . It is a temporally correlated Gaussian process: the distribution at the current timestep depends on the sample from the previous timestep. The resulting distribution is not exactly Gaussian but still resembles one.
Parameters:
- theta – A float. The mean-reversion rate of the process: the larger it is, the more strongly each sample is pulled back toward mu.
- mu – A float. The level toward which the process reverts. It is not exactly the mean of the distribution at a given timestep.
- sigma – A float. The scale of the noise term. It acts like the std of the Gaussian-like distribution to some extent.
- dt – A float. The time interval used to simulate the process discretely, as the process is mathematically defined in continuous time.
- x0 – Optional. A float. The initial value of "the sample from the previous timestep", used to draw the first sample. It defaults to zero.
- size – An int or tuple of ints. It corresponds to the shape of the action of the environment.
- sigma_min – Optional. A float. The minimum std, at which annealing stops. It defaults to None, in which case no annealing takes place.
- n_steps_annealing – An int. The total number of steps over which annealing happens.
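The roles of theta, mu, sigma, dt and x0 are easier to see in a discrete update rule. The sketch below uses the standard Euler-Maruyama step x_t = x_{t-1} + theta * (mu - x_{t-1}) * dt + sigma * sqrt(dt) * N(0, 1). The class name OUSketch and its seed argument are hypothetical, this is not tianshou's class, and sigma annealing is left out to keep the example short.

```python
import numpy as np


class OUSketch:
    """Hypothetical stand-in for illustration; not tianshou's implementation."""

    def __init__(self, theta, mu=0.0, sigma=1.0, dt=0.01, x0=None, size=1, seed=None):
        self.theta, self.mu, self.sigma, self.dt = theta, mu, sigma, dt
        self.size = size
        self.rng = np.random.default_rng(seed)
        # x0=None is treated as starting from zero.
        self.x_prev = np.zeros(size) if x0 is None else np.full(size, x0, dtype=float)

    def sample(self):
        # Drift pulls the state toward mu at rate theta.
        drift = self.theta * (self.mu - self.x_prev) * self.dt
        # Diffusion adds Gaussian noise scaled by sigma and sqrt(dt).
        diffusion = self.sigma * np.sqrt(self.dt) * self.rng.normal(size=self.size)
        self.x_prev = self.x_prev + drift + diffusion
        return self.x_prev


process = OUSketch(theta=0.15, mu=0.0, sigma=0.2, size=3, seed=0)
trajectory = np.stack([process.sample() for _ in range(1000)])
```

Because each sample starts from the previous one, consecutive values are correlated, unlike the white-noise process where every draw is independent.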