tianshou.core.utils

tianshou.core.utils.get_soft_update_op(update_fraction, including_nets, excluding_nets=None)[source]

Builds the graph op that softly updates the “old net” of policies and value_functions, as suggested in DDPG. It updates each tf.Variable \(\theta'\) in the old net using the corresponding tf.Variable \(\theta\) in the current network, as \(\theta' = \tau \theta + (1 - \tau) \theta'\).

Parameters:
  • update_fraction – A float in the range \([0, 1]\), corresponding to \(\tau\) in the update equation.
  • including_nets – A list of policies and/or value_functions. All tf.Variables in these networks are included in the update. Shared Variables are updated only once in case of layer sharing among the networks.
  • excluding_nets – Optional. A list of policies and/or value_functions, defaulting to None. All tf.Variables in these networks are excluded from the update determined by including_nets. This is useful when layers are shared among networks and we want to update only the Variables in including_nets that are not shared.
Returns:

A list of tf.assign() ops specifying the soft update.
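The update rule itself can be illustrated outside the TensorFlow graph. Below is a minimal NumPy sketch of the same arithmetic the assign ops perform; the function name soft_update is hypothetical and only demonstrates the equation, not this library's graph-building API.

```python
import numpy as np

def soft_update(old_params, new_params, tau):
    """Apply theta' = tau * theta + (1 - tau) * theta' elementwise.

    old_params / new_params: lists of arrays playing the role of the
    old net's and current net's variables (hypothetical stand-ins for
    tf.Variables). tau is the update_fraction.
    """
    return [tau * new + (1.0 - tau) * old
            for old, new in zip(old_params, new_params)]

# With tau = 0.01, each old parameter moves 1% of the way
# toward the corresponding current parameter per call.
theta_old = [np.zeros(2), np.ones(2)]
theta = [np.ones(2), np.full(2, 3.0)]
updated = soft_update(theta_old, theta, tau=0.01)
```

Repeatedly running the returned assign ops in a session has the same effect as iterating this function, so the old net tracks the current net with an exponential lag controlled by update_fraction.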