Commit e9dbef6

Subtract the entropy to encourage exploration.
1 parent 949b198 commit e9dbef6

1 file changed (+1, -1 lines changed)

research/a3c_blogpost/a3c_cartpole.py

Lines changed: 1 addition & 1 deletion
@@ -353,7 +353,7 @@ def compute_loss(self,
     policy_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=memory.actions,
                                                                  logits=logits)
     policy_loss *= tf.stop_gradient(advantage)
-    policy_loss = 0.01 * entropy
+    policy_loss -= 0.01 * entropy
     total_loss = tf.reduce_mean((0.5 * value_loss + policy_loss))
     return total_loss
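
To illustrate why the one-character change matters, here is a minimal pure-Python sketch of the per-step policy term. It assumes `entropy` is the Shannon entropy of the softmax policy (its definition sits outside this hunk), and the function names `softmax`, `policy_loss_with_entropy`, and `overwritten_loss` are hypothetical, not taken from a3c_cartpole.py. There are no gradients here, so the advantage is simply a constant, which is roughly the role `tf.stop_gradient` plays in the original.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def shannon_entropy(probs):
    """Entropy -sum(p * log p); zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def policy_loss_with_entropy(logits, action, advantage, beta=0.01):
    """Advantage-weighted cross-entropy minus an entropy bonus.

    beta=0.01 mirrors the coefficient in the diff (the `-=` line).
    """
    probs = softmax(logits)
    cross_entropy = -math.log(probs[action])  # -log pi(action | state)
    return cross_entropy * advantage - beta * shannon_entropy(probs)


def overwritten_loss(logits, beta=0.01):
    """What a plain `=` computes: the entropy term alone.

    The chosen action and the advantage no longer influence the result.
    """
    return beta * shannon_entropy(softmax(logits))


if __name__ == "__main__":
    logits = [2.0, 0.5, -1.0]
    print("fixed:      ", policy_loss_with_entropy(logits, action=0, advantage=1.5))
    print("overwritten:", overwritten_loss(logits))
```

With the plain assignment, the cross-entropy term is discarded and minimizing the loss would push entropy down. With `-=`, the policy-gradient term is kept and higher entropy lowers the loss, which matches the commit message's stated goal of encouraging exploration.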