
Commit 651efdb

Update README.md
1 parent a175461 commit 651efdb

File tree

1 file changed, +1 −1 lines changed


Week5/README.md

+1 −1
@@ -6,7 +6,7 @@ PPO is a policy gradient method that differently from the vanilla implementation
 
 For the DQN implementation and the choose of the hyperparameters, I mostly followed the [paper](https://arxiv.org/pdf/1707.06347.pdf). (In the last page there is a table with all the hyperparameters.). In case you want to fine-tune them, check out [Training with Proximal Policy Optimization](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-PPO.md)
 
-### [Learn the theory behind PPO](../README.md)
+### [Learn the theory behind PPO](https://github.com/andri27-ts/60_Days_RL_Challenge/blob/master/README.md#week-5---advanced-policy-gradients---trpo--ppo)
 
 
 ## Results
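The README text in this diff refers to PPO and the hyperparameter table in the linked paper (arXiv:1707.06347). As context for the change, here is a minimal sketch of PPO's clipped surrogate objective from that paper; the function name is illustrative and the clipping parameter ε = 0.2 is the paper's default, not something taken from this repository's code.

```python
import numpy as np

def ppo_clipped_objective(ratio, advantage, eps=0.2):
    """Clipped surrogate objective L^CLIP from the PPO paper (illustrative sketch).

    ratio:     pi_theta(a|s) / pi_theta_old(a|s) per sample
    advantage: advantage estimate per sample
    eps:       clipping range epsilon (paper default 0.2)
    """
    unclipped = ratio * advantage
    # Clip the probability ratio to [1 - eps, 1 + eps] before weighting the advantage,
    # then take the elementwise minimum so updates that move too far get no extra reward.
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return np.mean(np.minimum(unclipped, clipped))
```

In practice this quantity is maximized (or its negative minimized) over minibatches of rollout data; the min with the clipped term is what keeps each policy update close to the old policy.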
