Commit ae91ee1

committed
week10 rl
1 parent 4d9057d commit ae91ee1

16 files changed

+3911
-0
lines changed

week10_rl/README.md

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
## Materials
* [Slides](https://yadi.sk/d/GG-GvN-13UhzFw)
* Video lecture by D. Silver - https://www.youtube.com/watch?v=KHZVXao4qXs
* Our [lecture](https://yadi.sk/i/I3M09HKQ3GKBiP), [seminar](https://yadi.sk/i/8f9NX_E73GKBkT)
* Alternative lecture by J. Schulman part 1 - https://www.youtube.com/watch?v=BB-BhTn6DCM
* Alternative lecture by J. Schulman part 2 - https://www.youtube.com/watch?v=Wnl-Qh2UHGg


## More materials
* A full-term course on reinforcement learning - [practical_rl](https://github.com/yandexdataschool/practical_rl)

* Actually proving the policy gradient for discounted rewards - [article](https://papers.nips.cc/paper/1713-policy-gradient-methods-for-reinforcement-learning-with-function-approximation.pdf)
* On the variance of policy gradient and optimal baselines: [article](https://papers.nips.cc/paper/4264-analysis-and-improvement-of-policy-gradient-estimation.pdf), another [article](https://arxiv.org/pdf/1301.2315.pdf)
* Generalized Advantage Estimation - a way you can speed up training for homework_*.ipynb - [article](https://arxiv.org/abs/1506.02438)

* Generalizing the log-derivative trick - [url](http://blog.shakirm.com/2015/11/machine-learning-trick-of-the-day-5-log-derivative-trick/)
* Combining policy gradient and Q-learning - [arxiv](https://arxiv.org/abs/1611.01626)
* Bayesian perspective on why the reparameterization & log-derivative tricks matter (Vetrov's take) - [pdf](https://www.sdsj.ru/slides/Vetrov.pdf)
* Adversarial review of policy gradient - [blog](http://www.argmin.net/2018/02/20/reinforce/)
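
For orientation, here is a minimal sketch of the Generalized Advantage Estimation recursion mentioned in the list above. It is not taken from the homework notebooks. The function name `gae_advantages`, its arguments, and the default `gamma` and `lam` values are assumptions chosen for illustration, and the trajectory segment is assumed not to contain a terminal state.

```python
def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Sketch of GAE for one trajectory segment (hypothetical helper).

    rewards[t]  - reward received after acting at step t
    values[t]   - critic's value estimate for the state at step t
    last_value  - value estimate for the state after the final step
    """
    # V(s_{t+1}) for every step; the last one is bootstrapped from last_value.
    next_values = list(values[1:]) + [last_value]
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards: A_t = delta_t + gamma * lam * A_{t+1}
    for t in reversed(range(len(rewards))):
        # One-step TD error.
        delta = rewards[t] + gamma * next_values[t] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

With `lam=0` this reduces to the one-step TD error, and with `lam=1` it becomes the discounted return (plus the bootstrapped tail) minus the value baseline, so `lam` trades bias against variance between those two extremes.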

week10_rl/optional/atari_lasagne.ipynb

Lines changed: 917 additions & 0 deletions
Large diffs are not rendered by default.

week10_rl/optional/atari_pytorch.ipynb

Lines changed: 671 additions & 0 deletions
Large diffs are not rendered by default.
