Commit 347b8a4

sayankotor authored and simflin committed
seminar03
1 parent 123b524 commit 347b8a4

10 files changed: +1554 −0 lines changed
week03_convnets/README.md

Lines changed: 19 additions & 0 deletions
__Note__: Seminars assume that you remember batch normalization and dropout from the last lecture. If you don't, go recap week2.


## Materials
- [russian] Convolutional networks - [video](https://yadi.sk/i/hDIkaR4H3EtnXM)
- [english] Convolutional networks (Karpathy) - [video](https://www.youtube.com/watch?v=AQirPKrAyDg)

- Reading
  - http://cs231n.github.io/convolutional-networks/
  - http://cs231n.github.io/understanding-cnn/
  - [a deep learning neophyte cheat sheet](http://www.kdnuggets.com/2016/03/must-know-tips-deep-learning-part-1.html)
  - [more material on vision](https://bavm2013.splashthat.com/img/events/46439/assets/34a7.ranzato.pdf)
  - a [CNN trainer in a browser](https://cs.stanford.edu/people/karpathy/convnetjs/demo/cifar10.html)


## Assignment

As usual, go to seminar_pytorch.ipynb and follow the instructions from there.

There's also ./other_frameworks if you're more into theano/tf.
Lines changed: 156 additions & 0 deletions

The added file is a Jupyter notebook; its cells are rendered below.

**Cell 1 (code):**

```python
import torch, torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

# a special module that converts [batch, channel, w, h] to [batch, units]
class Flatten(nn.Module):
    def forward(self, input):
        return input.view(input.size(0), -1)
```

**Cell 2 (code):**

```python
# assuming input shape [batch, 3, 64, 64]
cnn = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=2048, kernel_size=(3, 3)),
    nn.Conv2d(in_channels=2048, out_channels=1024, kernel_size=(3, 3)),
    nn.Conv2d(in_channels=1024, out_channels=512, kernel_size=(3, 3)),
    nn.ReLU(),
    nn.MaxPool2d((6, 6)),
    nn.Conv2d(in_channels=6, out_channels=32, kernel_size=(20, 20)),
    nn.Conv2d(in_channels=32, out_channels=64, kernel_size=(20, 20)),
    nn.Conv2d(in_channels=64, out_channels=128, kernel_size=(20, 20)),
    nn.Softmax(),
    Flatten(),
    nn.Linear(64, 256),
    nn.Softmax(),
    nn.Linear(256, 10),
    nn.Sigmoid(),
    nn.Dropout(0.5)
)
```

**Cell 3 (markdown):** a long run of empty code fences (presumably spoiler space before the answers), followed by:

# Book of grudges
* Input channels are wrong literally half the time (after pooling, after flatten).
* Too many filters for the first 3x3 convolution: it produces an enormous matrix while there just aren't enough relevant combinations of 3x3 patches (overkill).
* Usually, the further you go, the more filters you need.
* Large filters (e.g. 10x10) are generally bad practice, and you definitely need more than 10 of them.
* The second 10x10 convolution gets an 8x6x6 image as input, so it's technically unable to perform such a convolution.
* A softmax nonlinearity effectively makes only one or a few neurons of the entire layer "fire", rendering a 512-neuron layer almost useless. Softmax at the output layer is okay, though.
* Dropout after probability prediction is just lame: a few random classes get probability 0, so your probabilities no longer sum to 1 and cross-entropy goes to -inf.

The last cell is empty; the notebook metadata declares a Python 3 kernel (nbformat 4).
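The grudges above translate into a few concrete rules: grow the filter count with depth, put nonlinearities between convolutions, keep softmax out of hidden layers, and apply dropout before the classifier rather than after the probabilities. As a sketch only (this is not the seminar's official solution; all layer sizes here are illustrative choices), a repaired architecture for the same `[batch, 3, 64, 64]` input might look like:

```python
import torch, torch.nn as nn

class Flatten(nn.Module):
    def forward(self, x):
        return x.view(x.size(0), -1)

# hypothetical fix: few filters early, more later; ReLU between convs;
# no softmax/sigmoid in hidden layers; dropout BEFORE the final linear
# layer, and raw logits (no activation) at the output.
cnn = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),   # -> [batch, 32, 64, 64]
    nn.ReLU(),
    nn.MaxPool2d(2),                              # -> [batch, 32, 32, 32]
    nn.Conv2d(32, 64, kernel_size=3, padding=1),  # -> [batch, 64, 32, 32]
    nn.ReLU(),
    nn.MaxPool2d(2),                              # -> [batch, 64, 16, 16]
    Flatten(),                                    # -> [batch, 64 * 16 * 16]
    nn.Dropout(0.5),
    nn.Linear(64 * 16 * 16, 10),                  # logits for 10 classes
)

x = torch.randn(4, 3, 64, 64)
logits = cnn(x)
print(logits.shape)  # torch.Size([4, 10])
```

Note how every `in_channels` matches the previous layer's output and the `Linear` input size is the actual flattened feature count, which fixes the shape grudges directly.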

week03_convnets/load_images.py

Whitespace-only changes.
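The last grudge in the notebook above (dropout applied after probability predictions) can be demonstrated in a few lines. With `p=0.5`, `nn.Dropout` zeroes roughly half the entries and scales the survivors by `1/(1-p) = 2`, so a valid probability vector stops summing to 1:

```python
import torch, torch.nn as nn

torch.manual_seed(0)
probs = torch.softmax(torch.randn(1, 10), dim=-1)  # a valid distribution
drop = nn.Dropout(0.5)
drop.train()                                       # dropout is active in train mode

dropped = drop(probs)  # zeros some entries, scales the rest by 2

print(float(probs.sum()))    # ≈ 1.0
print(float(dropped.sum()))  # almost surely not 1.0 anymore
# any class whose probability was zeroed now makes log(p) = -inf,
# which is exactly how cross-entropy blows up.
```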

0 commit comments