
Commit 7f5bfd3: Code files added
1 parent 493297d commit 7f5bfd3
9 files changed, +2096 -0 lines changed
@@ -0,0 +1,333 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
7+
"# Demystifying GAN Loss Function"
8+
]
9+
},
10+
{
11+
"cell_type": "markdown",
12+
"metadata": {},
13+
"source": [
14+
"Now that we have understood how GANs work in detail, we will examine the loss function of GAN. Before going ahead let us recap the notations. \n",
15+
"\n",
16+
"* A noise which is fed as an input to the generator is represented by $z$ \n",
17+
"\n",
18+
"* Uniform or normal distribution from which the noise $z$ is sampled is represented by $p_z$\n",
19+
"\n",
20+
"* An input image is represented by $x$\n",
21+
"\n",
22+
"* Real data distribution i.e distribution of our training set is represented by $p_r$\n",
23+
"\n",
24+
"* Fake data distribution i.e distribution of the generator is represented by $p_g$\n",
25+
"\n",
26+
"When we write, $x \\sim p_{r}(x)$ , it implies that image $x$ is sampled from the real distribution $p_r$\n",
27+
". Similarly, $x \\sim p_{g}(x)$ denotes that image $x$ is sampled from the generator\n",
28+
"distribution $p_g$ and $z \\sim p_{z}(z)$ implies that the generator input $z$ is sampled from the\n",
29+
"uniform distribution $p_z$."
30+
]
31+
},
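To make the notation concrete, here is a minimal NumPy sketch (added for illustration, not part of the original text) of the two sampling operations; the batch size, noise dimension, and the placeholder `real_images` array are assumptions.

```python
import numpy as np

batch_size, noise_dim = 4, 100

# z ~ p_z(z): sample the generator input from a uniform distribution
z = np.random.uniform(-1.0, 1.0, size=(batch_size, noise_dim))

# x ~ p_r(x): sample real images from the training set (a random stand-in array here)
real_images = np.random.rand(1000, 28 * 28)   # placeholder for an actual dataset
idx = np.random.randint(0, len(real_images), size=batch_size)
x = real_images[idx]

print(z.shape)   # (4, 100)
print(x.shape)   # (4, 784)
```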
32+
{
33+
"cell_type": "markdown",
34+
"metadata": {},
35+
"source": [
36+
"As we learned that both the generator and discriminator are neural networks and both of\n",
37+
"them update their parameters through backpropagation. We need to find the\n",
38+
"optimal generator parameter $\\theta_g $ and discriminator parameter $\\theta_d$."
39+
]
40+
},
41+
{
42+
"cell_type": "markdown",
43+
"metadata": {},
44+
"source": [
45+
"## Discriminator Loss "
46+
]
47+
},
48+
{
49+
"cell_type": "markdown",
50+
"metadata": {},
51+
"source": [
52+
"Now we will see the loss function of the discriminator. We know that the goal of the\n",
53+
"discriminator is to classify whether the image is real or fake image. Let us denote\n",
54+
"discriminator by $D$.\n",
55+
"\n",
56+
"The loss function of the discriminator is given as, \n",
57+
"\n",
58+
"$$\\max _{d} L(D, G)=\\mathbb{E}_{x \\sim p_{r}(x)}\\left[\\log D\\left(x ; \\theta_{d}\\right)\\right]+\\mathbb{E}_{z \\sim p_{z}(z)}\\left[\\log \\left(1-D\\left(G\\left(z ; \\theta_{g}\\right) ; \\theta_{d}\\right)\\right)\\right]$$"
59+
]
60+
},
61+
{
62+
"cell_type": "markdown",
63+
"metadata": {},
64+
"source": [
65+
"What does this mean though? Let us see each term by term. \n",
66+
"\n",
67+
"### First term\n",
68+
"\n",
69+
"Let us look at the first term,\n",
70+
"\n",
71+
"$$ \\mathbb{E}_{x \\sim p_{r}} \\log (D(x))$$"
72+
]
73+
},
74+
{
75+
"cell_type": "markdown",
76+
"metadata": {},
77+
"source": [
78+
"* $x \\sim p_{r}(x)$ implies we are sampling input $x$ from the real data distribution $p_r$, so $x$ is a\n",
79+
"real image. \n",
80+
"\n",
81+
"* $D(x)$ implies that we are feeding the input image $x$ to the discriminator $D$ and it will\n",
82+
"return the probability of input image $x$ to be a real image. \n",
83+
"\n",
84+
"Since we know that $x$ is a real image i.e from a real data distribution, we need to maximize the probability of $D(x)$:\n",
85+
"\n",
86+
"$$\\max D(x)$$\n",
87+
"\n",
88+
"But instead of maximizing raw probabilities we maximize log probabilities as we learned in\n",
89+
"chapter 7, we can write, \n",
90+
"\n",
91+
"$$ \\max \\log D(x)$$\n",
92+
"\n",
93+
"So our final equation becomes:\n",
94+
"\n",
95+
"$$\\max \\mathbb{E}_{x \\sim p_{r}(x)}[\\log D(x)]$$\n",
96+
"\n",
97+
"__$\\mathbb{E}_{x \\sim p_{r}(x)}[\\log D(x)]$ implies the expectations of the log likelihood of\n",
98+
"input images sampled from the real data distribution being real.__"
99+
]
100+
},
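In practice this expectation is estimated by averaging over a mini-batch. The following is a small sketch (an illustration, not from the original notebook); the values in `d_real` are assumed discriminator outputs $D(x)$ on real images.

```python
import numpy as np

# Hypothetical discriminator outputs D(x) for a mini-batch of real images
d_real = np.array([0.92, 0.85, 0.70, 0.95])

# Mini-batch estimate of E_{x ~ p_r(x)}[log D(x)]
first_term = np.mean(np.log(d_real))
print(first_term)  # approaches 0 as the discriminator grows confident the images are real
```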
101+
{
102+
"cell_type": "markdown",
103+
"metadata": {},
104+
"source": [
105+
"### Second term"
106+
]
107+
},
108+
{
109+
"cell_type": "markdown",
110+
"metadata": {},
111+
"source": [
112+
"Now, let us look at the second term\n",
113+
"\n",
114+
"$$\\mathbb{E}_{z \\sim p_{(z)}}[\\log (1-D(G(z)))] $$\n",
115+
"\n",
116+
"\n",
117+
"* $z \\sim p_{z}(z)$ implies we are sampling a random noise $z$ from the uniform distribution $p_z$.\n",
118+
"\n",
119+
"* $G(z)$ implies that the generator $G$ takes the random noise $z$ as an input and returns an\n",
120+
"image based on its implicitly learned distribution $p_g$.\n",
121+
"\n",
122+
"* $D(G(z))$ implies we are feeding the image generated by the generator to the\n",
123+
"discriminator $D$ and it will return the probability that input image to be a real image. \n",
124+
"\n",
125+
"\n",
126+
"If we subtract 1 from $D(G(z))$ then it will return the probability of the input image being\n",
127+
"a fake image.\n",
128+
"\n",
129+
"$$1-D(G(z))$$\n",
130+
"\n",
131+
"Since we know $z$ is not a real image, the discriminator will maximize this probability,\n",
132+
"ie discriminator maximizes the probability $z$ of being classified as a fake image. So we write\n",
133+
"\n",
134+
"$\\max 1-D(G(z))$\n",
135+
"\n",
136+
"Instead of maximizing raw probabilities, we maximize the log probability, so we write,\n",
137+
"\n",
138+
"$$ \\max \\log (1-D(G(z)))$$\n",
139+
"\n",
140+
"__$\\mathbb{E}_{z \\sim p_{z}(z)}[\\log (1-D(G(z)))]_{\\mathrm{i}}$ implies the expectations i.e expectations of the log\n",
141+
"likelihood of input images generated by the generator being fake.__"
142+
]
143+
},
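Similarly, here is a minimal sketch of estimating the second term over a mini-batch (again an illustration; `d_fake` holds assumed discriminator outputs $D(G(z))$ on generated images):

```python
import numpy as np

# Hypothetical discriminator outputs D(G(z)) for a mini-batch of generated images
d_fake = np.array([0.08, 0.15, 0.30, 0.05])

# Mini-batch estimate of E_{z ~ p_z(z)}[log(1 - D(G(z)))]
second_term = np.mean(np.log(1.0 - d_fake))
print(second_term)  # approaches 0 as the discriminator confidently rejects the fakes
```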
144+
{
145+
"cell_type": "markdown",
146+
"metadata": {},
147+
"source": [
148+
"### Final term"
149+
]
150+
},
151+
{
152+
"cell_type": "markdown",
153+
"metadata": {},
154+
"source": [
155+
"So, combining these two terms, loss function of the discriminator is given as,\n",
156+
"\n",
157+
"$$ \\max _{d} L(D, G)=\\mathbb{E}_{x \\sim p_{r}(x)}\\left[\\log D\\left(x ; \\theta_{d}\\right)\\right]+\\mathbb{E}_{z \\sim p_{z}(z)}\\left[\\log \\left(1-D\\left(G\\left(z ; \\theta_{g}\\right) ; \\theta_{d}\\right)\\right)\\right]$$\n",
158+
"\n",
159+
"Where $\\theta_d$ and $\\theta_g$ are the parameters of the discriminator and generator network\n",
160+
"respectively"
161+
]
162+
},
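Putting the two terms together, here is a rough sketch of the full discriminator objective on a mini-batch (the probabilities are made-up values; a small `eps` guards against `log(0)`):

```python
import numpy as np

def discriminator_objective(d_real, d_fake, eps=1e-8):
    # E[log D(x)] + E[log(1 - D(G(z)))], estimated over a mini-batch
    return np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))

d_real = np.array([0.92, 0.85, 0.70, 0.95])  # assumed D(x) on real images
d_fake = np.array([0.08, 0.15, 0.30, 0.05])  # assumed D(G(z)) on generated images
print(discriminator_objective(d_real, d_fake))  # the discriminator tries to maximize this
```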
163+
{
164+
"cell_type": "markdown",
165+
"metadata": {},
166+
"source": [
167+
"## Generator loss\n",
168+
"\n",
169+
"The loss function of the generator can be given as,"
170+
]
171+
},
172+
{
173+
"cell_type": "markdown",
174+
"metadata": {},
175+
"source": [
176+
"$$ \\min _{g} L(D, G)=\\mathbb{E}_{z \\sim p_{z}(z)}\\left[\\log \\left(1-D\\left(G\\left(z ; \\theta_{g}\\right) ; \\theta_{d}\\right)\\right)\\right]$$\n",
177+
"\n",
178+
"We know that the goal of the generator is to fool the discriminator to classify the fake image\n",
179+
"as a real image. \n",
180+
"\n",
181+
"In the previous section, we saw, $\\mathbb{E}_{z \\sim p_{z}(z)}[\\log (1-D(G(z)))]_{\\mathrm{}}$ implies the probability of classifying the input image as a\n",
182+
"fake image and the discriminator maximizes this probabilities for correctly classifying the\n",
183+
"fake image as fake. \n",
184+
"\n",
185+
"\n",
186+
"But the generator wants to minimize this probability. As the generator wants to fool the\n",
187+
"discriminator, it minimizes this probability of input image being classified as fake. The loss\n",
188+
"function of the generator can be given as,\n",
189+
"\n",
190+
"$$\\min _{g} L(D, G)=\\mathbb{E}_{z \\sim p_{z}(z)}\\left[\\log \\left(1-D\\left(G\\left(z ; \\theta_{g}\\right) ; \\theta_{d}\\right)\\right)\\right]$$"
191+
]
192+
},
193+
{
194+
"cell_type": "markdown",
195+
"metadata": {},
196+
"source": [
197+
"## Total Loss\n",
198+
"\n",
199+
"\n",
200+
"We just learned the loss function of generator and discriminator, combining these two\n",
201+
"losses, we write our final loss function can be written as,"
202+
]
203+
},
204+
{
205+
"cell_type": "markdown",
206+
"metadata": {},
207+
"source": [
208+
"$$ \\min _{G} \\max _{D} L(D, G)=\\mathbb{E}_{x \\sim p_{r}(x)}[\\log D(x)]+\\mathbb{E}_{z \\sim p_{z}(z)}[\\log (1-D(G(z)))]$$\n",
209+
"\n",
210+
"\n",
211+
"So our objective function is basically a min-max objective function i.e maximization for the\n",
212+
"discriminator and minimization for the generator and we find the optimal generator\n",
213+
"parameter $\\theta_g$ and discriminator parameter $\\theta_d$ through backpropagating the respective\n",
214+
"networks.\n",
215+
"\n"
216+
]
217+
},
218+
{
219+
"cell_type": "markdown",
220+
"metadata": {},
221+
"source": [
222+
"So we perform gradient ascent i.e maximization on the discriminator and update the discriminator parameter $\\theta_d$:\n",
223+
" \n",
224+
" $$ \\nabla_{\\theta_{d}} \\frac{1}{m} \\sum_{i=1}^{m}\\left[\\log D\\left(\\boldsymbol{x}^{(i)}\\right)+\\log \\left(1-D\\left(G\\left(\\boldsymbol{z}^{(i)}\\right)\\right)\\right)\\right]$$\n",
225+
" \n",
226+
" \n",
227+
"And gradient descent i.e minimization on the generator and update the generator parameter $\\theta_g$:\n",
228+
"\n",
229+
"$$\\nabla_{\\theta_{g}} \\frac{1}{m} \\sum_{i=1}^{m} \\log \\left(1-D\\left(G\\left(\\boldsymbol{z}^{(i)}\\right)\\right)\\right)$$"
230+
]
231+
},
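To make the alternating updates concrete, here is a minimal, self-contained NumPy sketch (not the author's code) on a 1-D toy problem: real samples come from a Gaussian, the generator is a linear map of uniform noise, and the discriminator is a single logistic unit. The gradients are written out by hand, and the learning rate, batch size, and data are all illustrative assumptions.

```python
import numpy as np

np.random.seed(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Toy setup: real data ~ N(3, 0.5); G(z) = w_g*z + b_g; D(x) = sigmoid(w_d*x + b_d)
w_g, b_g = 0.1, 0.0          # generator parameters     (theta_g)
w_d, b_d = 0.1, 0.0          # discriminator parameters (theta_d)
lr, m = 0.05, 64             # learning rate and mini-batch size

for step in range(2000):
    x_real = np.random.normal(3.0, 0.5, m)      # x ~ p_r(x)
    z = np.random.uniform(-1.0, 1.0, m)         # z ~ p_z(z)
    x_fake = w_g * z + b_g                      # G(z)

    # Gradient ASCENT on the discriminator: maximize log D(x) + log(1 - D(G(z)))
    d_real = sigmoid(w_d * x_real + b_d)
    d_fake = sigmoid(w_d * x_fake + b_d)
    grad_wd = np.mean((1 - d_real) * x_real) - np.mean(d_fake * x_fake)
    grad_bd = np.mean(1 - d_real) - np.mean(d_fake)
    w_d += lr * grad_wd
    b_d += lr * grad_bd

    # Gradient DESCENT on the generator: minimize log(1 - D(G(z)))
    d_fake = sigmoid(w_d * x_fake + b_d)
    grad_wg = np.mean(-d_fake * w_d * z)
    grad_bg = np.mean(-d_fake * w_d)
    w_g -= lr * grad_wg
    b_g -= lr * grad_bg

# Sanity check: the mean of the generated samples should drift toward the real mean (3.0)
samples = w_g * np.random.uniform(-1.0, 1.0, 1000) + b_g
print("mean of generated samples: %.3f" % np.mean(samples))
```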
232+
{
233+
"cell_type": "markdown",
234+
"metadata": {},
235+
"source": [
236+
"However, optimizing the above generator objective does not work properly and causes a\n",
237+
"stability issue. So we introduce a new form of loss called heuristic loss. "
238+
]
239+
},
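As a small numeric illustration of this issue (added here, not from the original text): early in training the discriminator easily rejects generated samples, so $D(G(z))$ is close to 0. In that regime the gradient of $\log(1-D(G(z)))$ with respect to $D(G(z))$ is tiny, while the gradient of the heuristic objective $\log D(G(z))$ introduced below stays large.

```python
import numpy as np

# D(G(z)) early in training, when the discriminator easily spots the fakes
d_fake = np.array([0.001, 0.01, 0.1, 0.5])

# |d/dD log(1 - D)| = 1 / (1 - D)  -> small when D is near 0 (saturating objective)
grad_saturating = 1.0 / (1.0 - d_fake)

# |d/dD log D| = 1 / D             -> large when D is near 0 (heuristic objective)
grad_heuristic = 1.0 / d_fake

for d, gs, gh in zip(d_fake, grad_saturating, grad_heuristic):
    print("D(G(z)) = %.3f   saturating gradient = %.2f   heuristic gradient = %.2f" % (d, gs, gh))
```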
240+
{
241+
"cell_type": "markdown",
242+
"metadata": {},
243+
"source": [
244+
"## Heuristic Loss\n",
245+
"\n",
246+
"There is no change in the loss function of the discriminator it is written as,\n",
247+
"\n",
248+
"$$ \\max _{d} L(D, G)=\\mathbb{E}_{x \\sim p_{r}(x)}\\left[\\log D\\left(x ; \\theta_{d}\\right)\\right]+\\mathbb{E}_{z \\sim p_{z}(z)}\\left[\\log \\left(1-D\\left(G\\left(z ; \\theta_{g}\\right) ; \\theta_{d}\\right)\\right)\\right]$$"
249+
]
250+
},
251+
{
252+
"cell_type": "markdown",
253+
"metadata": {},
254+
"source": [
255+
"Now, let us look at the generator loss, \n",
256+
"\n",
257+
"$$ \\min _{g} L(D, G)=\\mathbb{E}_{z \\sim p_{z}(z)}\\left[\\log \\left(1-D\\left(G\\left(z ; \\theta_{g}\\right) ; \\theta_{d}\\right)\\right)\\right] $$"
258+
]
259+
},
260+
{
261+
"cell_type": "markdown",
262+
"metadata": {},
263+
"source": [
264+
"Can we change it to a maximizing equation just like our discriminators? How can we do\n",
265+
"that? We know that $ 1-D(G(Z)$ returns the probability of input image being fake and\n",
266+
"generator is minimizing this probability. \n",
267+
"\n",
268+
"\n",
269+
"Instead of doing this, we can write $D(G(z))$ it implies the probability of input image\n",
270+
"being real and now our generator can maximize this probability. It implies a generator is\n",
271+
"maxing the probability of the input fake image being classified as a real image. So the loss\n",
272+
"function of our generator now becomes,\n",
273+
"\n",
274+
"$$\\max _{g} L(D, G)=\\mathbb{E}_{z \\sim p_{z}(z)}\\left[\\log \\left(D\\left(G\\left(z ; \\theta_{g}\\right) ; \\theta_{d}\\right)\\right)\\right]$$\n",
275+
"\n",
276+
"So, now we have both the loss function of our discriminator and generator as maximizing\n",
277+
"terms i.e,\n",
278+
"\n",
279+
"$$\\max _{d} L(D, G)=\\mathbb{E}_{x \\sim p_{r}(x)}\\left[\\log D\\left(x ; \\theta_{d}\\right)\\right]+\\mathbb{E}_{z \\sim p_{z}(z)}\\left[\\log \\left(1-D\\left(G\\left(z ; \\theta_{g}\\right) ; \\theta_{d}\\right)\\right)\\right]$$\n",
280+
"\n",
281+
"$$\\max _{g} L(D, G)=\\mathbb{E}_{z \\sim p_{z}(z)}\\left[\\log \\left(D\\left(G\\left(z ; \\theta_{g}\\right) ; \\theta_{d}\\right)\\right)\\right]$$\n",
282+
"\n",
283+
"\n",
284+
"\n",
285+
"But instead of maximizing, if we can minimize the loss then we can apply our favorite\n",
286+
"gradient descent algorithms. Now how can we convert our maximizing problem into a\n",
287+
"minimization problem? It;'s so simple, just add a negative sign.\n",
288+
"So, our final loss function for the discriminator is given as,\n",
289+
"\n",
290+
"\n",
291+
"$$ \\boxed{L^{D}=-\\mathbb{E}_{x \\sim p_{r}(x)}[\\log D(x)]-\\mathbb{E}_{z \\sim p_{z}(z)}[\\log (1-D(G(z))]}$$\n"
292+
]
293+
},
294+
{
295+
"cell_type": "markdown",
296+
"metadata": {},
297+
"source": [
298+
"\n",
299+
"and the generator loss is,\n",
300+
"\n",
301+
"$$ \\boxed{L^{G}=-\\mathbb{E}_{z \\sim p_{z}(z)}[\\log (D(G(z)))]}$$"
302+
]
303+
},
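As a final sketch (an illustration, not the author's implementation), both minimization-form losses can be written as small NumPy functions; they are simply binary cross-entropy terms, with real labels for $D(x)$ and fake labels for $D(G(z))$ in $L^D$, and real labels for $D(G(z))$ in $L^G$. The example probabilities are assumed values.

```python
import numpy as np

def d_loss(d_real, d_fake, eps=1e-8):
    # L^D = -E[log D(x)] - E[log(1 - D(G(z)))]
    return -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))

def g_loss(d_fake, eps=1e-8):
    # L^G = -E[log D(G(z))]   (the heuristic, non-saturating generator loss)
    return -np.mean(np.log(d_fake + eps))

d_real = np.array([0.9, 0.8, 0.95])   # assumed D(x) on real images
d_fake = np.array([0.1, 0.2, 0.05])   # assumed D(G(z)) on generated images
print("L^D = %.4f" % d_loss(d_real, d_fake))
print("L^G = %.4f" % g_loss(d_fake))
```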
304+
{
305+
"cell_type": "markdown",
306+
"metadata": {},
307+
"source": [
308+
"In the next section, we will learn how to use GAN to generate images of handwritten digits. "
309+
]
310+
}
311+
],
312+
"metadata": {
313+
"kernelspec": {
314+
"display_name": "Python 2",
315+
"language": "python",
316+
"name": "python2"
317+
},
318+
"language_info": {
319+
"codemirror_mode": {
320+
"name": "ipython",
321+
"version": 2
322+
},
323+
"file_extension": ".py",
324+
"mimetype": "text/x-python",
325+
"name": "python",
326+
"nbconvert_exporter": "python",
327+
"pygments_lexer": "ipython2",
328+
"version": "2.7.12"
329+
}
330+
},
331+
"nbformat": 4,
332+
"nbformat_minor": 2
333+
}
