[SOLVED] CSC421/2516 Homework 5


Submission: You must submit your solutions as a PDF through MarkUs. You can produce the file however you like (e.g. LaTeX, Microsoft Word, scanner) as long as it is readable.

Late Submission: MarkUs will remain open until 3 days after the deadline, after which no late submissions will be accepted. The late penalty is 10% per day, rounded up.

Weekly homeworks are individual work. See the Course Information handout[1] for detailed policies.

Due to the shortened time period, this assignment has only one question, worth 6 points. You get the remaining 4 points for free.

  1. Variational Free Energy [6pts] Here, your job is to derive some of the formulas relating to the variational free energy (VFE) which we maximize when we train a VAE. Recall that the VFE is defined as:

F(q) = E_q[log p(x|z)] − D_KL(q(z) ‖ p(z)),

and KL divergence is defined as

D_KL(q(z) ‖ p(z)) = E_q[log q(z) − log p(z)].
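As a quick sanity check on this definition, the expectation can be estimated by drawing samples from q. A minimal plain-Python sketch, using the arbitrary illustrative choices q = N(1, 0.5²) and p = N(0, 1):

```python
import math
import random

def log_normal_pdf(z, mu, sigma):
    # log N(z; mu, sigma^2) for a scalar z
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (z - mu) ** 2 / (2.0 * sigma ** 2)

# KL(q || p) = E_q[log q(z) - log p(z)], estimated by Monte Carlo with z ~ q.
mu, sigma = 1.0, 0.5          # q = N(1, 0.5^2); p = N(0, 1)
rng = random.Random(0)
n = 200_000
est = 0.0
for _ in range(n):
    z = rng.gauss(mu, sigma)  # sample from q
    est += log_normal_pdf(z, mu, sigma) - log_normal_pdf(z, 0.0, 1.0)
est /= n
print(est)  # stabilizes near the true KL as n grows
```

With enough samples the estimate converges to the exact KL divergence between the two Gaussians.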

We assume the prior p(z) is a standard Gaussian:

p(z) = N(z; 0, I) = ∏_{i=1}^{D} p_i(z_i) = ∏_{i=1}^{D} N(z_i; 0, 1).

And the variational approximation q(z) is a fully factorized (i.e. diagonal) Gaussian:

q(z) = N(z; µ, Σ) = ∏_{i=1}^{D} q_i(z_i) = ∏_{i=1}^{D} N(z_i; µ_i, σ_i).
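Because the covariance is diagonal, the joint log-density is simply the sum of the per-dimension univariate log-densities. A small Python check (the means and standard deviations below are arbitrary illustrative values):

```python
import math
import random

def log_normal_pdf(z, mu, sigma):
    # log N(z; mu, sigma^2) for a scalar z
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (z - mu) ** 2 / (2.0 * sigma ** 2)

# An arbitrary diagonal Gaussian in D = 3 dimensions.
mus = [0.3, -1.2, 0.8]
sigmas = [0.5, 1.5, 0.9]
rng = random.Random(0)
z = [rng.gauss(m, s) for m, s in zip(mus, sigmas)]

# Product of per-dimension densities <=> sum of per-dimension log-densities.
log_prod = sum(log_normal_pdf(zi, m, s) for zi, m, s in zip(z, mus, sigmas))

# Multivariate formula with Sigma = diag(sigma_i^2):
# log N(z; mu, Sigma) = -(D/2) log(2 pi) - (1/2) log|Sigma| - (1/2)(z-mu)^T Sigma^{-1} (z-mu)
D = len(mus)
log_det = sum(math.log(s ** 2) for s in sigmas)
quad = sum((zi - m) ** 2 / s ** 2 for zi, m, s in zip(z, mus, sigmas))
log_joint = -0.5 * D * math.log(2.0 * math.pi) - 0.5 * log_det - 0.5 * quad

print(log_prod, log_joint)  # the two agree
```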

For reference, here are the formulas for the univariate and multivariate Gaussian densities:

N(z; µ, σ) = (1/√(2πσ²)) exp(−(z − µ)² / (2σ²)),

N(z; µ, Σ) = (2π)^(−D/2) |Σ|^(−1/2) exp(−(1/2)(z − µ)ᵀ Σ⁻¹ (z − µ)).

  • [1pt] Show that

F(q) = log p(x) − D_KL(q(z) ‖ p(z|x)).

(Hint: expand out definitions and apply Bayes’ Rule.)

  • [1pt] Show that the KL term decomposes as a sum of KL terms for the individual dimensions. In particular,

D_KL(q(z) ‖ p(z)) = ∑_i D_KL(q_i(z_i) ‖ p_i(z_i)).


  • [2pts] Give an explicit formula for the KL divergence D_KL(q_i(z_i) ‖ p_i(z_i)). This should be a mathematical expression involving µ_i and σ_i. If you like, you may suppress the i subscripts in your solution.
  • [2pts] One way to do gradient descent on the KL term is to apply the formula from part (c). Another approach is to compute stochastic gradients using the reparameterization trick:

∇_θ D_KL(q_i(z_i) ‖ p_i(z_i)) ≈ ∇_θ [log q_i(z̃_i) − log p_i(z̃_i)],

where

z̃_i = µ_i + σ_i ε_i,

and

ε_i ∼ N(0, 1).

Show how to compute a stochastic estimate of ∇_θ D_KL(q_i(z_i) ‖ p_i(z_i)) by doing backprop on the above equations. You may find it helpful to draw the computation graph. If you like, you may suppress the i subscripts in your solution.
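To illustrate what such a stochastic estimator looks like in practice, here is a minimal plain-Python sketch. The particular values µ = 0.7, σ = 0.6 and the closed-form reference expressions are illustrative assumptions for the check, not part of the assignment: it averages many single-sample reparameterized gradients and compares the average against the exact gradients of the closed-form KL between N(µ, σ²) and N(0, 1).

```python
import math
import random

def kl_grad_analytic(mu, sigma):
    # Exact gradients of the closed-form KL( N(mu, sigma^2) || N(0, 1) ),
    # i.e. of 0.5 * (mu^2 + sigma^2 - 1 - log(sigma^2)):
    #   d/dmu = mu,   d/dsigma = sigma - 1/sigma
    return mu, sigma - 1.0 / sigma

def kl_grad_sample(mu, sigma, eps):
    # Reparameterize: z = mu + sigma * eps with eps ~ N(0, 1), then backprop
    # through the single-sample objective
    #   log q(z) - log p(z) = -log(sigma) - eps**2 / 2 + z**2 / 2.
    z = mu + sigma * eps
    d_mu = z                          # chain rule: (z**2/2)' = z, dz/dmu = 1
    d_sigma = -1.0 / sigma + z * eps  # -log(sigma) term, plus dz/dsigma = eps
    return d_mu, d_sigma

mu, sigma = 0.7, 0.6   # arbitrary variational parameters
rng = random.Random(0)
n = 200_000
g_mu = g_sigma = 0.0
for _ in range(n):
    dm, ds = kl_grad_sample(mu, sigma, rng.gauss(0.0, 1.0))
    g_mu += dm
    g_sigma += ds
g_mu /= n
g_sigma /= n

exact_mu, exact_sigma = kl_grad_analytic(mu, sigma)
print(g_mu, exact_mu)        # averaged stochastic gradient vs. exact
print(g_sigma, exact_sigma)
```

The per-sample gradients are noisy, but they are unbiased: averaged over many draws of ε they match the exact gradients, which is what makes this estimator usable inside stochastic gradient descent.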


[1] http://www.cs.toronto.edu/~rgrosse/courses/csc421_2019/syllabus.pdf