A few weeks ago we read and discussed two papers extending the Variational Autoencoder (VAE) framework: “Importance Weighted Autoencoders” (Burda et al. 2016) and “Adversarial Autoencoders” (Makhzani et al. 2016). The former proposes a tighter lower bound on the marginal log-likelihood than the variational lower bound optimized by standard variational autoencoders. The latter replaces the KL divergence term in the variational lower bound, which measures the mismatch between the approximate posterior and the prior over latent codes, with a generative adversarial network (GAN) that encourages the aggregated posterior to match the prior distribution. In doing so, both extensions aim to improve the standard VAE’s ability to model complex posterior distributions.
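To make the "tighter lower bound" concrete, here is a minimal sketch on a toy model that does not come from either paper: z ~ N(0, 1) and x | z ~ N(z, 1), so the marginal p(x) is N(0, 2) in closed form. The proposal q(z | x) is deliberately set to the prior, and all function names are hypothetical. With K importance samples, the bound is the expected log of the average weight p(x, z) / q(z | x); K = 1 recovers the standard variational lower bound.

```python
import math
import random

def log_normal_pdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def iw_bound_estimate(x, k, rng):
    """Single-run estimate of the K-sample importance weighted bound."""
    log_w = []
    for _ in range(k):
        z = rng.gauss(0.0, 1.0)  # z ~ q(z | x); assumed equal to the prior here
        log_joint = log_normal_pdf(z, 0.0, 1.0) + log_normal_pdf(x, z, 1.0)
        log_q = log_normal_pdf(z, 0.0, 1.0)
        log_w.append(log_joint - log_q)
    m = max(log_w)  # log-mean-exp, shifted by the max for numerical stability
    return m + math.log(sum(math.exp(lw - m) for lw in log_w) / k)

def average_bound(x, k, repeats=2000, seed=0):
    """Monte Carlo average of the bound estimate over many repetitions."""
    rng = random.Random(seed)
    return sum(iw_bound_estimate(x, k, rng) for _ in range(repeats)) / repeats

x_obs = 1.0
log_px = log_normal_pdf(x_obs, 0.0, 2.0)  # exact marginal log-likelihood
elbo_avg = average_bound(x_obs, k=1)      # standard variational lower bound
iwae_avg = average_bound(x_obs, k=50)     # importance weighted bound, K = 50
print(elbo_avg, iwae_avg, log_px)
```

Up to Monte Carlo noise, the K = 50 average should land between the K = 1 average and the exact log p(x), which is the sense in which the importance weighted bound is tighter.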
In this week’s session, Yixin led our discussion of two papers about Generative Adversarial Networks (GANs). The first paper,
“Generalization and Equilibrium in Generative Adversarial Nets” by Arora et al., is a theoretical investigation
of GANs, and the second paper, “Improved Training of Wasserstein GANs” by Gulrajani et al., proposes a new training
method for Wasserstein GANs. This video gives a good explanation of the first paper.
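As a rough illustration of the training method in the second paper: as far as we recall, Gulrajani et al. replace weight clipping with a penalty that pushes the norm of the critic's input gradient towards 1 at points interpolated between real and generated samples, with a penalty coefficient of 10. The sketch below uses a hypothetical linear critic, whose input gradient is simply its weight vector, so that no autodiff library is needed. The function names and the toy data are ours, not the paper's.

```python
import math
import random

def gradient_penalty(critic_grad, real_batch, fake_batch, lam=10.0, rng=None):
    """Mean of lam * (||grad f(x_hat)||_2 - 1)^2 over random interpolates x_hat."""
    rng = rng or random.Random(0)
    total = 0.0
    for x_real, x_fake in zip(real_batch, fake_batch):
        eps = rng.random()  # interpolation coefficient, uniform on [0, 1)
        x_hat = [eps * r + (1.0 - eps) * f for r, f in zip(x_real, x_fake)]
        grad = critic_grad(x_hat)  # gradient of the critic w.r.t. its input
        grad_norm = math.sqrt(sum(g * g for g in grad))
        total += (grad_norm - 1.0) ** 2
    return lam * total / len(real_batch)

def linear_critic_grad(weights):
    """Input gradient of the hypothetical critic f(x) = w . x, which is w everywhere."""
    return lambda x: list(weights)

real = [[1.0, 2.0], [0.5, -1.0]]
fake = [[0.0, 0.0], [2.0, 1.0]]
penalty_unit = gradient_penalty(linear_critic_grad([0.6, 0.8]), real, fake)
penalty_steep = gradient_penalty(linear_critic_grad([3.0, 4.0]), real, fake)
print(penalty_unit, penalty_steep)
```

A critic whose gradient already has unit norm incurs essentially no penalty, while a steeper one is penalized quadratically. In a real training loop the gradient would come from automatic differentiation of a neural critic, and the penalty would be added to the critic loss.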
This week Christian led our discussion of two papers relating to MCMC: “A Complete Recipe for Stochastic Gradient MCMC” (Ma et al. 2015) and “Relativistic Monte Carlo” (Lu et al. 2017).
The former provides a general recipe for constructing Markov Chain Monte Carlo (MCMC) samplers—including stochastic gradient versions—based on continuous Markov processes, thus unifying and generalizing earlier work.
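As we understand the recipe, a sampler is specified by a positive semidefinite matrix D and a skew-symmetric matrix Q, and the state z follows dz = -(D + Q) ∇H(z) dt + Γ(z) dt + sqrt(2 D) dW, where H is the target's negative log density (possibly augmented with auxiliary variables) and Γ is a correction term built from the derivatives of D and Q. The sketch below makes several simplifying assumptions of our own: constant D and Q (so Γ vanishes), a diagonal D, exact gradients rather than minibatch estimates, a plain Euler discretization, and a standard normal target. All names are hypothetical.

```python
import math
import random

def recipe_step(z, grad_h, d_mat, q_mat, step, rng):
    """One Euler step of dz = -(D + Q) grad H(z) dt + sqrt(2 D) dW.

    Assumes constant D and Q (so the correction term is zero) and diagonal D.
    """
    g = grad_h(z)
    n = len(z)
    new_z = []
    for i in range(n):
        drift = -sum((d_mat[i][j] + q_mat[i][j]) * g[j] for j in range(n))
        noise = math.sqrt(2.0 * d_mat[i][i] * step) * rng.gauss(0.0, 1.0)
        new_z.append(z[i] + step * drift + noise)
    return new_z

def sample_first_coordinate(d_mat, q_mat, dim, steps=50000, burn_in=1000,
                            step=0.05, seed=0):
    """Run the chain and collect the first coordinate after burn-in."""
    rng = random.Random(seed)
    grad_h = lambda z: list(z)  # H(z) = 0.5 * |z|^2, a standard normal target
    z = [0.0] * dim
    samples = []
    for t in range(steps):
        z = recipe_step(z, grad_h, d_mat, q_mat, step, rng)
        if t >= burn_in:
            samples.append(z[0])
    return samples

def variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

# Langevin dynamics: z = (theta,), D = I, Q = 0.
langevin = sample_first_coordinate([[1.0]], [[0.0]], dim=1)
# Hamiltonian dynamics with friction: z = (theta, r); D acts on the momentum r
# only, and the skew-symmetric Q couples theta and r.
with_momentum = sample_first_coordinate(
    [[0.0, 0.0], [0.0, 1.0]], [[0.0, -1.0], [1.0, 0.0]], dim=2)
langevin_var = variance(langevin)
with_momentum_var = variance(with_momentum)
print(langevin_var, with_momentum_var)
```

Both choices of (D, Q) target the same distribution, so the sample variance of theta should be close to 1 in each case, up to discretization bias and Monte Carlo noise. Swapping in a different (D, Q) pair changes the sampler without touching the update code, which is the unifying point of the recipe.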
The latter presents a version of Hamiltonian Monte Carlo (HMC) based on relativistic dynamics, which impose a maximum velocity on the particles. The resulting sampler is more stable and more robust to the choice of parameters than its Newtonian counterpart, and it resembles popular stochastic optimizers such as Adam and RMSProp.
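The speed limit can be seen directly from the kinetic energy. As we understand the paper, the Newtonian kinetic energy p²/(2m) is replaced by the relativistic form K(p) = m c² sqrt(p²/(m² c²) + 1), whose derivative with respect to the momentum p gives the velocity. The sketch below works with a scalar momentum, and the values of the mass and the "speed of light" c are arbitrary choices of ours.

```python
import math

def newtonian_velocity(p, mass=1.0):
    """Velocity under the Newtonian kinetic energy p^2 / (2 m): grows without bound."""
    return p / mass

def relativistic_kinetic_energy(p, mass=1.0, c=2.0):
    """K(p) = m c^2 sqrt(p^2 / (m^2 c^2) + 1)."""
    return mass * c * c * math.sqrt(p * p / (mass * mass * c * c) + 1.0)

def relativistic_velocity(p, mass=1.0, c=2.0):
    """dK/dp = p / (m sqrt(p^2 / (m^2 c^2) + 1)); its magnitude stays below c."""
    return p / (mass * math.sqrt(p * p / (mass * mass * c * c) + 1.0))

for p in (0.01, 1.0, 100.0):
    print(p, newtonian_velocity(p), relativistic_velocity(p))
```

For small momenta the two velocities nearly coincide, while for large momenta the relativistic velocity saturates just below c. A bounded velocity caps how far a single position update can move, which fits the stability claim above.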