Here’s my outline from last week’s meeting along with a few more references.
Outline
- Generative Adversarial Networks [1]
- Introduction: goal, learn a distribution and be able to sample from it (minimax objective sketched below).
- Previous attempts (e.g. restricted Boltzmann machines)
- Relation to noise-contrastive estimation [8]
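For the introduction, a quick reminder of the two-player objective from [1], written the way I would put it on the board (D the discriminator, G the generator, p_z the noise prior):

    \min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]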
- Algorithm
- f-divergences as a more general framework [2].
- What is an f-divergence [9]
- Fenchel conjugate [10], derivation of a lower bound [11,12] (sketched below)
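The bound I plan to derive, roughly as it appears in [2,11] (f^* is the Fenchel conjugate of f, and T ranges over some class of functions, e.g. a neural network):

    D_f(P \,\|\, Q) = \int q(x)\, f\!\left(\frac{p(x)}{q(x)}\right) dx
    \;\ge\; \sup_{T} \; \mathbb{E}_{x \sim P}[T(x)] - \mathbb{E}_{x \sim Q}[f^*(T(x))],
    \qquad f^*(t) = \sup_{u} \{\, ut - f(u) \,\}.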
- De-mystification of GANs
- Alternative methods: Maximum Mean Discrepancy optimization [6,7], based on the kernel two-sample test from [13] (rough estimator sketch after the outline)
- Further discussion [3,4,5]
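To make the kernel two-sample test concrete, here is a minimal NumPy sketch of the (biased) MMD^2 statistic from [13]; the RBF kernel and the fixed bandwidth are my own choices for illustration, not something the papers prescribe.

    import numpy as np

    def rbf_kernel(A, B, bandwidth):
        # Pairwise squared Euclidean distances between rows of A and rows of B.
        d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
        return np.exp(-d2 / (2.0 * bandwidth**2))

    def mmd2_biased(X, Y, bandwidth=1.0):
        # Biased estimate: mean k(X,X) + mean k(Y,Y) - 2 * mean k(X,Y).
        return (rbf_kernel(X, X, bandwidth).mean()
                + rbf_kernel(Y, Y, bandwidth).mean()
                - 2.0 * rbf_kernel(X, Y, bandwidth).mean())

    rng = np.random.default_rng(0)
    X = rng.normal(0.0, 1.0, size=(500, 2))  # samples from P
    Y = rng.normal(0.5, 1.0, size=(500, 2))  # samples from a shifted Q
    print(mmd2_biased(X, Y))  # noticeably above zero since P and Q differ

In [6,7] the generator is trained to drive this statistic down between generated samples and data samples, instead of training a discriminator.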
References
Main papers:
[1] Generative Adversarial Nets, Goodfellow et al., NIPS 2014
[2] f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization, Nowozin et al., NIPS 2016
Additional papers:
[3] Improved Techniques for Training GANs, Salimans et al., 2016, arXiv
[4] On Distinguishability Criteria for Estimating Generative Models, Goodfellow, 2015, ICLR Workshop
[5] Adversarially Learned Inference, Dumoulin et al., 2016, arXiv
[6] Training generative neural networks via Maximum Mean Discrepancy optimization, Dziugaite et al., UAI 2015
[7] Generative moment matching networks, Li et al., ICML 2015
Related ideas from the ‘oldies’:
[8] Noise-contrastive estimation: A new estimation principle for unnormalized statistical models, Gutmann and Hyvärinen, AISTATS 2010
[9] A general class of coefficients of divergence of one distribution from another, Ali and Silvey, Journal of the Royal Statistical Society, Series B (Methodological), 1966, pp. 131-142
[10] Convex Analysis, R. Tyrrell Rockafellar, Princeton University Press
[11] Estimating divergence functionals and the likelihood ratio by convex risk minimization, Nguyen et al., NIPS 2008
[12] Random Variables, Monotone Relations and Convex Analysis, Rockafellar and Royset, 2013
[13] A Kernel Two-Sample Test, Gretton et al., Journal of Machine Learning Research, 2012