This is a list of suggested papers to choose from, loosely organized by topic.

Inference

  • Mnih, Andriy, and Danilo J. Rezende. “Variational inference for Monte Carlo objectives.” arXiv preprint arXiv:1602.06725 (2016).
  • Kingma, Diederik P., and Max Welling. “Auto-encoding variational Bayes.” arXiv preprint arXiv:1312.6114 (2013).
  • Rhee, Chang-han, and Peter W. Glynn. “Unbiased estimation with square root convergence for SDE models.” Operations Research 63.5 (2015): 1026-1043.
  • Tripuraneni, Nilesh, et al. “Magnetic Hamiltonian Monte Carlo.” arXiv preprint arXiv:1607.02738 (2016).
  • Grosse, Roger B., Siddharth Ancha, and Daniel M. Roy. “Measuring the reliability of MCMC inference with bidirectional Monte Carlo.” arXiv preprint arXiv:1606.02275 (2016).
  • Kucukelbir, Alp, et al. “Automatic Differentiation Variational Inference.” arXiv preprint arXiv:1603.00788 (2016).
  • Duvenaud, David, Dougal Maclaurin, and Ryan P. Adams. “Early Stopping as Nonparametric Variational Inference.” AISTATS (2016).
  • Rudolph, Maja R., et al. “Exponential Family Embeddings.” arXiv preprint arXiv:1608.00778 (2016).
  • Bouchard-Côté, Alexandre, Sebastian J. Vollmer, and Arnaud Doucet. “The Bouncy Particle Sampler: A Non-Reversible Rejection-Free Markov Chain Monte Carlo Method.” arXiv preprint arXiv:1510.02451 (2015).
  • Pakman, Ari, et al. “Stochastic Bouncy Particle Sampler.” arXiv preprint arXiv:1609.00770 (2016).
  • Giles, Mike, et al. “Multilevel Monte Carlo for Scalable Bayesian Computations.” arXiv preprint arXiv:1609.06144 (2016).
  • Lopez-Paz, David, and Maxime Oquab. “Revisiting Classifier Two-Sample Tests.” arXiv preprint arXiv:1610.06545 (2016).
  • He, Niao, et al. “Fast and Simple Optimization for Poisson Likelihood Models.” arXiv preprint arXiv:1608.01264 (2016).

Theory

  • Arora, Sanjeev, et al. “Generalization and Equilibrium in Generative Adversarial Nets (GANs).” arXiv preprint arXiv:1703.00573 (2017).
  • Hardt, Moritz, Tengyu Ma, and Benjamin Recht. “Gradient Descent Learns Linear Dynamical Systems.” arXiv preprint arXiv:1609.05191 (2016).
  • Yang, Fanny, Sivaraman Balakrishnan, and Martin J. Wainwright. “Statistical and computational guarantees for the Baum-Welch algorithm.” 53rd Annual Allerton Conference on Communication, Control, and Computing. IEEE (2015).
  • Chen, Yudong, and Martin J. Wainwright. “Fast low-rank estimation by projected gradient descent: General statistical and algorithmic guarantees.” arXiv preprint arXiv:1509.03025 (2015).
  • Wibisono, Andre, Ashia C. Wilson, and Michael I. Jordan. “A Variational Perspective on Accelerated Methods in Optimization.” arXiv preprint arXiv:1603.04245 (2016).
  • Jin, Chi, et al. “Local Maxima in the Likelihood of Gaussian Mixture Models: Structural Results and Algorithmic Consequences.” arXiv preprint arXiv:1609.00978 (2016).
  • Bloem-Reddy, Benjamin, and Peter Orbanz. “Random Walk Models of Network Formation and Sequential Monte Carlo Methods for Graphs.” arXiv preprint arXiv:1612.06404 (2016).
  • Huszár, Ferenc. “How (not) to Train your Generative Model: Scheduled Sampling, Likelihood, Adversary?” arXiv preprint arXiv:1511.05101 (2015).
  • Mohamed, Shakir, and Balaji Lakshminarayanan. “Learning in Implicit Generative Models.” arXiv preprint arXiv:1610.03483 (2016).

Deep Learning

  • Gulrajani, Ishaan, et al. “Improved Training of Wasserstein GANs.” arXiv preprint arXiv:1704.00028 (2017).
  • Metz, Luke, et al. “Unrolled Generative Adversarial Networks.” arXiv preprint arXiv:1611.02163 (2016).
  • Radford, Alec, Luke Metz, and Soumith Chintala. “Unsupervised representation learning with deep convolutional generative adversarial networks.” arXiv preprint arXiv:1511.06434 (2015).
  • Graves, Alex, et al. “Hybrid computing using a neural network with dynamic external memory.” Nature 538.7626 (2016): 471-476.
  • Chung, Junyoung, et al. “A recurrent latent variable model for sequential data.” Advances in Neural Information Processing Systems (2015).
  • Burda, Yuri, Roger Grosse, and Ruslan Salakhutdinov. “Importance weighted autoencoders.” arXiv preprint arXiv:1509.00519 (2015).
  • He, Kaiming, et al. “Deep residual learning for image recognition.” arXiv preprint arXiv:1512.03385 (2015).
  • Rezende, Danilo Jimenez, et al. “Unsupervised Learning of 3D Structure from Images.” arXiv preprint arXiv:1607.00662 (2016).
  • Bowman, Samuel R., et al. “Generating sentences from a continuous space.” arXiv preprint arXiv:1511.06349 (2015).
  • Dosovitskiy, Alexey, et al. “Learning to Generate Chairs, Tables and Cars with Convolutional Networks.” (2016).
  • Maaløe, Lars, et al. “Auxiliary Deep Generative Models.” arXiv preprint arXiv:1602.05473 (2016).
  • Mnih, Volodymyr, Nicolas Heess, and Alex Graves. “Recurrent models of visual attention.” Advances in Neural Information Processing Systems (2014).
  • Makhzani, Alireza, et al. “Adversarial autoencoders.” arXiv preprint arXiv:1511.05644 (2015).
  • Gregor, Karol, et al. “Deep autoregressive networks.” arXiv preprint arXiv:1310.8499 (2013).
  • Nguyen, Anh, et al. “Synthesizing the preferred inputs for neurons in neural networks via deep generator networks.” arXiv preprint arXiv:1605.09304 (2016).
  • Kiros, Ryan, et al. “Skip-thought vectors.” Neural Information Processing Systems (NIPS) (2015).
  • Mansimov, Elman, et al. “Generating images from captions with attention.” arXiv preprint arXiv:1511.02793 (2015).
  • Salimans, Tim, et al. “Improved Techniques for Training GANs.” arXiv preprint arXiv:1606.03498 (2016).
  • Nalisnick, Eric, and Padhraic Smyth. “Deep Generative Models with Stick-Breaking Priors.” arXiv preprint arXiv:1605.06197 (2016).
  • Kulkarni, Tejas D., et al. “Deep convolutional inverse graphics network.” Neural Information Processing Systems (NIPS) (2015).
  • Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. “A neural algorithm of artistic style.” arXiv preprint arXiv:1508.06576 (2015).
  • Alain, Guillaume, and Yoshua Bengio. “Understanding intermediate layers using linear classifier probes.” arXiv preprint arXiv:1610.01644 (2016).
  • Chen, Yutian, et al. “Learning to Learn for Global Optimization of Black Box Functions.” arXiv preprint arXiv:1611.03824 (2016).
  • Rezende, Danilo Jimenez, Shakir Mohamed, and Daan Wierstra. “Stochastic Backpropagation and Approximate Inference in Deep Generative Models.” Proceedings of The 31st International Conference on Machine Learning (2014).

State Space Models

  • Foerster, Jakob N., et al. “Intelligible language modeling with input switched affine networks.” arXiv preprint arXiv:1611.09434 (2016).
  • Fraccaro, Marco, et al. “Sequential Neural Models with Stochastic Layers.” Advances in Neural Information Processing Systems (2016).
  • Chung, Junyoung, et al. “A recurrent latent variable model for sequential data.” Advances in Neural Information Processing Systems (2015).
  • Schein, Aaron, Hanna Wallach, and Mingyuan Zhou. “Poisson-Gamma dynamical systems.” Advances in Neural Information Processing Systems (2016).
  • Winner, Kevin, and Daniel R. Sheldon. “Probabilistic Inference with Generating Functions for Poisson Latent Variable Models.” Advances in Neural Information Processing Systems (2016).
  • Ross, Stéphane, Geoffrey J. Gordon, and Drew Bagnell. “A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning.” AISTATS (2011).
  • Sun, Wen, et al. “Learning to filter with predictive state inference machines.” arXiv preprint arXiv:1512.08836 (2015).
  • Sussillo, David, et al. “LFADS - Latent Factor Analysis via Dynamical Systems.” arXiv preprint arXiv:1608.06315 (2016).
  • Bastani, Vahid, et al. “Incremental Nonlinear System Identification and Adaptive Particle Filtering Using Gaussian Process.” arXiv preprint arXiv:1608.08362 (2016).
  • Shestopaloff, Alexander Y., and Radford M. Neal. “MCMC for non-linear state space models using ensembles of latent sequences.” arXiv preprint arXiv:1305.0320 (2013).
  • Kandasamy, Kirthevasan, Maruan Al-Shedivat, and Eric Xing. “Learning HMMs with Nonparametric Emissions via Spectral Decompositions of Continuous Matrices.” arXiv preprint arXiv:1609.06390 (2016).

Density Estimation

  • Tabak, E. G., and Cristina V. Turner. “A family of nonparametric density estimation algorithms.” Communications on Pure and Applied Mathematics (2013): 145-164.
  • van den Oord, Aaron, Nal Kalchbrenner, and Koray Kavukcuoglu. “Pixel Recurrent Neural Networks.” Proceedings of The 33rd International Conference on Machine Learning (2016).
  • Dinh, Laurent, Jascha Sohl-Dickstein, and Samy Bengio. “Density estimation using Real NVP.” arXiv preprint arXiv:1605.08803 (2016).

Reinforcement Learning

  • Levine, Sergey, et al. “End-to-end training of deep visuomotor policies.” Journal of Machine Learning Research 17.39 (2016): 1-40.
  • Tamar, Aviv, Sergey Levine, and Pieter Abbeel. “Value Iteration Networks.” NIPS (2016).
  • Finn, Chelsea, Ian Goodfellow, and Sergey Levine. “Unsupervised Learning for Physical Interaction through Video Prediction.” arXiv preprint arXiv:1605.07157 (2016).
  • Agarwal, Alekh, et al. “Corralling a Band of Bandit Algorithms.” arXiv preprint arXiv:1612.06246 (2016).

Statistical Neuroscience

  • Marblestone, Adam, Greg Wayne, and Konrad Kording. “Towards an integration of deep learning and neuroscience.” arXiv preprint arXiv:1606.03813 (2016).
  • Gershman, Samuel J., Eric J. Horvitz, and Joshua B. Tenenbaum. “Computational rationality: A converging paradigm for intelligence in brains, minds, and machines.” Science 349.6245 (2015): 273-278.

Previously Discussed in CAMLS

  • Lu, Xiaoyu, et al. “Relativistic Monte Carlo.” arXiv preprint arXiv:1609.04388 (2016).
  • Ma, Yi-An, Tianqi Chen, and Emily B. Fox. “A Complete Recipe for Stochastic Gradient MCMC.” Neural Information Processing Systems (NIPS) (2015).
  • Rezende, Danilo Jimenez, and Shakir Mohamed. “Variational inference with normalizing flows.” arXiv preprint arXiv:1505.05770 (2015).
  • Kingma, Diederik P., Tim Salimans, and Max Welling. “Improving Variational Inference with Inverse Autoregressive Flow.” arXiv preprint arXiv:1606.04934 (2016).
  • Moreno, Alexander, et al. “Automatic Variational ABC.” arXiv preprint arXiv:1606.08549 (2016).
  • Meeds, Edward, Robert Leenders, and Max Welling. “Hamiltonian ABC.” arXiv preprint arXiv:1503.01916 (2015).
  • Meeds, Edward, and Max Welling. “GPS-ABC: Gaussian process surrogate approximate Bayesian computation.” arXiv preprint arXiv:1401.2838 (2014).
  • Johnson, Matthew J., et al. “Composing graphical models with neural networks for structured representations and fast inference.” arXiv preprint arXiv:1603.06277 (2016).
  • Pollock, Murray, et al. “The Scalable Langevin Exact Algorithm: Bayesian Inference for Big Data.” arXiv preprint arXiv:1609.03436 (2016).
  • Advani, Madhu, and Surya Ganguli. “Statistical Mechanics of Optimal Convex Inference in High Dimensions.” Physical Review X 6.3 (2016): 031034.
  • Kawaguchi, Kenji. “Deep Learning without Poor Local Minima.” arXiv preprint arXiv:1605.07110 (2016).
  • Mei, Song, Yu Bai, and Andrea Montanari. “The Landscape of Empirical Risk for Non-convex Losses.” arXiv preprint arXiv:1607.06534 (2016).
  • Goodfellow, Ian, et al. “Generative adversarial nets.” Neural Information Processing Systems (NIPS) (2014).
  • Nowozin, Sebastian, Botond Cseke, and Ryota Tomioka. “f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization.” arXiv preprint arXiv:1606.00709 (2016).
  • Huang, Gao, et al. “Deep networks with stochastic depth.” arXiv preprint arXiv:1603.09382 (2016).
  • Gal, Yarin, and Zoubin Ghahramani. “Dropout as a Bayesian approximation: Representing model uncertainty in deep learning.” arXiv preprint arXiv:1506.02142 (2015).
  • Gregor, Karol, et al. “DRAW: A recurrent neural network for image generation.” arXiv preprint arXiv:1502.04623 (2015).
  • Xu, Kelvin, et al. “Show, attend and tell: Neural image caption generation with visual attention.” arXiv preprint arXiv:1502.03044 (2015).
  • Eslami, S. M., et al. “Attend, Infer, Repeat: Fast Scene Understanding with Generative Models.” arXiv preprint arXiv:1603.08575 (2016).
  • Gao, Yuanjun, et al. “Linear dynamical neural population models through nonlinear embeddings.” arXiv preprint arXiv:1605.08454 (2016).
  • Krishnan, Rahul G., Uri Shalit, and David Sontag. “Deep Kalman Filters.” arXiv preprint arXiv:1511.05121 (2015).
  • Uria, Benigno, et al. “Neural Autoregressive Distribution Estimation.” arXiv preprint arXiv:1605.02226 (2016).
  • Germain, Mathieu, et al. “MADE: masked autoencoder for distribution estimation.” International Conference on Machine Learning (2015).
  • Jang, Eric, et al. “Categorical Reparameterization with Gumbel-Softmax.” (2016).
  • Maddison, Chris J., et al. “The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables.” (2016).
  • Kusner, Matt J., and José Miguel Hernández-Lobato. “GANs for Sequences of Discrete Elements with the Gumbel-softmax Distribution.” (2016).
  • Khan, Mohammad E., et al. “Kullback-Leibler proximal variational inference.” Advances in Neural Information Processing Systems (2015).
  • Khan, Mohammad E., et al. “Faster stochastic variational inference using Proximal-Gradient methods with general divergence functions.” arXiv preprint arXiv:1511.00146 (2015).
  • Rezende, Danilo Jimenez, et al. “One-Shot Generalization in Deep Generative Models.” arXiv preprint arXiv:1603.05106 (2016).
  • Lake, Brenden M., Ruslan Salakhutdinov, and Joshua B. Tenenbaum. “Human-level concept learning through probabilistic program induction.” Science 350.6266 (2015): 1332-1338.
  • Kakade, Sham, et al. “Prediction with a Short Memory.” arXiv preprint arXiv:1612.02526 (2016).