This is a list of suggested papers to choose from, loosely organized by topic.
## Inference
- Mnih, Andriy, and Danilo J. Rezende. “Variational inference for Monte Carlo objectives.” arXiv preprint arXiv:1602.06725 (2016). link
- Kingma, Diederik P., and Max Welling. “Auto-encoding variational Bayes.” arXiv preprint arXiv:1312.6114 (2013). link
- Rhee, Chang-han, and Peter W. Glynn. “Unbiased estimation with square root convergence for SDE models.” Operations Research 63.5 (2015): 1026-1043. link
- Tripuraneni, Nilesh, et al. “Magnetic Hamiltonian Monte Carlo.” arXiv preprint arXiv:1607.02738 (2016). link
- Grosse, Roger B., Siddharth Ancha, and Daniel M. Roy. “Measuring the reliability of MCMC inference with bidirectional Monte Carlo.” arXiv preprint arXiv:1606.02275 (2016). link
- Kucukelbir, Alp, et al. “Automatic Differentiation Variational Inference.” arXiv preprint arXiv:1603.00788 (2016). link
- Duvenaud, David, Dougal Maclaurin, and Ryan P. Adams. “Early Stopping as Nonparametric Variational Inference.” AISTATS (2016). link
- Rudolph, Maja R., et al. “Exponential Family Embeddings.” arXiv preprint arXiv:1608.00778 (2016). link
- Bouchard-Côté, Alexandre, Sebastian J. Vollmer, and Arnaud Doucet. “The Bouncy Particle Sampler: A Non-Reversible Rejection-Free Markov Chain Monte Carlo Method.” arXiv preprint arXiv:1510.02451 (2015). link
- Pakman, Ari, et al. “Stochastic Bouncy Particle Sampler.” arXiv preprint arXiv:1609.00770 (2016). link
- Giles, Mike, et al. “Multilevel Monte Carlo for Scalable Bayesian Computations.” arXiv preprint arXiv:1609.06144 (2016). link
- Lopez-Paz, David, and Maxime Oquab. “Revisiting Classifier Two-Sample Tests.” arXiv preprint arXiv:1610.06545 (2016). link
- He, Niao, et al. “Fast and Simple Optimization for Poisson Likelihood Models.” arXiv preprint arXiv:1608.01264 (2016). link
## Theory
- Arora, Sanjeev, et al. “Generalization and Equilibrium in Generative Adversarial Nets (GANs).” arXiv preprint arXiv:1703.00573 (2017). link
- Hardt, Moritz, Tengyu Ma, and Benjamin Recht. “Gradient Descent Learns Linear Dynamical Systems.” arXiv preprint arXiv:1609.05191 (2016). link
- Yang, Fanny, Sivaraman Balakrishnan, and Martin J. Wainwright. “Statistical and computational guarantees for the Baum-Welch algorithm.” 53rd Annual Allerton Conference on Communication, Control, and Computing. IEEE (2015). link
- Chen, Yudong, and Martin J. Wainwright. “Fast low-rank estimation by projected gradient descent: General statistical and algorithmic guarantees.” arXiv preprint arXiv:1509.03025 (2015). link
- Wibisono, Andre, Ashia C. Wilson, and Michael I. Jordan. “A Variational Perspective on Accelerated Methods in Optimization.” arXiv preprint arXiv:1603.04245 (2016). link
- Jin, Chi, et al. “Local Maxima in the Likelihood of Gaussian Mixture Models: Structural Results and Algorithmic Consequences.” arXiv preprint arXiv:1609.00978 (2016). link
- Bloem-Reddy, Benjamin, and Peter Orbanz. “Random Walk Models of Network Formation and Sequential Monte Carlo Methods for Graphs.” arXiv preprint arXiv:1612.06404 (2016). link
- Huszár, Ferenc. “How (not) to Train your Generative Model: Scheduled Sampling, Likelihood, Adversary?” arXiv preprint arXiv:1511.05101 (2015). link
- Mohamed, Shakir, and Balaji Lakshminarayanan. “Learning in Implicit Generative Models.” arXiv preprint arXiv:1610.03483 (2016). link
## Deep Learning
- Gulrajani, Ishaan, et al. “Improved Training of Wasserstein GANs.” arXiv preprint arXiv:1704.00028 (2017). link
- Metz, Luke, et al. “Unrolled Generative Adversarial Networks.” arXiv preprint arXiv:1611.02163 (2016). link
- Radford, Alec, Luke Metz, and Soumith Chintala. “Unsupervised representation learning with deep convolutional generative adversarial networks.” arXiv preprint arXiv:1511.06434 (2015). link
- Graves, Alex, et al. “Hybrid computing using a neural network with dynamic external memory.” Nature 538.7626 (2016): 471-476. link
- Chung, Junyoung, et al. “A recurrent latent variable model for sequential data.” Advances in neural information processing systems. 2015. link
- Burda, Yuri, Roger Grosse, and Ruslan Salakhutdinov. “Importance weighted autoencoders.” arXiv preprint arXiv:1509.00519 (2015). link
- He, Kaiming, et al. “Deep residual learning for image recognition.” arXiv preprint arXiv:1512.03385 (2015). link
- Rezende, Danilo Jimenez, et al. “Unsupervised Learning of 3D Structure from Images.” arXiv preprint arXiv:1607.00662 (2016). link
- Bowman, Samuel R., et al. “Generating sentences from a continuous space.” arXiv preprint arXiv:1511.06349 (2015). link
- Dosovitskiy, Alexey, et al. “Learning to Generate Chairs, Tables and Cars with Convolutional Networks.” (2016). link
- Maaløe, Lars, et al. “Auxiliary Deep Generative Models.” arXiv preprint arXiv:1602.05473 (2016). link
- Mnih, Volodymyr, Nicolas Heess, and Alex Graves. “Recurrent models of visual attention.” Advances in Neural Information Processing Systems. 2014. link
- Makhzani, Alireza, et al. “Adversarial autoencoders.” arXiv preprint arXiv:1511.05644 (2015). link
- Gregor, Karol, et al. “Deep autoregressive networks.” arXiv preprint arXiv:1310.8499 (2013). link
- Nguyen, Anh, et al. “Synthesizing the preferred inputs for neurons in neural networks via deep generator networks.” arXiv preprint arXiv:1605.09304 (2016). link
- Kiros, Ryan, et al. “Skip-thought vectors.” Neural Information Processing Systems (NIPS) (2015). link
- Mansimov, Elman, et al. “Generating images from captions with attention.” arXiv preprint arXiv:1511.02793 (2015). link
- Salimans, Tim, et al. “Improved Techniques for Training GANs.” arXiv preprint arXiv:1606.03498 (2016). link
- Nalisnick, Eric, and Padhraic Smyth. “Deep Generative Models with Stick-Breaking Priors.” arXiv preprint arXiv:1605.06197 (2016). link
- Kulkarni, Tejas D., et al. “Deep convolutional inverse graphics network.” Neural Information Processing Systems (NIPS) (2015). link
- Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. “A neural algorithm of artistic style.” arXiv preprint arXiv:1508.06576 (2015). link
- Alain, Guillaume, and Yoshua Bengio. “Understanding intermediate layers using linear classifier probes.” arXiv preprint arXiv:1610.01644 (2016). link
- Chen, Yutian, et al. “Learning to Learn for Global Optimization of Black Box Functions.” arXiv preprint arXiv:1611.03824 (2016). link
- Rezende, Danilo Jimenez, Shakir Mohamed, and Daan Wierstra. “Stochastic Backpropagation and Approximate Inference in Deep Generative Models.” Proceedings of The 31st International Conference on Machine Learning. 2014. link
## State Space Models
- Foerster, Jakob N., et al. “Intelligible language modeling with input switched affine networks.” arXiv preprint arXiv:1611.09434 (2016). link
- Fraccaro, Marco, et al. “Sequential Neural Models with Stochastic Layers.” Advances in Neural Information Processing Systems (2016). link
- Schein, Aaron, Hanna Wallach, and Mingyuan Zhou. “Poisson-Gamma dynamical systems.” Advances in Neural Information Processing Systems (2016). link
- Winner, Kevin, and Daniel R. Sheldon. “Probabilistic Inference with Generating Functions for Poisson Latent Variable Models.” Advances in Neural Information Processing Systems (2016). link
- Ross, Stéphane, Geoffrey J. Gordon, and Drew Bagnell. “A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning.” AISTATS (2011). link
- Sun, Wen, et al. “Learning to filter with predictive state inference machines.” arXiv preprint arXiv:1512.08836 (2015). link
- Sussillo, David, et al. “LFADS - Latent Factor Analysis via Dynamical Systems.” arXiv preprint arXiv:1608.06315 (2016). link
- Bastani, Vahid, et al. “Incremental Nonlinear System Identification and Adaptive Particle Filtering Using Gaussian Process.” arXiv preprint arXiv:1608.08362 (2016). link
- Shestopaloff, Alexander Y., and Radford M. Neal. “MCMC for non-linear state space models using ensembles of latent sequences.” arXiv preprint arXiv:1305.0320 (2013). link
- Kandasamy, Kirthevasan, Maruan Al-Shedivat, and Eric Xing. “Learning HMMs with Nonparametric Emissions via Spectral Decompositions of Continuous Matrices.” arXiv preprint arXiv:1609.06390 (2016). link
## Density Estimation
- Tabak, E. G., and Cristina V. Turner. “A family of nonparametric density estimation algorithms.” Communications on Pure and Applied Mathematics (2013): 145-164. link
- van den Oord, Aaron, Nal Kalchbrenner, and Koray Kavukcuoglu. “Pixel Recurrent Neural Networks.” Proceedings of The 33rd International Conference on Machine Learning. 2016. link
- Dinh, Laurent, Jascha Sohl-Dickstein, and Samy Bengio. “Density estimation using Real NVP.” arXiv preprint arXiv:1605.08803 (2016). link
## Reinforcement Learning
- Levine, Sergey, et al. “End-to-end training of deep visuomotor policies.” Journal of Machine Learning Research 17.39 (2016): 1-40. link
- Tamar, Aviv, Sergey Levine, and Pieter Abbeel. “Value Iteration Networks.” NIPS (2016). link
- Finn, Chelsea, Ian Goodfellow, and Sergey Levine. “Unsupervised Learning for Physical Interaction through Video Prediction.” arXiv preprint arXiv:1605.07157 (2016). link
- Agarwal, Alekh, et al. “Corralling a Band of Bandit Algorithms.” arXiv preprint arXiv:1612.06246 (2016). link
## Statistical Neuroscience
- Marblestone, Adam, Greg Wayne, and Konrad Kording. “Towards an integration of deep learning and neuroscience.” arXiv preprint arXiv:1606.03813 (2016). link
- Gershman, Samuel J., Eric J. Horvitz, and Joshua B. Tenenbaum. “Computational rationality: A converging paradigm for intelligence in brains, minds, and machines.” Science 349.6245 (2015): 273-278. link
## Previously Discussed in CAMLS
- Lu, Xiaoyu, et al. “Relativistic Monte Carlo.” arXiv preprint arXiv:1609.04388 (2016). link
- Ma, Yi-An, Tianqi Chen, and Emily B. Fox. “A Complete Recipe for Stochastic Gradient MCMC.” Neural Information Processing Systems (NIPS) (2015). link
- Rezende, Danilo Jimenez, and Shakir Mohamed. “Variational inference with normalizing flows.” arXiv preprint arXiv:1505.05770 (2015). link
- Kingma, Diederik P., Tim Salimans, and Max Welling. “Improving Variational Inference with Inverse Autoregressive Flow.” arXiv preprint arXiv:1606.04934 (2016). link
- Moreno, Alexander, et al. “Automatic Variational ABC.” arXiv preprint arXiv:1606.08549 (2016). link
- Meeds, Edward, Robert Leenders, and Max Welling. “Hamiltonian ABC.” arXiv preprint arXiv:1503.01916 (2015). link
- Meeds, Edward, and Max Welling. “GPS-ABC: Gaussian process surrogate approximate Bayesian computation.” arXiv preprint arXiv:1401.2838 (2014). link
- Johnson, Matthew J., et al. “Composing graphical models with neural networks for structured representations and fast inference.” arXiv preprint arXiv:1603.06277 (2016). link
- Pollock, Murray, et al. “The Scalable Langevin Exact Algorithm: Bayesian Inference for Big Data.” arXiv preprint arXiv:1609.03436 (2016). link
- Advani, Madhu, and Surya Ganguli. “Statistical Mechanics of Optimal Convex Inference in High Dimensions.” Physical Review X 6.3 (2016): 031034. link
- Kawaguchi, Kenji. “Deep Learning without Poor Local Minima.” arXiv preprint arXiv:1605.07110 (2016). link
- Mei, Song, Yu Bai, and Andrea Montanari. “The Landscape of Empirical Risk for Non-convex Losses.” arXiv preprint arXiv:1607.06534 (2016). link
- Goodfellow, Ian, et al. “Generative adversarial nets.” Neural Information Processing Systems (NIPS) (2014). link
- Nowozin, Sebastian, Botond Cseke, and Ryota Tomioka. “f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization.” arXiv preprint arXiv:1606.00709 (2016). link
- Huang, Gao, et al. “Deep networks with stochastic depth.” arXiv preprint arXiv:1603.09382 (2016). link
- Gal, Yarin, and Zoubin Ghahramani. “Dropout as a Bayesian approximation: Representing model uncertainty in deep learning.” arXiv preprint arXiv:1506.02142 (2015). link
- Gregor, Karol, et al. “DRAW: A recurrent neural network for image generation.” arXiv preprint arXiv:1502.04623 (2015). link
- Xu, Kelvin, et al. “Show, attend and tell: Neural image caption generation with visual attention.” arXiv preprint arXiv:1502.03044 (2015). link
- Eslami, S. M., et al. “Attend, Infer, Repeat: Fast Scene Understanding with Generative Models.” arXiv preprint arXiv:1603.08575 (2016). link
- Gao, Yuanjun, et al. “Linear dynamical neural population models through nonlinear embeddings.” arXiv preprint arXiv:1605.08454 (2016). link
- Krishnan, Rahul G., Uri Shalit, and David Sontag. “Deep Kalman Filters.” arXiv preprint arXiv:1511.05121 (2015). link
- Uria, Benigno, et al. “Neural Autoregressive Distribution Estimation.” arXiv preprint arXiv:1605.02226 (2016). link
- Germain, Mathieu, et al. “MADE: masked autoencoder for distribution estimation.” International Conference on Machine Learning. 2015. link
- Jang, Eric, Shixiang Gu, and Ben Poole. “Categorical Reparameterization with Gumbel-Softmax.” arXiv preprint arXiv:1611.01144 (2016). link
- Maddison, Chris J., Andriy Mnih, and Yee Whye Teh. “The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables.” arXiv preprint arXiv:1611.00712 (2016). link
- Kusner, Matt J., and José Miguel Hernández-Lobato. “GANs for Sequences of Discrete Elements with the Gumbel-softmax Distribution.” arXiv preprint arXiv:1611.04051 (2016). link
- Khan, Mohammad E., et al. “Kullback-Leibler proximal variational inference.” Advances in Neural Information Processing Systems. 2015. link
- Khan, Mohammad E., et al. “Faster stochastic variational inference using Proximal-Gradient methods with general divergence functions.” arXiv preprint arXiv:1511.00146 (2015). link
- Rezende, Danilo Jimenez, et al. “One-Shot Generalization in Deep Generative Models.” arXiv preprint arXiv:1603.05106 (2016). link
- Lake, Brenden M., Ruslan Salakhutdinov, and Joshua B. Tenenbaum. “Human-level concept learning through probabilistic program induction.” Science 350.6266 (2015): 1332-1338. link
- Kakade, Sham, et al. “Prediction with a Short Memory.” arXiv preprint arXiv:1612.02526 (2016). link