# Upcoming

**When:** March 30, 2017, 2-4pm

**Where:** Jerome Greene Science Building - 3rd Floor Conf. Room

**Presenter:** Gabriel

**Scribe:** Francois

van den Oord, Aaron, Nal Kalchbrenner, and Koray Kavukcuoglu. “Pixel Recurrent Neural Networks.” Proceedings of the 33rd International Conference on Machine Learning, 2016.

Dinh, Laurent, Jascha Sohl-Dickstein, and Samy Bengio. “Density estimation using Real NVP.” To appear at the 5th International Conference on Learning Representations, 2017.

**When:** April 6, 2017, 2-4pm

**Where:** Jerome Greene Science Building - 3rd Floor Conf. Room

**Presenter:** Peter

**Scribe:** Ian

TBD

# Convolutional Architectures for Value Iteration and Video Prediction

This week Robin led our discussion of two papers: “Value Iteration Networks” by Tamar et al., which won Best Paper at NIPS 2016, and “Unsupervised Learning for Physical Interaction through Video Prediction” by Finn et al., also from NIPS 2016. The former introduces a novel connection between convolutional architectures and the value iteration algorithm of reinforcement learning, and presents a model that generalizes better to new tasks. The latter introduces a number of architectures for video prediction. A common theme in both papers is the exploitation of local structure in the problem to simplify the resulting computations.
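The convolutional reading of value iteration can be sketched on a toy problem. Below is a minimal, illustrative example, not the paper's VIN model: value iteration on a hypothetical 1-D ring world, where each Bellman backup only looks at a state's immediate neighbours, which is exactly the locality that a fixed-weight convolution exploits.

```python
import numpy as np

# Hypothetical 1-D ring world: states 0..N-1 arranged in a circle,
# actions = move left / move right, reward only at an absorbing goal.
N, gamma, goal = 10, 0.9, 9
R = np.zeros(N)
R[goal] = 1.0

V = np.zeros(N)
for _ in range(50):
    # Local Bellman backup: each state consults only its two neighbours.
    # np.roll implements the shift with wraparound (hence the ring).
    left = np.roll(V, 1)    # value of the left neighbour
    right = np.roll(V, -1)  # value of the right neighbour
    V = R + gamma * np.maximum(left, right)
    V[goal] = R[goal]       # goal is absorbing

# After convergence, V[s] = gamma ** (ring distance from s to the goal).
```

The shift-and-maximize step is the point: because transitions are local, one sweep of value iteration is a small convolution followed by a max over action channels, which is what lets VIN embed the whole planning computation inside a convolutional network.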

# Modified GANs

In this week’s session we read and discussed two papers relating to GANs: “Wasserstein GAN” (Arjovsky et al. 2017 [1]) and “Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks” (Mescheder et al. 2017 [4]). The first paper introduces the use of the Wasserstein distance rather than the KL divergence as the optimization objective, in order to counter some of the problems faced by the original GAN. The second synthesizes GANs with VAEs in an effort to allow arbitrarily complex inference models.
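To see why the choice of distance matters, here is a small NumPy illustration (our own toy setup, not code from the paper) of the motivating example: two point masses with disjoint supports, for which the KL divergence is infinite and the JS divergence is constant, while the Wasserstein-1 distance decreases smoothly as the distributions approach each other, so it can supply a useful gradient.

```python
import numpy as np

def w1_empirical(x, y):
    # 1-D Wasserstein-1 distance between equal-size samples: under the
    # optimal coupling this reduces to the mean absolute difference of
    # the sorted samples.
    return np.mean(np.abs(np.sort(x) - np.sort(y)))

# A point mass at 0 versus a point mass at theta: the supports are
# disjoint for any theta != 0, so KL is infinite and JS is log 2,
# but W1 equals |theta| and shrinks smoothly to 0 as theta -> 0.
real = np.zeros(100)
dists = [w1_empirical(real, np.full(100, theta)) for theta in (4.0, 2.0, 1.0, 0.0)]
```

The WGAN critic estimates this distance with a neural network rather than by sorting, but the property being exploited is the same one this sketch demonstrates.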

# Prediction with a Short Term Memory

Last Thursday, Andrew presented a paper by Kakade et al. that studies the problem of predicting the next observation given a sequence of past observations. In particular, they ask how far a Markov model is from the optimal predictor. For a long time, simple Markov models were the state of the art for this task; they have only recently been beaten by Long Short-Term Memory (LSTM) networks, and the paper tries to explain why it took so long. An interesting comment during the discussion pointed out that a Markov model of order k has exponentially many parameters in k, while LSTM networks get by with far fewer.
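The parameter-count gap can be made concrete with a little arithmetic (illustrative formulas, not from the paper): an order-k Markov model over an alphabet of size s must store a conditional distribution for every possible length-k history, whereas a standard LSTM layer's size is set by its hidden dimension, independent of how far back it effectively looks.

```python
def markov_params(s, k):
    # Order-k Markov model over an alphabet of size s: one conditional
    # distribution per length-k history, i.e. s**k histories with
    # (s - 1) free probabilities each -- exponential in the order k.
    return s**k * (s - 1)

def lstm_params(d, h):
    # Standard single-layer LSTM with input dim d and hidden dim h:
    # 4 gates, each with input weights, recurrent weights, and a bias.
    return 4 * (h * d + h * h + h)
```

For a binary alphabet, `markov_params(2, 20)` already exceeds a million, while an LSTM with a modest hidden size stays in the tens of thousands of parameters regardless of how long its effective memory is.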
