Tricks used in deep learning, including papers read recently.
Gumbel-Softmax: Categorical Reparameterization with Gumbel-Softmax
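
A minimal sketch of the Gumbel-Softmax relaxation (the temperature `tau` and logits here are just illustrative); PyTorch also ships this as `torch.nn.functional.gumbel_softmax`.

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits, tau=1.0):
    # Sample Gumbel(0, 1) noise: g = -log(-log(u)), u ~ Uniform(0, 1)
    u = torch.rand_like(logits)
    g = -torch.log(-torch.log(u + 1e-20) + 1e-20)
    # Relaxed one-hot sample: softmax of the perturbed logits at temperature tau
    return F.softmax((logits + g) / tau, dim=-1)

logits = torch.randn(4, 10)                      # 4 unnormalized categorical distributions
soft_onehot = gumbel_softmax_sample(logits, tau=0.5)
```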
Confidence penalty: Regularizing Neural Networks by Penalizing Confident Output Distributions
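
A sketch of the confidence penalty as an extra term on top of a standard cross-entropy loss; the coefficient `beta` is just a placeholder value.

```python
import torch
import torch.nn.functional as F

def loss_with_confidence_penalty(logits, targets, beta=0.1):
    ce = F.cross_entropy(logits, targets)
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    entropy = -(probs * log_probs).sum(dim=-1).mean()  # H(p(y|x))
    # Subtracting the entropy penalizes confident (low-entropy) output distributions
    return ce - beta * entropy
```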
Weight normalization: Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks
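
The reparameterization w = g * v / ||v|| in a couple of lines (shapes are illustrative); `torch.nn.utils.weight_norm` applies the same idea to an existing layer.

```python
import torch

v = torch.randn(64, 128, requires_grad=True)   # direction parameters
g = torch.ones(64, 1, requires_grad=True)      # per-output-unit scale
# w = g * v / ||v||: the norm and direction of each weight vector are learned separately
w = g * v / v.norm(dim=1, keepdim=True)
```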
Batch Renormalization: Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models
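
A rough sketch of the batch renorm correction for a 2-D activation tensor, assuming the running statistics are maintained elsewhere and omitting the affine parameters.

```python
import torch

def batch_renorm(x, running_mean, running_std, r_max=3.0, d_max=5.0, eps=1e-5):
    # Per-feature minibatch statistics
    mu = x.mean(dim=0)
    sigma = torch.sqrt(x.var(dim=0, unbiased=False) + eps)
    # Correction factors, treated as constants (no gradient) and clipped as in the paper
    r = (sigma / running_std).detach().clamp(1.0 / r_max, r_max)
    d = ((mu - running_mean) / running_std).detach().clamp(-d_max, d_max)
    # Normalize with batch statistics, then correct toward the running statistics
    return (x - mu) / sigma * r + d
```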
Soft Weight-Sharing for Neural Network Compression
Wasserstein GAN: my implementation, with an example on MNIST
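
A sketch of the critic side of WGAN training with weight clipping (the variant from the original paper, not the gradient-penalty version); `critic` stands for any PyTorch module that scores samples.

```python
import torch

def critic_loss(critic, real, fake):
    # Minimize E[f(fake)] - E[f(real)], i.e. maximize the Wasserstein estimate E[f(real)] - E[f(fake)]
    return critic(fake).mean() - critic(real).mean()

def clip_critic_weights(critic, clip=0.01):
    # Enforce the Lipschitz constraint by clipping weights after each critic update
    for p in critic.parameters():
        p.data.clamp_(-clip, clip)
```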
Loss-Sensitive Generative Adversarial Networks on Lipschitz Densities
Attentive Recurrent Comparators (code)
Decoupled Neural Interfaces using Synthetic Gradients
Variational Dropout Sparsifies Deep Neural Networks (code)
Sobolev Training for Neural Networks
Universal Language Model Fine-tuning for Text Classification
mixup: Beyond Empirical Risk Minimization
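
A sketch of mixup applied to a batch, assuming one-hot targets; `alpha` controls the Beta distribution the mixing coefficient is drawn from.

```python
import numpy as np
import torch

def mixup_batch(x, y_onehot, alpha=0.2):
    # Sample the mixing coefficient lambda ~ Beta(alpha, alpha)
    lam = np.random.beta(alpha, alpha)
    # Mix each example (and its label) with a randomly permuted partner from the same batch
    idx = torch.randperm(x.size(0))
    x_mixed = lam * x + (1 - lam) * x[idx]
    y_mixed = lam * y_onehot + (1 - lam) * y_onehot[idx]
    return x_mixed, y_mixed
```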
Random Erasing Data Augmentation
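
A simplified sketch of random erasing that uses a square patch and skips the aspect-ratio sampling from the paper; `torchvision.transforms.RandomErasing` is the off-the-shelf version.

```python
import random
import torch

def random_erase(img, p=0.5, area_frac=(0.02, 0.33)):
    # img: (C, H, W) tensor; with probability p, overwrite a random rectangle with noise
    if random.random() > p:
        return img
    c, h, w = img.shape
    side = int(round((random.uniform(*area_frac) * h * w) ** 0.5))
    eh, ew = min(side, h), min(side, w)
    top = random.randint(0, h - eh)
    left = random.randint(0, w - ew)
    img = img.clone()
    img[:, top:top + eh, left:left + ew] = torch.rand(c, eh, ew)
    return img
```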
Manifold Mixup: Encouraging Meaningful On-Manifold Interpolation as a Regularizer
Neural Ordinary Differential Equations
An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution
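
A sketch of the CoordConv idea: append normalized coordinate channels to the input, then apply an ordinary convolution over the augmented tensor.

```python
import torch

def add_coord_channels(x):
    # x: (N, C, H, W); append x- and y-coordinate channels normalized to [-1, 1]
    n, _, h, w = x.shape
    ys = torch.linspace(-1, 1, h, device=x.device).view(1, 1, h, 1).expand(n, 1, h, w)
    xs = torch.linspace(-1, 1, w, device=x.device).view(1, 1, 1, w).expand(n, 1, h, w)
    return torch.cat([x, xs, ys], dim=1)

# A CoordConv layer is then just a normal convolution over the augmented input:
# conv = torch.nn.Conv2d(in_channels + 2, out_channels, kernel_size=3, padding=1)
```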
Attention Is All You Need
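
The core operation of the Transformer, scaled dot-product attention softmax(Q K^T / sqrt(d_k)) V, as a minimal single-head sketch without the linear projections.

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (..., seq_len, d_k); attention weights sum to 1 over the key positions
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return F.softmax(scores, dim=-1) @ v
```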