Deep Learning - The Straight Dope
Abstract
This repo contains an incremental sequence of notebooks designed to teach deep
learning, MXNet, and the gluon interface. Our goal is to leverage the strengths
of Jupyter notebooks to present prose, graphics, equations, and code together in
one place.
If we're successful, the result will be a resource that could be simultaneously
a book, course material, a prop for live tutorials, and a resource for
plagiarising (with our blessing) useful code. To our knowledge, no source out
there either (1) teaches the full breadth of concepts in modern deep learning or
(2) interleaves an engaging textbook with runnable code. We'll find out by the
end of this venture whether or not that void exists for a good reason.
Another unique aspect of this book is its authorship process. We are developing this resource fully in public view and are making it available for free in its entirety. While the book has a few primary authors to set the tone and shape the content, we welcome contributions from the community and hope to coauthor chapters and entire sections with experts and community members. Already we've received contributions ranging from typo corrections to full working examples.
Implementation with Apache MXNet
Throughout this book, we rely upon MXNet to teach core concepts, advanced
topics, and a full complement of applications. MXNet is widely used in
production environments owing to its strong reputation for speed. Now with
gluon, MXNet's new imperative interface (alpha), doing research in MXNet is
easy.
Dependencies
To run these notebooks, you'll want to build MXNet from source. Fortunately, this is easy (especially on Linux) if you follow these instructions. You'll also want to install Jupyter and use Python 3 (because it's 2017).
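Before opening the notebooks, it can help to confirm that the pieces named above are actually visible to your interpreter. The following is only a minimal sketch: the helper name check_environment is hypothetical, and it assumes the packages are importable under the top-level names mxnet and jupyter. It uses only the Python standard library and does not verify that MXNet was built from source or that it works correctly.

```python
import importlib.util
import sys


def check_environment(required=("mxnet", "jupyter")):
    """Report whether Python 3 and the given top-level packages can be found.

    The default package names are assumptions about how MXNet and Jupyter
    are exposed to the interpreter; pass other names if your setup differs.
    """
    report = {"python3": sys.version_info.major >= 3}
    for name in required:
        # find_spec returns None when a top-level package cannot be located
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'found' if ok else 'MISSING'}")
```

If anything is reported as missing, revisit the installation steps before launching Jupyter.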
Slides
The authors (& others) are increasingly giving talks that are based on the content in this book. Some of these slide decks (like the 6-hour KDD 2017 deck) are gigantic, so we're collecting them separately in this repo. Contribute there if you'd like to share tutorials or course material based on this book.
Translation
As we write the book, large stable sections are simultaneously being translated into Chinese (中文), available in a web version and via GitHub source.
Table of contents
Part 1: Deep Learning Fundamentals

Chapter 1: Crash course

Chapter 2: Introduction to supervised learning
 Linear regression (from scratch)
 Linear regression (with gluon)
 Binary classification with logistic regression (gluon w/ bespoke loss function)
 Multiclass logistic regression (from scratch)
 Multiclass logistic regression (with gluon)
 Overfitting and regularization (from scratch)
 Overfitting and regularization (with gluon)
 Perceptron and SGD primer
 Learning environments

Chapter 3: Deep neural networks (DNNs)
 Multilayer perceptrons (from scratch)
 Multilayer perceptrons (with gluon)
 Dropout regularization (from scratch)
 Dropout regularization (with gluon)
 Introduction to gluon.Block and gluon.nn.Sequential()
 Writing custom layers with gluon.Block
 Serialization: saving and loading models
 Advanced Data IO
 Debugging your neural networks

Chapter 4: Convolutional neural networks (CNNs)

Chapter 5: Recurrent neural networks (RNNs)
 Simple RNNs (from scratch)
 LSTM RNNs (from scratch)
 GRUs (from scratch)
 RNNs (with gluon)
 Roadmap Dropout for recurrent nets
 Roadmap Zoneout regularization

Chapter 6: Optimization
 Introduction to optimization
 Gradient descent and stochastic gradient descent from scratch
 Gradient descent and stochastic gradient descent with gluon
 Momentum from scratch
 Momentum with gluon
 Adagrad from scratch
 Adagrad with gluon
 RMSprop from scratch
 RMSprop with gluon
 Adadelta from scratch
 Adadelta with gluon
 Adam from scratch
 Adam with gluon

Chapter 7: Distributed & high-performance learning
 Fast & flexible: combining imperative & symbolic nets with HybridBlocks
 Training with multiple GPUs (from scratch)
 Training with multiple GPUs (with gluon)
 Training with multiple machines
 Roadmap Asynchronous SGD
 Roadmap Elastic SGD
Part 2: Applications

Chapter 8: Computer vision (CV)
 Roadmap Network of networks (inception & co)
 Roadmap Residual networks
 Object detection
 Roadmap Fully-convolutional networks
 Roadmap Siamese (conjoined?) networks
 Roadmap Embeddings (pairwise and triplet losses)
 Roadmap Inceptionism / visualizing feature detectors
 Roadmap Style transfer
 Visual-question-answer
 Fine-tuning

Chapter 9: Natural language processing (NLP)
 Roadmap Word embeddings (Word2Vec)
 Roadmap Sentence embeddings (SkipThought)
 Roadmap Sentiment analysis
 Roadmap Sequence-to-sequence learning (machine translation)
 Roadmap Sequence transduction with attention (machine translation)
 Roadmap Named entity recognition
 Roadmap Image captioning
 Tree-LSTM for semantic relatedness

Chapter 10: Audio processing
 Roadmap Intro to automatic speech recognition
 Roadmap Connectionist temporal classification (CTC) for unaligned sequences
 Roadmap Combining static and sequential data

Chapter 11: Recommender systems
 Introduction to recommender systems
 Roadmap Latent factor models
 Roadmap Deep latent factor models
 Roadmap Bilinear models
 Roadmap Learning from implicit feedback

Chapter 12: Time series
 Introduction to Forecasting (with gluon)
 Generalized Linear Models/MLP for Forecasting (with gluon)
 Roadmap Factor Models for Forecasting
 Roadmap Recurrent Neural Network for Forecasting
 Linear Dynamical System (from scratch)
 Exponential Smoothing and Innovation State-space modeling (from scratch)
 Roadmap Gaussian processes for Forecasting
 Roadmap Bayesian Time Series Models
 Roadmap Modeling missing data
 Roadmap Combining static and sequential data
Part 3: Advanced Methods

Chapter 13: Unsupervised learning
 Roadmap Introduction to autoencoders
 Roadmap Convolutional autoencoders (introduce upconvolution)
 Roadmap Denoising autoencoders
 Roadmap Variational autoencoders
 Roadmap Clustering

Chapter 14: Generative adversarial networks (GANs)
 Introduction to GANs
 Deep convolutional GANs (DCGANs)
 Roadmap Wasserstein-GANs
 Roadmap Energy-based GANs
 Roadmap Conditional GANs
 Image transduction GANs (Pix2Pix)
 Roadmap Learning from Synthetic and Unsupervised Images

Chapter 15: Adversarial learning
 Roadmap Two Sample Tests
 Roadmap Finding adversarial examples
 Roadmap Adversarial training

Chapter 16: Tensor Methods
 Introduction to tensor methods
 Roadmap Tensor decomposition
 Roadmap Tensorized neural networks

Chapter 17: Deep reinforcement learning (DRL)
 Roadmap Introduction to reinforcement learning
 Roadmap Deep contextual bandits
 Deep Q-networks (DQN)
 Double-DQN
 Roadmap Policy gradient
 Roadmap Actor-critic gradient

Chapter 18: Variational methods and uncertainty
 Roadmap Dropout-based uncertainty estimation (BALD)
 Weight uncertainty (Bayes by Backprop) from scratch
 Weight uncertainty (Bayes by Backprop) with gluon
 Weight uncertainty (Bayes by Backprop) for Recurrent Neural Networks
 Roadmap Variational autoencoders
Appendices
 Appendix 1: Cheatsheets
 Roadmap gluon
 Roadmap PyTorch to MXNet (work in progress)
 Roadmap TensorFlow to MXNet
 Roadmap Keras to MXNet
 Roadmap Math to MXNet
Choose your own adventure
We've designed these tutorials so that you can traverse the curriculum in more than one way.
 Anarchist - Choose whatever you want to read, whenever you want to read it.
 Imperialist - Proceed through all tutorials in order. In this fashion you will be exposed to each model first from scratch, writing all the code yourself except for the basic linear algebra primitives and automatic differentiation.
 Capitalist - If you don't care how things work (or already know) and just want to see working code in gluon, you can skip the (from scratch!) tutorials and go straight to the production-like code using the high-level gluon front end.
Authors
This evolving creature is a collaborative effort (see contributors tab). The lead writers, assimilators, and coders include:
 Zachary C. Lipton (@zackchase)
 Mu Li (@mli)
 Alex Smola (@smolix)
 Sheng Zha (@szha)
 Aston Zhang (@astonzhang)
 Joshua Z. Zhang (@zhreshold)
 Eric Junyuan Xie (@piiswrong)
 Kamyar Azizzadenesheli (@kazizzad)
 Jean Kossaifi (@JeanKossaifi)
 Stephan Rabanser (@steverab)
Inspiration
In creating these tutorials, we've drawn inspiration from some of the resources that allowed us to learn deep learning and machine learning with other libraries in the past. These include:
 Soumith Chintala's Deep Learning with PyTorch: A 60 Minute Blitz
 Alec Radford's Bare-bones intro to Theano
 Video of Alec's intro to deep learning with Theano
 Chris Bishop's Pattern Recognition and Machine Learning
Contribute
 Already, in the short time this project has been off the ground, we've gotten some helpful PRs from the community with pedagogical suggestions, typo corrections, and other useful fixes. If you're inclined, please contribute!