Jul. 21, 2018
Jun. 25, 2018

Differentiable Architecture Search

Code accompanying the paper

DARTS: Differentiable Architecture Search
Hanxiao Liu, Karen Simonyan, Yiming Yang.


The algorithm is based on a continuous relaxation of the architecture representation and on gradient descent in the architecture space. It efficiently designs high-performance convolutional architectures for image classification (on CIFAR-10 and ImageNet) and recurrent architectures for language modeling (on Penn Treebank and WikiText-2). Only a single GPU is required.
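
To make the idea of continuous relaxation concrete, here is a minimal sketch in plain Python, not the code from this repository. It assumes a toy setting in which each candidate operation acts on a single number. The names `CANDIDATE_OPS`, `mixed_op`, and `most_likely_op` are hypothetical and chosen only for illustration. Instead of committing to one operation, the output is a softmax-weighted sum over all candidates, so the architecture weights can be tuned by gradient descent and the most likely operation read off afterwards.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate operations on a scalar input
# (stand-ins for convolutions, pooling, skip connections, and so on).
CANDIDATE_OPS = [
    ("identity", lambda x: x),
    ("double", lambda x: 2.0 * x),
    ("zero", lambda x: 0.0),
]


def mixed_op(x, alphas):
    """Continuous relaxation: softmax-weighted sum of every candidate's output."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, (_, op) in zip(weights, CANDIDATE_OPS))


def most_likely_op(alphas):
    """Discretize by keeping the candidate with the largest architecture weight."""
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return CANDIDATE_OPS[best][0]


# With equal architecture weights the output is the plain average of the candidates.
uniform_output = mixed_op(3.0, [0.0, 0.0, 0.0])
```

Because `mixed_op` is differentiable with respect to `alphas`, the choice of operation becomes an ordinary continuous optimization problem in this toy setting.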


Python >= 3.5.5, PyTorch == 0.3.1, torchvision == 0.2.0

NOTE: PyTorch 0.4 is not currently supported and would lead to out-of-memory (OOM) errors.
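
Since the version pins above are strict, it may help to check them before launching a long run. The following is a small sketch of such a check and is not part of this repository. The helper names `parse_version` and `check_requirements` are hypothetical. It works on version strings only, so it can be tried without PyTorch installed.

```python
def parse_version(text):
    """Turn '0.3.1' or '0.4.0a0+abc' into a tuple of leading integer components."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_requirements(python_version, torch_version, torchvision_version):
    """Return a list of problems; an empty list means all three pins are met."""
    problems = []
    if parse_version(python_version) < (3, 5, 5):
        problems.append("Python >= 3.5.5 is required, found " + python_version)
    if parse_version(torch_version) != (0, 3, 1):
        problems.append("PyTorch == 0.3.1 is required, found " + torch_version)
    if parse_version(torchvision_version) != (0, 2, 0):
        problems.append("torchvision == 0.2.0 is required, found " + torchvision_version)
    return problems
```

In practice the first two arguments could come from `platform.python_version()` and `torch.__version__`, and the third from whatever version your package manager reports for torchvision.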


Instructions for acquiring PTB and WT2 can be found here. While CIFAR-10 can be automatically downloaded by torchvision, ImageNet needs to be manually downloaded (preferably to an SSD) following the instructions here.

Architecture Search

To carry out architecture search (using the second-order approximation), run

cd cnn && python --unrolled     # for conv cells on CIFAR-10
cd rnn && python --unrolled     # for recurrent cells on PTB
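
For intuition about what the second-order approximation enabled by `--unrolled` computes, here is a toy sketch with scalar weights `w` and a scalar architecture parameter `alpha`. It is not the repository's implementation. The quadratic losses, the step size `xi`, and the finite-difference scale `eps` are all assumptions chosen so the result can be checked by hand. The structure follows the approximation described in the paper: take one virtual weight step on the training loss, evaluate the validation gradient there, and correct it with a finite-difference estimate of the mixed second derivative.

```python
# Toy losses (assumptions for illustration only):
#   L_train(w, alpha) = 0.5 * (w - alpha)^2
#   L_val(w, alpha)   = 0.5 * (w - 1)^2 + 0.5 * alpha^2

def grad_w_train(w, alpha):
    return w - alpha


def grad_alpha_train(w, alpha):
    return -(w - alpha)


def grad_w_val(w, alpha):
    return w - 1.0


def grad_alpha_val(w, alpha):
    return alpha


def first_order_arch_grad(w, alpha):
    """First-order variant: treat the current weights as if they were optimal."""
    return grad_alpha_val(w, alpha)


def unrolled_arch_grad(w, alpha, xi, eps=1e-2):
    """Second-order variant: differentiate through one virtual weight step."""
    # Virtual step on the training loss with learning rate xi.
    w_virtual = w - xi * grad_w_train(w, alpha)
    # Validation gradient with respect to the virtual weights.
    g = grad_w_val(w_virtual, alpha)
    # Central finite difference standing in for the mixed second derivative times g.
    w_plus, w_minus = w + eps * g, w - eps * g
    hvp = (grad_alpha_train(w_plus, alpha) - grad_alpha_train(w_minus, alpha)) / (2.0 * eps)
    return grad_alpha_val(w_virtual, alpha) - xi * hvp


second_order = unrolled_arch_grad(w=0.5, alpha=0.2, xi=0.1)
first_order = first_order_arch_grad(w=0.5, alpha=0.2)
```

For these quadratic losses the finite difference is exact up to rounding, so the result can be compared with the hand-derived value `alpha + xi * (w_virtual - 1)`. Setting `xi` to zero recovers the first-order gradient.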

Figures: snapshots of the most likely convolutional and recurrent cells over time.

Architecture Evaluation

To reproduce our results using the best cells, run

cd cnn && python --auxiliary --cutout            # CIFAR-10
cd rnn && python                                 # PTB
cd rnn && python --data ../data/wikitext-2 \
            --dropouth 0.15 --emsize 700 --nhidlast 700 --nhid 700 --wdecay 5e-7   # WT2
cd cnn && python --auxiliary            # ImageNet

Customized architectures are supported through the --arch flag once specified in
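
As a rough illustration of how a named architecture might be selected through an `--arch` flag, consider the sketch below. It is an assumption-laden example rather than this repository's code. `CellSpec`, `ARCHITECTURES`, `EXAMPLE_CELL`, and `resolve_arch` are hypothetical names, and the operation labels inside the example cell are placeholders.

```python
import argparse
from collections import namedtuple

# Hypothetical architecture description; the field names are illustrative assumptions.
CellSpec = namedtuple("CellSpec", ["ops", "concat"])

# Hypothetical registry mapping architecture names to their specifications.
ARCHITECTURES = {
    "EXAMPLE_CELL": CellSpec(
        ops=[("sep_conv_3x3", 0), ("skip_connect", 1)],  # (operation label, input node)
        concat=[2],                                      # nodes concatenated as the cell output
    ),
}


def resolve_arch(argv):
    """Parse --arch from argv and return the matching registered specification."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--arch", default="EXAMPLE_CELL",
                        help="name of a registered architecture")
    args = parser.parse_args(argv)
    if args.arch not in ARCHITECTURES:
        raise ValueError("unknown architecture: " + args.arch)
    return ARCHITECTURES[args.arch]
```

Calling `resolve_arch(["--arch", "EXAMPLE_CELL"])` returns the registered specification, while an unregistered name fails early with a clear error instead of surfacing later during model construction.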


